summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Implement strip.invariant.groupPiotr Padlewski2018-07-023-0/+3
| | | | | | | | | | | | | | | | Summary: This patch introduce new intrinsic - strip.invariant.group that was described in the RFC: Devirtualization v2 Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073
* [DAGCombiner] Handle correctly non-splat power of 2 -1 divisor (PR37119)Simon Pilgrim2018-06-301-7/+9
| | | | | | | | | | The combine added in commit 329525 overlooked the case where one, but not all, of the divisor elements is -1, -1 is the only power of two value for which the sdiv expansion recipe breaks. Thanks to @zvi for the original patch. Differential Revision: https://reviews.llvm.org/D45806 llvm-svn: 336048
* [MachineOutliner] Add support for target-default outlining.Jessica Paquette2018-06-302-6/+36
| | | | | | | | | | | | | | | | | This adds functionality to the outliner that allows targets to specify certain functions that should be outlined from by default. If a target supports default outlining, then it specifies that in its TargetOptions. In the case that it does, and the user hasn't specified that they *never* want to outline, the outliner will be added to the pass pipeline and will run on those default functions. This is a preliminary patch for turning the outliner on by default under -Oz for AArch64. https://reviews.llvm.org/D48776 llvm-svn: 336040
* [MachineOutliner] Add always and never options to -enable-machine-outlinerJessica Paquette2018-06-291-4/+11
| | | | | | | | | | | | | | | | | This is a recommit of r335887, which was erroneously committed earlier. To enable the MachineOutliner by default on AArch64, we need to be able to disable the MachineOutliner and also provide an option to "always" enable the outliner. This adds that capability. It allows the user to still use the old -enable-machine-outliner option, which defaults to "always". This is building up to allowing the user to specify "always" versus the target default outlining behaviour. https://reviews.llvm.org/D48682 llvm-svn: 335986
* [DEBUG_INFO, NVPTX] Do not emit .debug_loc section.Alexey Bataev2018-06-292-1/+12
| | | | | | | | | | | | | | | Summary: .debug_loc section is not supported for NVPTX target. If there is an object whose location can change during its lifetime, we do not generate debug location info for this variable. Reviewers: echristo Subscribers: jholewinski, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D48730 llvm-svn: 335976
* [MachineOutliner] Never add the outliner in -O0Jessica Paquette2018-06-281-1/+1
| | | | | | | | | | | | This is a recommit of r335879. We shouldn't add the outliner when compiling at -O0 even if -enable-machine-outliner is passed in. This makes sure that we don't add it in this case. This also removes -O0 from the outliner DWARF test. llvm-svn: 335930
* [COFF] Fix constant sharing regression for MinGWMartin Storsjo2018-06-281-1/+4
| | | | | | | | | | This fixes a regression since SVN r334523, where the object files built targeting MinGW were rejected by GNU binutils tools. Prior to that commit, we only put constants in comdat for MSVC configurations. Differential Revision: https://reviews.llvm.org/D48567 llvm-svn: 335918
* [MachineOutliner] Define MachineOutliner support in TargetOptionsJessica Paquette2018-06-282-10/+2
| | | | | | | | | | | | | | | Targets should be able to define whether or not they support the outliner without the outliner being added to the pass pipeline. Before this, the outliner pass would be added, and ask the target whether or not it supports the outliner. After this, it's possible to query the target in TargetPassConfig, before the outliner pass is created. This ensures that passing -enable-machine-outliner will not modify the pass pipeline of any target that does not support it. https://reviews.llvm.org/D48683 llvm-svn: 335887
* [DAGCombiner] Ensure we use the correct CC result type in visitSDIV (REAPPLIED)Simon Pilgrim2018-06-281-5/+6
| | | | | | | | | | We could get away with it for constant folded cases, but not for rL335719. Thanks to Krzysztof Parzyszek for noticing. Reapply original commit rL335821 which was reverted at rL335871 due to a WebAssembly bug that was fixed at rL335884. llvm-svn: 335886
* Revert "[MachineOutliner] Add always and never options to ↵Jessica Paquette2018-06-281-12/+4
| | | | | | | | | -enable-machine-outliner" I accidentally committed this instead of D48683 because I haven't had coffee yet. llvm-svn: 335883
* Revert "[MachineOutliner] Never add the outliner in -O0"Jessica Paquette2018-06-281-2/+1
| | | | | | | | | | | This reverts commit 9c7c10e4073a0bc6a759ce5cd33afbac74930091. It relies on r335872 since that introduces the machine outliner flags test. I meant to commit D48683 in that commit, but got mixed up and committed D48682 instead. So, I'm reverting this and r335872, since D48682 hasn't made it through review yet. llvm-svn: 335882
* [MachineOutliner] Never add the outliner in -O0Jessica Paquette2018-06-281-1/+2
| | | | | | | | | | | We shouldn't add the outliner when compiling at -O0 even if -enable-machine-outliner is passed in. This makes sure that we don't add it in this case. This also updates machine-outliner-flags to reflect the change and improves the comment describing what that test does. llvm-svn: 335879
* SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O)Matthias Braun2018-06-281-3/+17
| | | | | | | | | | | | | Add NoTrapAfterNoreturn target option which skips emission of traps behind noreturn calls even if TrapUnreachable is enabled. Enable the feature on Mach-O to save code size; Comments suggest it is not possible to enable it for the other users of TrapUnreachable. rdar://41530228 DifferentialRevision: https://reviews.llvm.org/D48674 llvm-svn: 335877
* [MachineOutliner] Add always and never options to -enable-machine-outlinerJessica Paquette2018-06-281-4/+12
| | | | | | | | | | | | | To enable the MachineOutliner by default on AArch64, we need to be able to disable the MachineOutliner and also provide an option to "always" enable the outliner. This adds that capability. It allows the user to still use the old -enable-machine-outliner option, which defaults to "always". This is building up to allowing the user to specify "always" versus the target-default outlining behaviour. llvm-svn: 335872
* Revert "[DAGCombiner] Ensure we use the correct CC result type in visitSDIV"Haojian Wu2018-06-281-6/+5
| | | | | | | | This reverts commit r335821. This crashes the webassembly test, run "ninja check-llvm-codegen-webassembly" to reproduce. llvm-svn: 335871
* Revert "Add support for generating a call graph profile from Branch ↵Benjamin Kramer2018-06-281-47/+7
| | | | | | | | Frequency Info." This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost. llvm-svn: 335851
* [DAGCombiner] Ensure we use the correct CC result type in visitSDIVSimon Pilgrim2018-06-281-5/+6
| | | | | | | | We could get away with it for constant folded cases, but not for rL335719. Thanks to Krzysztof Parzyszek for noticing. llvm-svn: 335821
* [DAGCombiner] Remove unused variable. NFCI.Simon Pilgrim2018-06-281-2/+0
| | | | | | Noticed in D45806 review. llvm-svn: 335817
* [DwarfDebug] Remove unused argument (NFC)Petar Jovanovic2018-06-281-3/+2
| | | | | | | | | | Remove unused ByteStreamer argument from function emitDebugLocValue. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D48590 llvm-svn: 335811
* Add support for generating a call graph profile from Branch Frequency Info.Michael J. Spencer2018-06-271-7/+47
| | | | | | | | | | | | | | | | | | | | | === Generating the CG Profile === The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight. After scanning all the functions, it generates an appending module flag containing the data. The format looks like: ``` !llvm.module.flags = !{!0} !0 = !{i32 5, !"CG Profile", !1} !1 = !{!2, !3, !4} ; List of edges !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32 !3 = !{void (i1)* @freq, void ()* @a, i64 11} !4 = !{void (i1)* @freq, void ()* @b, i64 20} ``` Differential Revision: https://reviews.llvm.org/D48105 llvm-svn: 335794
* [DAGCombine] Disable TokenFactor simplifications when optnone.Nirav Dave2018-06-271-0/+4
| | | | llvm-svn: 335773
* [globalisel][legalizer] Add AtomicOrdering to LegalityQuery and use it in ↵Daniel Sanders2018-06-272-6/+12
| | | | | | | | | | | | | AArch64 Now that we have the ability to legalize based on MMO's. Add support for legalizing based on AtomicOrdering and use it to correct the legalization of the atomic instructions. Also extend all() to be a variadic template as this ruleset now requires 3 and 4 argument versions. llvm-svn: 335767
* [DAGCombiner] restrict (float)((int) f) --> ftrunc with no-signed-zerosSanjay Patel2018-06-271-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As noted in the D44909 review, the transform from (fptosi+sitofp) to ftrunc can produce -0.0 where the original code does not: #include <stdio.h> int main(int argc) { float x; x = -0.8 * argc; printf("%f\n", (float)((int)x)); return 0; } $ clang -O0 -mavx fp.c ; ./a.out 0.000000 $ clang -O1 -mavx fp.c ; ./a.out -0.000000 Ideally, we'd use IR/node flags to predicate the transform, but the IR parser doesn't currently allow fast-math-flags on the cast instructions. So for now, just use the function attribute that corresponds to clang's "-fno-signed-zeros" option. Differential Revision: https://reviews.llvm.org/D48085 llvm-svn: 335761
* [MachineOutliner] Don't outline sequences where x16/x17/nzcv are live acrossJessica Paquette2018-06-271-1/+0
| | | | | | | | | | | | | | It isn't safe to outline sequences of instructions where x16/x17/nzcv live across the sequence. This teaches the outliner to check whether or not a specific canidate has x16/x17/nzcv live across it and discard the candidate in the case that that is true. https://bugs.llvm.org/show_bug.cgi?id=37573 https://reviews.llvm.org/D47655 llvm-svn: 335758
* [DAGCombiner] visitSDIV - add special case handling for (sdiv X, 1) -> X in ↵Simon Pilgrim2018-06-271-11/+7
| | | | | | | | pow2 expansion For divisor = 1, perform a select of X - reduces scalarisation of simple SDIVs llvm-svn: 335727
* [DAGCombiner] visitSDIV - simplify pow2 handling. NFCI.Simon Pilgrim2018-06-271-29/+12
| | | | | | Use the builtin constant folding of getNode() etc. instead of doing it manually. llvm-svn: 335720
* [DAGCombiner] Fold SDIV(%X, MIN_SIGNED) -> SELECT(%X == MIN_SIGNED, 1, 0)Simon Pilgrim2018-06-271-0/+5
| | | | | | Fixes PR37569. llvm-svn: 335719
* [DAGCombiner] Don't accept signbit sdiv divisors in sdiv-by-pow2 vector ↵Simon Pilgrim2018-06-271-0/+2
| | | | | | expansion (PR37569) llvm-svn: 335717
* [DAGCombiner] use isBitwiseNot to simplify code; NFCSanjay Patel2018-06-261-8/+3
| | | | llvm-svn: 335652
* [DAGCombiner] Don't accept -1 sdiv divisors in sdiv-by-pow2 vector expansion ↵Simon Pilgrim2018-06-261-0/+2
| | | | | | | | (PR37119) Temporary fix until I've managed to get D45806 updated - both +1 and -1 special cases need to be properly supported. llvm-svn: 335637
* [DAGCombiner] Pull out VT bitwidth in visitSDIV. NFCI.Simon Pilgrim2018-06-261-4/+4
| | | | llvm-svn: 335617
* Silence "unused variable" warning in LiveIntervals.cpp after r335607Krzysztof Parzyszek2018-06-261-0/+1
| | | | llvm-svn: 335610
* Account for undef values from predecessors in extendSegmentsToUsesKrzysztof Parzyszek2018-06-263-16/+72
| | | | | | | | It is legal for a PHI node not to have a live value in a predecessor as long as the end of the predecessor is jointly dominated by an undef value. llvm-svn: 335607
* [SelectionDAG] Remove debug locations from ConstantSD(FP)NodesVedant Kumar2018-06-251-2/+2
| | | | | | | | | | | | | | | | | | This removes debug locations from ConstantSDNode and ConstantSDFPNode. When this kind of node is materialized we no longer create a line table entry which jumps back to the constant's first point of use. This makes single-stepping behavior smoother, and it matches the model used by IR, where Constants have no locations. See this thread for more context: http://lists.llvm.org/pipermail/llvm-dev/2018-June/124164.html I'd like to handle constant BuildVectorSDNodes and to try to eliminate passing SDLocs to SelectionDAG::getConstant*() in follow-up commits. Differential Revision: https://reviews.llvm.org/D48468 llvm-svn: 335497
* StackSlotColoring: Decide colors per stack IDMatt Arsenault2018-06-251-22/+50
| | | | | | | | | | | | | | | | I thought I fixed this in r308673, but that fix was very broken. The assumption that any frame index can be used in place of another was more widespread than I realized. Even when stack slot sharing was disabled, this was still replacing frame index uses with a different ID with a different stack slot. Really fix this by doing the coloring per-stack ID, so all of the coloring logically done in a separate namespace. This is a lot simpler than trying to figure out how to change the color if the stack ID is different. llvm-svn: 335488
* Improve handling of COPY instructions with identical value numbersKrzysztof Parzyszek2018-06-251-28/+126
| | | | | | | | Testcases provided by Tim Renouf. Differential Revision: https://reviews.llvm.org/D48102 llvm-svn: 335472
* Revert change 335077 "[InlineSpiller] Fix a crash due to lack of forward ↵Artur Pilipenko2018-06-251-26/+0
| | | | | | | | | | progress from remat specifically for STATEPOINT" This change caused widespread assertion failures in our downstream testing: lib/CodeGen/LiveInterval.cpp:409: bool llvm::LiveRange::overlapsFrom(const llvm::LiveRange&, llvm::LiveRange::const_iterator) const: Assertion `!empty() && "empty range"' failed. llvm-svn: 335462
* Fix -Wparentheses gcc warning. NFCI.Simon Pilgrim2018-06-251-1/+1
| | | | llvm-svn: 335451
* [DAGCombiner] eliminate setcc bool math when input is low-bit of some valueSanjay Patel2018-06-241-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch has the same motivating example as D48466: define void @foo(i64 %x, i32 %c.0282.in, i32 %d.0280, i32* %ptr0, i32* %ptr1) { %c.0282 = and i32 %c.0282.in, 268435455 %a16 = lshr i64 32508, %x %a17 = and i64 %a16, 1 %tobool = icmp eq i64 %a17, 0 %. = select i1 %tobool, i32 1, i32 2 %.286 = select i1 %tobool, i32 27, i32 26 %shr97 = lshr i32 %c.0282, %. %shl98 = shl i32 %c.0282.in, %.286 %or99 = or i32 %shr97, %shl98 %shr100 = lshr i32 %d.0280, %. %shl101 = shl i32 %d.0280, %.286 %or102 = or i32 %shr100, %shl101 store i32 %or99, i32* %ptr0 store i32 %or102, i32* %ptr1 ret void } ...but I'm trying to kill the setcc bool math sooner rather than later. By matching a larger pattern that includes both the low-bit mask and the trailing add/sub, we can create a universally good fold because we always eliminate the condition code intermediate value. Here are Alive proofs for these (currently instcombine folds the 'add' variants, but misses the 'sub' patterns): https://rise4fun.com/Alive/Gsyp Name: sub of zext cmp mask %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %z = zext i1 %c to i32 %r = sub i32 C1, %z => %optional_cast = zext i8 %a to i32 %r = add i32 %optional_cast, C1-1 Name: add of zext cmp mask %a = and i32 %x, 1 %c = icmp eq i32 %a, 0 %z = zext i1 %c to i8 %r = add i8 %z, C1 => %optional_cast = trunc i32 %a to i8 %r = sub i8 C1+1, %optional_cast All of the tests look like improvements or neutral to me. But it is possible that x86 test+set+bitop is better than what we now show here. I suspect we could do better by adding another fold for the 'sub' variants. We start with select-of-constant in IR in the larger motivating test, so that's why I included tests with selects. Proofs for those variants: https://rise4fun.com/Alive/Bx1 Name: true const is bigger Pre: C2 == (C1 + 1) %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %r = select i1 %c, i64 C2, i64 C1 => %z = zext i8 %a to i64 %r = sub i64 C2, %z Name: false const is bigger Pre: C2 == (C1 + 1) %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %r = select i1 %c, i64 C1, i64 C2 => %z = zext i8 %a to i64 %r = add i64 C1, %z Differential Revision: https://reviews.llvm.org/D48466 llvm-svn: 335433
* Initialize LiveRegs once in BranchFolder::mergeCommonTailsKrzysztof Parzyszek2018-06-221-1/+2
| | | | llvm-svn: 335365
* Recommit r335333 "[MC] - Add .stack_size sections into groups and link them ↵George Rimar2018-06-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | with .text" With compilation fix. Original commit message: D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does following two things on top: 1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to eliminate them fast during resolving the COMDATs. 2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that linker will be able to do -gc-sections on dead stack sizes sections. Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335336
* Revert r335332 "[MC] - Add .stack_size sections into groups and link them ↵George Rimar2018-06-221-2/+1
| | | | | | | | | | | | with .text" It broke bots. http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443 http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551 llvm-svn: 335333
* [MC] - Add .stack_size sections into groups and link them with .textGeorge Rimar2018-06-221-1/+2
| | | | | | | | | | | | | | | | | D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does following two things on top: 1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to eliminate them fast during resolving the COMDATs. 2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that linker will be able to do -gc-sections on dead stack sizes sections. Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335332
* Revert r335306 (and r335314) - the Call Graph Profile pass.Chandler Carruth2018-06-221-47/+7
| | | | | | | | | | | This is the first pass in the main pipeline to use the legacy PM's ability to run function analyses "on demand". Unfortunately, it turns out there are bugs in that somewhat-hacky approach. At the very least, it leaks memory and doesn't support -debug-pass=Structure. Unclear if there are larger issues or not, but this should get the sanitizer bots back to green by fixing the memory leaks. llvm-svn: 335320
* [Instrumentation] Add Call Graph Profile passMichael J. Spencer2018-06-211-7/+47
| | | | | | | | | | | | | | | | | | | | This patch adds support for generating a call graph profile from Branch Frequency Info. The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight. After scanning all the functions, it generates an appending module flag containing the data. The format looks like: !llvm.module.flags = !{!0} !0 = !{i32 5, !"CG Profile", !1} !1 = !{!2, !3, !4} ; List of edges !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32 !3 = !{void (i1)* @freq, void ()* @a, i64 11} !4 = !{void (i1)* @freq, void ()* @b, i64 20} Differential Revision: https://reviews.llvm.org/D48105 llvm-svn: 335306
* [X86] Fix 32-bit mingw comdat names, only add one underscoreReid Kleckner2018-06-211-11/+6
| | | | llvm-svn: 335304
* [mingw] Fix GCC ABI compatibility for comdat thingsReid Kleckner2018-06-211-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: GCC and the binutils COFF linker do comdats differently from MSVC. If we want to be ABI compatible, we have to do what they do, which is to emit unique section names like ".text$_Z3foov" instead of short section names like ".text". Otherwise, the binutils linker gets confused and reports multiple definition errors when two object files from GCC and Clang containing the same inline function are linked together. The best description of the issue is probably at https://github.com/Alexpux/MINGW-packages/issues/1677, we don't seem to have a good one in our tracker. I fixed up the .pdata and .xdata sections needed everywhere other than 32-bit x86. GCC doesn't use associative comdats for those, it appears to rely on the section name. Reviewers: smeenai, compnerd, mstorsjo, martell, mati865 Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D48402 llvm-svn: 335286
* [DebugInfo] Ignore DBG_VALUE instructions in PostRA Machine SinkMatt Davis2018-06-211-25/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The logic for handling the sinking of COPY instructions was generating different code when building with debug flags. The original code did not take into consideration debug instructions. This resulted in the registers in the DBG_VALUE instructions being treated as used, and prevented the COPY from being sunk. This patch avoids analyzing debug instructions when trying to sink COPY instructions. This patch also creates a routine from the code in MachineSinking::SinkInstruction to perform the logic of sinking an instruction along with its debug instructions. This functionality is used in multiple places, including the code for sinking COPY instrs. Reviewers: junbuml, javed.absar, MatzeB, bjope Reviewed By: bjope Subscribers: aprantl, probinson, thegameg, jonpa, bjope, vsk, kristof.beyls, JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45637 llvm-svn: 335264
* DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1"Stanislav Mekhanoshin2018-06-211-3/+14
| | | | | | | | | | | | | | | Allowed folding for "and/or" binops with non-constant operand if arguments of select are 0/-1 values. Normally this code with "and" opcode does not get to a DAG combiner and simplified yet in the InstCombine. However AMDGPU produces it during lowering and InstCombine has no chance to optimize it out. In turn the same pattern with "or" opcode can reach DAG. Differential Revision: https://reviews.llvm.org/D48301 llvm-svn: 335250
* [CodeGen] Avoid handling DBG_VALUE in LiveRegUnits::stepBackwardKrzysztof Parzyszek2018-06-211-2/+2
| | | | | | | | Patch by Jesper Antonsson. Differential Revision: https://reviews.llvm.org/D48420 llvm-svn: 335233
OpenPOWER on IntegriCloud