summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [Instruction] Add hasMetadata(Kind) helper [NFC]Philip Reames2019-09-046-10/+10
| | | | | | It's a common idiom, so let's add the obvious wrapper for metadata kinds which are basically booleans. llvm-svn: 370933
* AMDGPU: Handle frame index expansion with no free SGPRs pre gfx9Matt Arsenault2019-09-042-26/+58
| | | | | | | | | | | | | | Since an add instruction must produce an unused carry out, this requires additional SGPRs. This can be avoided by keeping the entire offset computation in SGPRs. If one SGPR is still available, this only costs one extra mov. If none are available, the entire computation can be done in place and reversed. This does assume the use is a VGPR operand. This was already assumed, and we currently only select frame indexes to VALU instructions. This should probably be fixed at some point to handle more possible MIR. llvm-svn: 370929
* GlobalISel: Add G_BITREVERSEMatt Arsenault2019-09-041-0/+2
| | | | | | This is the first failing pattern for AMDGPU and is trivial to handle. llvm-svn: 370927
* [Attributor] Look at internal functions only on-demandJohannes Doerfert2019-09-041-55/+92
| | | | | | | | | | | | | | | | | | | Summary: Instead of building attributes for internal functions which we do not update as long as we assume they are dead, we now do not create attributes until we assume the internal function to be live. This improves the number of required iterations, as well as the number of required updates, in real code. On our tests, the results are mixed. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66914 llvm-svn: 370924
* [Attributor] Use the white list for attributes consistentlyJohannes Doerfert2019-09-041-60/+32
| | | | | | | | | | | | | | | | | | | | Summary: We create attributes on-demand so we need to check the white list on-demand. This also unifies the location at which we create, initialize, and eventually invalidate new abstract attributes. The tests show mixed results, a few more call site attributes are determined which can cause more iterations. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66913 llvm-svn: 370922
* AMDGPU/GlobalISel: Make 16-bit constants legalMatt Arsenault2019-09-041-11/+5
| | | | | | This is mostly for the benefit of patterns which use 16-bit constants. llvm-svn: 370921
* [Attributor] Deal more explicit with non-exact definitionsJohannes Doerfert2019-09-041-74/+31
| | | | | | | | | | | | | | | | | | | Summary: Before we tried to rule out non-exact definitions early but that lead to on-demand attributes created for them anyway. As a consequence we needed to look at the definition in the initialize of each attribute again. This patch centralized this lookup and tightens the condition under which we give up on non-exact definitions. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67115 llvm-svn: 370917
* [X86] Add support for avx512bf16 for __builtin_cpu_supports and ↵Craig Topper2019-09-041-2/+7
| | | | | | compiler-rt's cpu indicator. llvm-svn: 370915
* [Hexagon] Improve generated code for test-if-bit-clear, one more timeKrzysztof Parzyszek2019-09-041-18/+29
| | | | | | Adjust isel patterns after recent commit. Fixes https://llvm.org/PR43194. llvm-svn: 370913
* [InstSimplify] guard against unreachable code (PR43218)Sanjay Patel2019-09-041-1/+6
| | | | | | | This would crash: https://bugs.llvm.org/show_bug.cgi?id=43218 llvm-svn: 370911
* [Debuginfo][SROA] Need to handle dbg.value in SROA pass.Alexey Lapshin2019-09-041-2/+7
| | | | | | | | | | | | SROA pass processes debug info incorrecly if applied twice. Specifically, after SROA works first time, instcombine converts dbg.declare intrinsics into dbg.value. Inlining creates new opportunities for SROA, so it is called again. This time it does not handle correctly previously inserted dbg.value intrinsics. Differential Revision: https://reviews.llvm.org/D64595 llvm-svn: 370906
* [ModuloSchedule] Fix no-asserts buildJames Molloy2019-09-041-5/+8
| | | | | | Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893. llvm-svn: 370894
* [ModuloSchedule] Introduce PeelingModuloScheduleExpanderJames Molloy2019-09-042-6/+478
| | | | | | | | | | | | | | | This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs. This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as the ModuloScheduleExpander. Prolog and epilog peeling will come in a different patch. Differential Revision: https://reviews.llvm.org/D67081 llvm-svn: 370893
* [InstCombine] Fold sub (or A, B) (and A, B) to (xor A, B)David Bolvansky2019-09-041-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: ``` Name: sub or and to xor %or = or i32 %y, %x %and = and i32 %x, %y %sub = sub i32 %or, %and => %sub = xor i32 %x, %y Optimization: sub or and to xor Done: 1 Optimization is correct! ``` https://rise4fun.com/Alive/eJu Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67153 llvm-svn: 370883
* [DebugInfo] LiveDebugValues: locations with different exprs should not be mergedJeremy Morse2019-09-041-7/+17
| | | | | | | | | | | | | | | | | When comparing variable locations, LiveDebugValues currently considers only the machine location, ignoring any DIExpression applied to it. This is a problem because that DIExpression can do pretty much anything to the machine location, for example dereferencing it. This patch adds DIExpressions to that comparison; now variables based on the same register/memory-location but with different expressions will compare differently, and be dropped if we attempt to merge them between blocks. This reduces variable coverage-range a little, but only because we were producing broken locations. Differential Revision: https://reviews.llvm.org/D66942 llvm-svn: 370877
* [LiveDebugValues][NFC] Silence an unused variable warningJeremy Morse2019-09-041-0/+1
| | | | | | | | On release builds, 'MI' isn't used by anything (it's already inserted into a block by BuildMI), while on non-release builds it's used by a LLVM_DEBUG statement. Mark as explicitly used to avoid the warning. llvm-svn: 370870
* DWARF: Fix a regression in location list dumpingPavel Labath2019-09-042-26/+22
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: While fixing the handling of some error cases, r370363 introduced new problems -- assertion failures due to unchecked errors (my excuse is that a very early version of that patch used Optional<T> instead of Expected). This patch adds proper handling of parsing errors encountered when dumping location lists from inside DWARF DIEs, and adds a bunch of additional tests. I reorder the arguments of the location list dumping functions to make them consistent, and also be able to dump the two kinds of location lists generically. Reviewers: JDevlieghere, dblaikie, probinson Subscribers: aprantl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67102 llvm-svn: 370868
* [yaml2obj] Support PT_GNU_STACK and PT_GNU_RELROFangrui Song2019-09-041-0/+2
| | | | | | | | | | | | | PT_GNU_STACK is used in an llvm-objcopy test. I plan to use PT_GNU_RELRO in a patch to improve nested segment processing in llvm-objcopy (PR42963). Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D67146 llvm-svn: 370857
* [ARM][ParallelDSP] SExt mul for accumulationSam Parker2019-09-041-5/+14
| | | | | | | | | | For any unpaired muls, we accumulate them as an input to the reduction. Check the type of the mul and perform a sext if the existing accumlator input type is not the same. Differential Revision: https://reviews.llvm.org/D66993 llvm-svn: 370851
* [IRPrinting] Improve module pass printer to work better with -filter-print-funcsTaewook Oh2019-09-041-5/+13
| | | | | | | | | | | | | | Summary: Previously module pass printer pass prints the banner even when the module doesn't include any function provided with `-filter-print-funcs` option. This introduced a lot of noise, especailly with ThinLTO. This diff addresses the issue and makes the banner printed only when the module includes functions in `-filter-print-funcs` list. Reviewers: fedor.sergeev Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66560 llvm-svn: 370849
* [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type ↵Amara Emerson2019-09-041-3/+7
| | | | | | | | | | | combination. Similar to the issue with G_ZEXT that was fixed earlier, this is a quick to fall back if the source type is not exactly half of the dest type. Fixes the clang-cmake-aarch64-lld bot build. llvm-svn: 370847
* [RISCV] Enable tail call opt for variadic functionJim Lin2019-09-041-5/+0
| | | | | | | | | | | | | | | | Summary: Tail call opt can treat variadic function call the same as normal function call Reviewers: mgrang, asb, lenary, lewis-revill Reviewed By: lenary Subscribers: luismarques, pzheng, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66278 llvm-svn: 370835
* [MemorySSA] Move two verify calls under expensive checks.Alina Sbirlea2019-09-041-2/+2
| | | | llvm-svn: 370831
* Revert [Windows] Disable TrapUnreachable for Win64, add SEH_NoReturnReid Kleckner2019-09-037-37/+10
| | | | | | | | | | | | | | | | | | This reverts r370525 (git commit 0bb1630685fba255fa93def92603f064c2ffd203) Also reverts r370543 (git commit 185ddc08eed6542781040b8499ef7ad15c8ae9f4) The approach I took only works for functions marked `noreturn`. In general, a call that is not known to be noreturn may be followed by unreachable for other reasons. For example, there could be multiple call sites to a function that throws sometimes, and at some call sites, it is known to always throw, so it is followed by unreachable. We need to insert an `int3` in these cases to pacify the Windows unwinder. I think this probably deserves its own standalone, Win64-only fixup pass that runs after block placement. Implementing that will take some time, so let's revert to TrapUnreachable in the mean time. llvm-svn: 370829
* [WebAssembly] Compare functions by names in Emscripten SjljHeejin Ahn2019-09-031-64/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This removes all string constants for function names and compares functions by string directly when needed. Many of these constants are used only once or twice so the benefit of defining them separately is not very clear, and this actually fixes a bug. When we already have a `malloc` declaration which is an alias to something else within the module, ``` @malloc = weak hidden alias i8* (i32), i8* (i32)* @dlmalloc ``` (this happens compiling with emscripten with `-s WASM_OBJECT_FILES=0` because all bc files are merged before being fed into `wasm-ld` which runs the backend optimizations as LTO) `Module::getFunction("malloc")` in `canLongjmp` returns `nullptr` because `Module::getFunction` dyncasts pointer into `Function`, but the alias is a `GlobalValue` but not a `Function`. This makes `canLongjmp` return false for `malloc` in this case, and we end up adding a lot of longjmp handling code around malloc. This is not only a code size increase but actually a bug because `malloc` is used in the entry block when preparing for setjmp tables for emscripten sjlj handling, and this makes initial setjmp preparation, which has to happen in the entry block, move to another split block, and this interferes with SSA update later. This also adds two more functions, `getTempRet0` and `setTempRet0`, in the list of not longjmp-able functions. Fixes https://github.com/emscripten-core/emscripten/issues/8935. Reviewers: sbc100 Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, dexonsmith, dschuff, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67129 llvm-svn: 370828
* [InstrProf] Tighten a check for malformed data records in raw profilesVedant Kumar2019-09-031-4/+10
| | | | | | | | | | | | | The check needs to validate a counter offset before performing pointer arithmetic with the (potentially corrupt) offset. Found by UBSan's pointer overflow check. rdar://54843625 Differential Revision: https://reviews.llvm.org/D66979 llvm-svn: 370826
* [GVN] Remove a todo introduced w/rL370791Philip Reames2019-09-031-3/+0
| | | | | | | | | | When I dug into this, it turns out to be *much* more involved than I'd realized and doesn't actually simplify anything. The general purpose of the leader table is that we want to find the most-dominating definition quickly. The problem for equivalance folding is slightly different; we want to find the most dominating *value* whose definition block dominates our use quickly. To make this change, we'd end up having to restructure the leader table (either the sorting thereof, or maybe even introducing multiple leader tables per value) and that complexity is just not worth it. llvm-svn: 370824
* [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.Amara Emerson2019-09-032-4/+23
| | | | | | | | | Now that we have the infrastructure to support s128 types as parameters we can expand these to libcalls. Differential Revision: https://reviews.llvm.org/D66185 llvm-svn: 370823
* [GlobalISel][CallLowering] Add support for splitting types according to ↵Amara Emerson2019-09-038-55/+179
| | | | | | | | | | | | | | calling conventions. On AArch64, s128 types have to be split into s64 GPRs when passed as arguments. This change adds the generic support in call lowering for dealing with multiple registers, for incoming and outgoing args. Support for splitting for return types not yet implemented. Differential Revision: https://reviews.llvm.org/D66180 llvm-svn: 370822
* [MemorySSA] Disable MemorySSA use.Alina Sbirlea2019-09-032-5/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D58311 llvm-svn: 370821
* [Attributor] Use the delete API for livenessJohannes Doerfert2019-09-031-5/+16
| | | | | | | | | | | | Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66833 llvm-svn: 370818
* [Attributor] Deduce "no-capture" argument attributeJohannes Doerfert2019-09-031-10/+355
| | | | | | | | | | | | | | Add the no-capture argument attribute deduction to the Attributor fixpoint framework. The new string attributed "no-capture-maybe-returned" is introduced to allow deduction of no-capture through functions that "capture" an argument but only by "returning" it. It is only used by the Attributor for testing. Differential Revision: https://reviews.llvm.org/D59922 llvm-svn: 370817
* [CodeGen] Use FSHR in DAGTypeLegalizer::ExpandIntRes_MULFIXBjorn Pettersson2019-09-031-49/+19
| | | | | | | | | | | | | | | | | | | | | | | Summary: Simplify the right shift of the intermediate result (given in four parts) by using funnel shift. There are some impact on lit tests, but that seems to be related to register allocation differences due to how FSHR is expanded on X86 (giving a slightly different operand order for the OR operations compared to the old code). Reviewers: leonardchan, RKSimon, spatel, lebedev.ri Reviewed By: RKSimon Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, pzheng, bevinh, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67036 llvm-svn: 370813
* [MemorySSA] Re-enable MemorySSA use.Alina Sbirlea2019-09-032-1/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D58311 llvm-svn: 370811
* [MC] Pass through .code16/32/64 and .syntax unified for COFFReid Kleckner2019-09-032-11/+13
| | | | | | | | | | | | | | These flags should simply be passed through to the target, which will do the right thing. Add an MC/X86 test that uses these directives with the three primary object file formats and shows that they disassemble the same everywhere. There is a missing test for .code32 on Windows ARM, since I'm not sure exactly how to construct one. Fixes PR43203 llvm-svn: 370805
* [GVN] Propagate simple equalities from assumes within the tail of the blockPhilip Reames2019-09-031-19/+74
| | | | | | | | | | | | This extends the existing logic for propagating constant expressions in an analogous manner for what we do across basic blocks. The core point is that we chose some order of operands, and canonicalize uses towards that one. The heuristic used is inspired by the one used across blocks; in a follow up change, I'd plan to common them so that the cross block version uses the slightly stronger ordering herein. As noted by the TODOs in the code, there's a good amount of room for improving the existing code and making it more powerful. Some follow up work planned. Differential Revision: https://reviews.llvm.org/D66977 llvm-svn: 370791
* [AArch64][GlobalISel] Don't import i64imm_32bit pattern at -O0Jessica Paquette2019-09-031-0/+11
| | | | | | | | | | | | | This pattern, when imported at -O0 adds an extra copy via the SUBREG_TO_REG. This is because the SUBREG_TO_REG is not eliminated. At all other opt levels, it is eliminated. This is a 1% geomean code size savings at -O0 on CTMark. Differential Revision: https://reviews.llvm.org/D67027 llvm-svn: 370789
* Revert r370454 "[LoopIdiomRecognize] BCmp loop idiom recognition"Roman Lebedev2019-09-031-869/+8
| | | | | | | | | | https://bugs.llvm.org/show_bug.cgi?id=43206 was filed, claiming that there is a miscompilation. Reverting until i investigate. This reverts commit r370454 llvm-svn: 370788
* [SVE][Inline-Asm] Fix -Wimplicit-fallthrough in AArch64ISelLowering.cppKerry McLaughlin2019-09-031-0/+1
| | | | | | | | | | | | | | | | Summary: Adds break to 'x' case in getRegForInlineAsmConstraint added by D66302, fixing the unintentional fallthrough. Reviewers: sdesmalen, rovka, cameron.mcinally, greened, gribozavr, ruiu Reviewed By: sdesmalen Subscribers: bjope, javed.absar, tschuett, kristof.beyls, rkruppe, psnobl, llvm-commits, cfe-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67095 llvm-svn: 370769
* [X86] Merge 2 consecutive HasInt256 branches. NFCI.Simon Pilgrim2019-09-031-3/+2
| | | | llvm-svn: 370761
* [SystemZ] Recognize INLINEASM_BR in backend.Jonas Paulsson2019-09-031-2/+2
| | | | | | | | SystemZInstrInfo::analyzeBranch() needs to check for INLINEASM_BR instructions, or it will crash. Review: Ulrich Weigand llvm-svn: 370753
* [ARM] Ignore Implicit CPSR regs when lowering from Machine to MC operandsDavid Green2019-09-031-2/+2
| | | | | | | | | | | | | | The code here seems to date back to r134705, when tablegen lowering was first being added. I don't believe that we need to include CPSR implicit operands on the MCInst. This now works more like other backends (like AArch64), where all implicit registers are skipped. This allows the AliasInst for CSEL's to match correctly, as can be seen in the test changes. Differential revision: https://reviews.llvm.org/D66703 llvm-svn: 370745
* [SystemZ] Add support for fentry.Jonas Paulsson2019-09-032-0/+15
| | | | | | | SystemZAsmPrinter now properly emits function calls to __fentry__. Review: Ulrich Weigand llvm-svn: 370743
* [ARM] Invert CSEL predicates if the opposite is a simpler constant to ↵David Green2019-09-034-30/+75
| | | | | | | | | | | | | | | | | | | | materialise This moves ConstantMaterializationCost into ARMBaseInstrInfo so that it can also be used in ISel Lowering, adding codesize values to the computed costs, to be able to compare either approximate instruction counts or codesize costs. It also adds a HasLowerConstantMaterializationCost, which compares the ConstantMaterializationCost of two values, returning true if the first is smaller either in instruction count/codesize, or falling back to the other in the case that they are equal. This is used in constant CSEL lowering to invert the predicate if the opposite is easier to materialise. Differential revision: https://reviews.llvm.org/D66701 llvm-svn: 370741
* [ARM] Generate 8.1-m CSINC, CSNEG and CSINV instructions.David Green2019-09-036-1/+92
| | | | | | | | | | | | Arm 8.1-M adds a number of related CSEL instructions, including CSINC, CSNEG and CSINV. These choose between two values given the content in CPSR and a condition, performing an increment, negation or inverse of the false value. This adds some selection for them, either from constant values or patterns. It does not include CSEL directly, which is currently not always making code better. It is still useful, but we will have to check more carefully where it should and shouldn't be used. Code by Ranjeet Singh and Simon Tatham, with some modifications from me. Differential revision: https://reviews.llvm.org/D66483 llvm-svn: 370739
* [mips] Switch to the `.text` section after emitting asm file preambleSimon Atanasyan2019-09-031-0/+4
| | | | | | | | | | | | | | | | | | | | Now the last `.section` directive in the MIPS asm file preamble is the `.section .mdebug.abi`. If assembler code injected for example by the LLVM `module asm` or the C ` __asm` directives do not contain explicit switching to the `.text` section it goes to the `.mdebug.abi` section. It might be unexpected to the user and in fact for example breaks building some existing code like FreeBSD libc [1]. The patch forces switching to the `.text` section after emitting MIPS assembler file preamble. [1] https://bugs.llvm.org/show_bug.cgi?id=43119 Fix PR43119. Differential Revision: https://reviews.llvm.org/D67014 llvm-svn: 370735
* [ARM] Fix MVE ldst offset rangesDavid Green2019-09-031-19/+18
| | | | | | | | | | | | | | | We were using isShiftedInt<7, Shift>(RHSC) to detect the ranges of offsets to fold into MVE loads/stores. The instructions actually take a 7 bit unsigned integer which is either added or subtracted. So something more like isShiftedUInt<7, Shift>(abs(RHSC)). Instead I've changes this to use the isScaledConstantInRange method, same as in SelectT2AddrModeImm7Offset used by pre/post inc, which seemed to already be getting this correct. Differential revision: https://reviews.llvm.org/D66997 llvm-svn: 370731
* [ARM][MVE] Decoding of VMSR doesn't diagnose some unpredictable encodingsOliver Stannard2019-09-031-25/+29
| | | | | | | | | | | | | | | | Decoding of VMSR doesn't diagnose some unpredictable encodings, as the unpredictable bits are not correctly set. Diff-reduce this instruction's internals WRT VMRS so I can see the differences better. Mostly this is s/src/Rt/g. Fill in the "should-be-(0)" bits. Designate the Unpredictable{} bits for both VMRS and VMSR. Patch by Mark Murray! Differential revision: https://reviews.llvm.org/D66938 llvm-svn: 370729
* Bug fix on function epilog optimization (ARM backend)Oliver Stannard2019-09-031-2/+3
| | | | | | | | | | | | | | | To save a 'add sp,#val' instruction by adding registers to the final pop instruction, the first register transferred by this pop instruction need to be found. If the function to be optimized has a non-void return value, the operand list contains r0 (implicit) which prevents the optimization to take place. Therefore implicit register references should be skipped in the search loop, because this registers are never popped from the stack. Patch by Rainer Herbertz (rOptimizer)! Differential revision: https://reviews.llvm.org/D66730 llvm-svn: 370728
* [LV] Fix miscompiles by adding non-header PHI nodes to AllowedExitBjorn Pettersson2019-09-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Fold-tail currently supports reduction last-vector-value live-out's, but has yet to support last-scalar-value live-outs, including non-header phi's. As it relies on AllowedExit in order to detect them and bail out we need to add the non-header PHI nodes to AllowedExit, otherwise we end up with miscompiles. Solves https://bugs.llvm.org/show_bug.cgi?id=43166 Reviewers: fhahn, Ayal Reviewed By: fhahn, Ayal Subscribers: anna, hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67074 llvm-svn: 370721
OpenPOWER on IntegriCloud