summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [DwarfDebug] Skip entries to big for 16 bit size field in Dwarf < 5.Florian Hahn2019-03-191-1/+7
| | | | | | | | | | | | | | | | Nothing prevents entries from being bigger than the 16 bit size field in Dwarf < 5. For entries that are too big, just emit an empty entry instead of crashing. This fixes PR41038. Reviewers: probinson, aprantl, davide Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D59518 llvm-svn: 356514
* [TailCallElim] Add tailcall elimination pass to LTO pipelinesRobert Lougher2019-03-192-0/+8
| | | | | | | | | | LTO provides additional opportunities for tailcall elimination due to link-time inlining and visibility of nocapture attribute. Testing showed negligible impact on compilation times. Differential Revision: https://reviews.llvm.org/D58391 llvm-svn: 356511
* Demanded elements support for masked.load and masked.gatherPhilip Reames2019-03-191-0/+20
| | | | | | | | Teach instcombine to propagate demanded elements through a masked load or masked gather instruction. This is in the broader context of improving vector pointer instcombine under https://reviews.llvm.org/D57140. Differential Revision: https://reviews.llvm.org/D57372 llvm-svn: 356510
* CodeGen: Refactor regallocator command line and target selectionMatt Arsenault2019-03-194-40/+58
| | | | | | | | | | This will allow targets more flexibility to replace the register allocator core passes. In a future commit, AMDGPU will run the core register assignment passes twice, and will also want to disallow using the standard -regalloc option. llvm-svn: 356506
* RegAllocFast: Do not allocate registers for undef usesMatt Arsenault2019-03-191-0/+48
| | | | | | | | | | Do not actually allocate a register for an undef use. Previously we we would create unnecessary reload instruction for undef uses where the register wasn't live. Patch by Matthias Braun llvm-svn: 356501
* RegAllocFast: Remove early selection loop, the spill calculation will report ↵Matt Arsenault2019-03-191-9/+1
| | | | | | | | | | | | | | | | cost 0 anyway for free regs The 2nd loop calculates spill costs but reports free registers as cost 0 anyway, so there is little benefit from having a separate early loop. Surprisingly this is not NFC, as many register are marked regDisabled so the first loop often picks up later registers unnecessarily instead of the first one available in the allocation order... Patch by Matthias Braun llvm-svn: 356499
* Fix for ABS legalization on PPC buildbot.Simon Pilgrim2019-03-192-2/+4
| | | | llvm-svn: 356498
* Allow unordered loads to be considered invariant in CodeGenPhilip Reames2019-03-192-3/+10
| | | | | | | | | | The actual code change is fairly straight forward, but exercising it isn't. First, it turned out we weren't adding the appropriate flags in SelectionDAG. Second, it turned out that we've got some optimization gaps, so obvious test cases don't work. My first attempt (in atomic-unordered.ll) points out a deficiency in our peephole-opt folding logic which I plan to fix separately. Instead, I'm exercising this through MachineLICM. Differential Revision: https://reviews.llvm.org/D59375 llvm-svn: 356494
* Revert "[Remarks] Add a new Remark / RemarkParser abstraction"Francis Visoiu Mistrih2019-03-196-622/+317
| | | | | | | | | This reverts commit 51dc6a8c84cd6a58562e320e1828a0158dbbf750. Breaks http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20034/steps/build%20stage%201/logs/stdio. llvm-svn: 356492
* [Remarks] Add a new Remark / RemarkParser abstractionFrancis Visoiu Mistrih2019-03-196-317/+622
| | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a Remark class that allows us to share code when working with remarks. The C API has been updated to reflect this. Instead of the parser generating C structs, it's now using a C++ object that is used through opaque pointers in C. This gives us much more flexibility on what changes we can make to the internal state of the object and interacts much better with scenarios where the library is used through dlopen. * C API updates: * move from C structs to opaque pointers and functions * the remark type is now an enum instead of a string * unit tests updates: * use mostly the C++ API * keep one test for the C API * rename to YAMLRemarksParsingTest * a typo was fixed: AnalysisFPCompute -> AnalysisFPCommute. * a new error message was added: "expected a remark tag." * llvm-opt-report has been updated to use the C++ parser instead of the C API Differential Revision: https://reviews.llvm.org/D59049 llvm-svn: 356491
* [ValueTracking] Use computeConstantRange() for unsigned add/sub overflowNikita Popov2019-03-191-14/+25
| | | | | | | | | | | | | | | | | | Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by intersecting the computeConstantRange() result into the ConstantRange created from computeKnownBits(). This allows us to detect some additional never/always overflows conditions that can't be determined from known bits. This revision also adds basic handling for constants to computeConstantRange(). Non-splat vectors will be handled in a followup. The signed case will also be handled in a followup, as it needs some more groundwork. Differential Revision: https://reviews.llvm.org/D59386 llvm-svn: 356489
* [X86][SSE] SimplifyDemandedVectorEltsForTargetNode - handle repeated shift ↵Simon Pilgrim2019-03-191-1/+11
| | | | | | | | amounts If a value with multiple uses is only ever used for SSE shift amounts then we know that only the bottom 64-bits are needed. llvm-svn: 356483
* [AtomicExpand] Fix a crash bug when lowering unordered loads to cmpxchgPhilip Reames2019-03-191-0/+3
| | | | | | Add tests for wider atomic loads and stores. In the process, fix a crasher where we appearently handled unorder stores, but not loads, when lowering to cmpxchg idioms. llvm-svn: 356482
* [MIPS][microMIPS] Enable dynamic stack realignmentSimon Atanasyan2019-03-191-2/+2
| | | | | | | | | | | | Dynamic stack realignment was disabled on micromips by checking if target has standard encoding. We simply change the condition to skip Mips16 only. Patch by Mirko Brkusanin. Differential Revision: http://reviews.llvm.org/D59499 llvm-svn: 356478
* [NFC] Fix unused variable in release buildsJordan Rupprecht2019-03-191-2/+2
| | | | | | This was introduced in rL356468. llvm-svn: 356477
* [DAGCombine] Fix a miscompile when reducing BUILD_VECTORs to a shuffleJustin Bogner2019-03-191-11/+9
| | | | | | | | | | | | | | | | | | In r311255 we added a case where we split vectors whose elements are all derived from the same input vector so that we could shuffle it more efficiently. In doing so, createBuildVecShuffle was taught to adjust for the fact that all indices would be based off of the first vector when this happens, but it's possible for the code that checked that to fire incorrectly if we happen to have a BUILD_VECTOR of extracts from subvectors and don't hit this new optimization. Instead of trying to detect if we've split the vector by checking if we have extracts from the same base vector, we can just pass that information into createBuildVecShuffle, avoiding the miscompile. Differential Revision: https://reviews.llvm.org/D59507 llvm-svn: 356476
* Fix unused variable warning. NFCI.Simon Pilgrim2019-03-191-2/+1
| | | | llvm-svn: 356474
* [InstCombine] fold logic-of-nan-fcmps (PR41069)Sanjay Patel2019-03-191-0/+52
| | | | | | | | | | | | | | | Combine 2 fcmps that are checking for nan-ness: and (fcmp ord X, 0), (and (fcmp ord Y, 0), Z) --> and (fcmp ord X, Y), Z or (fcmp uno X, 0), (or (fcmp uno Y, 0), Z) --> or (fcmp uno X, Y), Z This is an exact match for a minimal reassociation pattern. If we want to handle this more generally that should go in the reassociate pass and allow removing this code. This should fix: https://bugs.llvm.org/show_bug.cgi?id=41069 llvm-svn: 356471
* [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in ↵Simon Pilgrim2019-03-1910-104/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468
* [AMDGPU] Add buffer/load 8/16 bit overloaded intrinsicsRyan Taylor2019-03-196-1/+170
| | | | | | | | | | | | | | | Summary: Add buffer store/load 8/16 overloaded intrinsics for buffer, raw_buffer and struct_buffer Change-Id: I166a29f071b2ff4e4683fb0392564b1f223ac61d Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59265 llvm-svn: 356465
* [AMDGPU] Ban i8 min3 promotion.Neil Henning2019-03-191-3/+3
| | | | | | | | | | | | | | | | I found this really weird WWM-related case whereby through the WWM transformations our isel lowering was trying to promote 2 min's into a min3 for the i8 type, which our hardware doesn't support. The new min3_i8.ll test case would previously spew the error: PromoteIntegerResult #0: t69: i8 = SMIN3 t70, Constant:i8<0>, t68 Before the simple fix to our isel lowering to not do it for i8 MVT's. Differential Revision: https://reviews.llvm.org/D59543 llvm-svn: 356464
* [mips] Fix crash on recursive using of .setSimon Atanasyan2019-03-191-10/+9
| | | | | | | | | | | | | Switch to the `MCParserUtils::parseAssignmentExpression` for parsing assignment expressions in the `.set` directive reduces code and allows to print an error message instead of crashing in case of incorrect recursive using of the `.set`. Fix for the bug https://bugs.llvm.org/show_bug.cgi?id=41053. Differential Revision: http://reviews.llvm.org/D59452 llvm-svn: 356461
* [InstSimplify] SimplifyICmpInst - icmp eq/ne %X, undef -> undefSimon Pilgrim2019-03-191-0/+7
| | | | | | | | | | | | | | | As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand: When the other operand is constant we fold to undef (handled in ConstantFoldCompareInstruction) When the other operand is non-constant we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst). Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef. The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968. Differential Revision: https://reviews.llvm.org/D59541 llvm-svn: 356456
* [DebugInfoMetadata] Move main subprogram DIFlag into DISPFlagsPetar Jovanovic2019-03-191-7/+28
| | | | | | | | | | | | Moving subprogram specific flags into DISPFlags makes IR code more readable. In addition, we provide free space in DIFlags for other 'non-subprogram-specific' debug info flags. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D59288 llvm-svn: 356454
* [DebugInfo] Introduce DW_OP_LLVM_convertMarkus Lavin2019-03-1923-59/+263
| | | | | | | | | | | | | | | | | | | | | Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type. The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp to perform sext/zext on debug values but mainly the patch is about preparing terrain for adding other Dwarf v5 ops that need to reference a base_type. For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a complex shift & mask pattern is generated to emulate sext/zext. This is a recommit of r356442 with trivial fixes for the failing tests. Differential Revision: https://reviews.llvm.org/D56587 llvm-svn: 356451
* Revert "[DebugInfo] Introduce DW_OP_LLVM_convert"Markus Lavin2019-03-1923-262/+59
| | | | | | | | | | | | | This reverts commit 1cf4b593a7ebd666fc6775f3bd38196e8e65fafe. Build bots found failing tests not detected locally. Failing Tests (3): LLVM :: DebugInfo/Generic/convert-debugloc.ll LLVM :: DebugInfo/Generic/convert-inlined.ll LLVM :: DebugInfo/Generic/convert-linked.ll llvm-svn: 356444
* [DebugInfo] Introduce DW_OP_LLVM_convertMarkus Lavin2019-03-1923-59/+262
| | | | | | | | | | | | | | | | | | | Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type. The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp to perform sext/zext on debug values but mainly the patch is about preparing terrain for adding other Dwarf v5 ops that need to reference a base_type. For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a complex shift & mask pattern is generated to emulate sext/zext. Differential Revision: https://reviews.llvm.org/D56587 llvm-svn: 356442
* [WebAssembly] Small improvements in FixIrreducibleControlFlow (NFC)Heejin Ahn2019-03-191-24/+16
| | | | | | | | | | | | | | | | | Summary: - Make some class member methods const - Delete unnecessary includes - Use a simpler form of `BuildMI` Reviewers: kripken Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59454 llvm-svn: 356440
* [WebAssembly] Rename methods according to instruction name changes (NFC)Heejin Ahn2019-03-191-7/+7
| | | | | | | | | | | | Reviewers: tlively, sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59469 llvm-svn: 356438
* [WebAssembly] Lower SIMD nnan setcc nodesThomas Lively2019-03-191-10/+12
| | | | | | | | | | | | | | | | | | Summary: Adds patterns to lower all the remaining setcc modes: lt, gt, le, and ge. Fixes PR40912. Reviewers: aheejin, sbc100, dschuff Reviewed By: dschuff Subscribers: jgravelle-google, hiraditya, sunfish, jdoerfert, llvm-commits, srj Tags: #llvm Differential Revision: https://reviews.llvm.org/D59519 llvm-svn: 356431
* Revert "[ValueTracking][InstSimplify] Support min/max selects in ↵Nikita Popov2019-03-181-22/+1
| | | | | | | | | | computeConstantRange()" This reverts commit 106f0cdefb02afc3064268dc7a71419b409ed2f3. This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests. llvm-svn: 356424
* [X86] Disable CQTO and CLTQ instructions in the assembly parser outside ↵Craig Topper2019-03-181-2/+2
| | | | | | 64-bit mode. llvm-svn: 356419
* [ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()Nikita Popov2019-03-181-1/+22
| | | | | | | | | | | | Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This was suggested by spatel as an alternative approach to D59378. I've also added the infinite looping test from that revision here. Differential Revision: https://reviews.llvm.org/D59506 llvm-svn: 356415
* [X86] Allow any 8-bit immediate to be used with BT/BTC/BTR/BTS not just sign ↵Craig Topper2019-03-181-30/+46
| | | | | | | | extended 8-bit immediates. We need to allow [128,255] in addition to [-128, 127] to match gas. llvm-svn: 356413
* [GlobalISel] Include missing change from r356396Amara Emerson2019-03-181-4/+2
| | | | | | Forgot to add a change to relax some asserts in r356396. llvm-svn: 356411
* [WebAssembly] Don't override default implementation of isOffsetFoldingLegal. ↵Sam Clegg2019-03-183-8/+1
| | | | | | | | | | | | | | | NFC. The default implementation does we want and is going to more compatible with dynamic linking (-fPIC) support that is planned. This is NFC because currently we only build wasm with `-relocation-model=static` which in turn means that the default `isOffsetFoldingLegal` always returns true today. Differential Revision: https://reviews.llvm.org/D54661 llvm-svn: 356410
* [ValueTracking][InstSimplify] Move abs handling into computeConstantRange(); NFCNikita Popov2019-03-182-41/+32
| | | | | | | | | | | This is preparation for D59506. The InstructionSimplify abs handling is moved into computeConstantRange(), which is the general place for such calculations. This is NFC and doesn't affect the existing tests in test/Transforms/InstSimplify/icmp-abs-nabs.ll. Differential Revision: https://reviews.llvm.org/D59511 llvm-svn: 356409
* [X86] Use relocImm in the ROL8ri/ROL16ri/ROL32ri/ROL64ri patterns to be ↵Craig Topper2019-03-181-4/+6
| | | | | | consistent with the ROR patterns. llvm-svn: 356407
* [X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.Craig Topper2019-03-183-7/+2
| | | | | | For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well. llvm-svn: 356406
* [AMDGPU] Enable code selection using `s_mul_hi_u32`/`s_mul_hi_i32`.Michael Liao2019-03-182-2/+10
| | | | | | | | | | Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59501 llvm-svn: 356405
* [llvm-objcopy] Make .build-id linking atomicJake Ehrlich2019-03-181-26/+27
| | | | | | | | This change makes linking into .build-id atomic and safe to use. Some users under particular workflows are reporting that this races more than half the time under particular conditions. llvm-svn: 356404
* [AMDGPU] Asm/disasm clamp modifier on vop3 int arithmeticTim Renouf2019-03-186-33/+69
| | | | | | | | | | | | | | Allow the clamp modifier on vop3 int arithmetic instructions in assembly and disassembly. This involved adding a clamp operand to the affected instructions in MIR and MC, and thus having to fix up several places in codegen and MIR tests. Differential Revision: https://reviews.llvm.org/D59267 Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e llvm-svn: 356399
* [AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiersTim Renouf2019-03-189-46/+127
| | | | | | | | | | | | | | | | | This commit allows v_cndmask_b32_e64 with abs, neg source modifiers on src0, src1 to be assembled and disassembled. This does appear to be allowed, even though they are floating point modifiers and the operand type is b32. To do this, I added src0_modifiers and src1_modifiers to the MachineInstr, which involved fixing up several places in codegen and mir tests. Differential Revision: https://reviews.llvm.org/D59191 Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea llvm-svn: 356398
* Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy()Amara Emerson2019-03-182-11/+18
| | | | | | | | | | | | | After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396
* [DebugInfo][PDB] Don't write empty debug streamsAlexandre Ganea2019-03-182-6/+16
| | | | | | | | | | | Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count). With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention. Also fix the * Linker * contrib section which wasn't correctly emitted previously. Differential Revision: https://reviews.llvm.org/D59502 llvm-svn: 356395
* [MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive ↵Tim Renouf2019-03-182-3/+5
| | | | | | | | | | | | | checks bot This fixes a couple of unflushed raw_string_ostream bugs in recent commits that only show up on a bot building on windows with expensive checks. Differential Revision: https://reviews.llvm.org/D59396 Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf llvm-svn: 356394
* [X86] Rename imm8_su/imm16_su/imm32_su to ↵Craig Topper2019-03-182-9/+9
| | | | | | relocImm8_su/relocImm16_su/relocImm32_su/ to accurately reflect what they are. llvm-svn: 356393
* [SCEV] Guard movement of insertion point for loop-invariantsWarren Ristow2019-03-181-41/+47
| | | | | | | | | | | | | | | | | | | | | This reinstates r347934, along with a tweak to address a problem with PHI node ordering that that commit created (or exposed). (That commit was reverted at r348426, due to the PHI node issue.) Original commit message: r320789 suppressed moving the insertion point of SCEV expressions with dev/rem operations to the loop header in non-loop-invariant situations. This, and similar, hoisting is also unsafe in the loop-invariant case, since there may be a guard against a zero denominator. This is an adjustment to the fix of r320789 to suppress the movement even in the loop-invariant case. This fixes PR30806. Differential Revision: https://reviews.llvm.org/D57428 llvm-svn: 356392
* [AArch64] Small fix for getIntImmCostAdhemerval Zanella2019-03-181-2/+4
| | | | | | | | | | | It uses the generic AArch64_IMM::expandMOVImm to get the correct number of instruction used in immediate materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58461 llvm-svn: 356391
* [AArch64] Optimize floating point materializationAdhemerval Zanella2019-03-181-3/+13
| | | | | | | | | | | | | | | | | | This patch follows some ideas from r352866 to optimize the floating point materialization even further. It changes isFPImmLegal to considere up to 2 mov instruction or up to 5 in case subtarget has fused literals. The rationale is the cost is the same for mov+fmov vs. adrp+ldr; but the mov+fmov sequence is always better because of the reduced d-cache pressure. The timings are still the same if you consider movw+movk+fmov vs. adrp+ldr will be fused (although one instruction longer). Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58460 llvm-svn: 356390
OpenPOWER on IntegriCloud