path: root/llvm/lib
Commit message · Author · Date · Files changed · Lines (-/+)
...
* [GlobalISel][AArch64] LegalizerInfo verifier: Fixing bugs exposed by LegalizerInfo::verify(...)
  Roman Tereshin · 2018-05-31 · 1 file changed · -11/+4

  Reviewers: aemerson, qcolombet
  Reviewed By: qcolombet
  Differential Revision: https://reviews.llvm.org/D46339
  llvm-svn: 333618
* [InstCombine] don't change the size of a select if it would mismatch its condition operands' sizes
  Sanjay Patel · 2018-05-31 · 1 file changed · -4/+10

  Don't always:
    cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C'

  This is something that came up as far back as D26556, and I lost track of it.
  I suspect that this transform is part of the underlying problem that is
  inspiring some of the recent proposals that seek to match larger patterns
  that include a cast op. Even if that's not true, this transform causes
  problems for codegen (particularly with vector types).

  A transform to actively match the size of cmp and select operand sizes should
  follow. This patch just removes the harmful canonicalization in the other
  direction.

  Differential Revision: https://reviews.llvm.org/D47163
  llvm-svn: 333611
* [InstCombine] don't negate constant expression with fsub (PR37605)
  Sanjay Patel · 2018-05-30 · 1 file changed · -1/+3

  X + (-C) would be transformed back into X - C, so infinite loop:
  https://bugs.llvm.org/show_bug.cgi?id=37605

  llvm-svn: 333610
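  For illustration, a hedged sketch of the kind of guard the title describes
  (not the actual patch; the helper name is hypothetical): the negation that
  produces X + (-C) is skipped when C is a constant expression, since another
  canonicalization folds X + (-C) straight back into X - C and the two rules
  would ping-pong forever.

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Hypothetical guard, for illustration only: do not negate a ConstantExpr
    // RHS when rewriting "X - C" as "X + (-C)", otherwise the reverse
    // canonicalization recreates the original fsub and InstCombine never
    // terminates (PR37605).
    static Instruction *foldFSubToFAddSketch(Value *X, Constant *C) {
      if (isa<ConstantExpr>(C))
        return nullptr; // leave the fsub alone
      return BinaryOperator::CreateFAdd(X, ConstantExpr::getFNeg(C));
    }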
* AMDGPU: Split AMDGPUTTI into GCNTTI and R600TTI
  Tom Stellard · 2018-05-30 · 4 files changed · -42/+212

  Reviewers: arsenm, nhaehnle
  Reviewed By: arsenm
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
  Differential Revision: https://reviews.llvm.org/D47359
  llvm-svn: 333605
* [LowerTypeTests] Discard extern_weak linkage for definitions
  Vlad Tsyrklevich · 2018-05-30 · 1 file changed · -0/+5

  Summary: Fix PR37625. It's possible for an extern_weak declaration to be
  emitted to the merged module when a definition exists in the ThinLTO portion
  of the build; discard the linkage on the declaration in that case. (Otherwise
  we copy the linkage to the alias to the jumptable and fail.)

  Reviewers: pcc
  Reviewed By: pcc
  Subscribers: mehdi_amini, llvm-commits, kcc
  Differential Revision: https://reviews.llvm.org/D47494
  llvm-svn: 333604
* [NewGVN] Fix set comparison; reflow comment
  George Burgess IV · 2018-05-30 · 1 file changed · -7/+8

  Looks like we intended to compare this->Members with Other->Members here, but
  ended up comparing this->Members with this->Members. Oops. :)

  Since CongruenceClass::Members is a SmallPtrSet anyway, we can probably skip
  building std::sets if we're willing to write a bit more code.

  This appears to be no functional change (for sufficiently lax values of "no"):
  this equality check was only being called inside of an assert. So, worst
  case, we'll catch more bugs in the form of assertion failures.

  Thanks to d0k for noting this!

  llvm-svn: 333601
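  A generic sketch of the bug pattern and fix described above (illustrative
  type and function names, not the NewGVN source): the equality check should
  compare against the other object's member set, not its own, and no temporary
  std::set is needed.

    #include "llvm/ADT/SmallPtrSet.h"

    struct CongruenceClassLike {
      llvm::SmallPtrSet<const void *, 8> Members;

      bool isEquivalentTo(const CongruenceClassLike &Other) const {
        if (Members.size() != Other.Members.size())
          return false;
        for (const void *M : Members)
          if (!Other.Members.count(M)) // compare against Other, not *this
            return false;
        return true;
      }
    };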
* [GlobalISel][AArch64] LegalizerInfo verifier: Adding LegalizerInfo::verify(...) call w/o fixing bugs
  Roman Tereshin · 2018-05-30 · 1 file changed · -0/+1

  This is to make it clear what kind of bugs the LegalizerInfo::verifier is
  able to catch and test its output.

  Reviewers: aemerson, qcolombet
  Reviewed By: aemerson
  Differential Revision: https://reviews.llvm.org/D46338
  llvm-svn: 333597
* [IRBuilder] Add APIs for creating calls to atomic memmove and memset intrinsics. (NFC)
  Daniel Neilson · 2018-05-30 · 1 file changed · -0/+70

  Summary: Creating the IRBuilder methods:
    CreateElementUnorderedAtomicMemSet
    CreateElementUnorderedAtomicMemMove
  These mirror the methods that create calls to the regular (non-atomic)
  memmove and memset intrinsics.

  llvm-svn: 333588
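  A hedged usage sketch of the two builder methods named above. The exact
  parameter lists (alignment and element-size arguments, overloads) are
  assumptions modeled on the non-atomic CreateMemSet/CreateMemCpy builders and
  should be checked against IRBuilder.h:

    #include "llvm/IR/IRBuilder.h"

    using namespace llvm;

    void emitAtomicMemOps(IRBuilder<> &B, Value *Dst, Value *Src, Value *Len) {
      // memset.element.unordered.atomic: fill Dst with zero bytes,
      // 4-byte alignment and 4-byte atomic elements (arguments assumed).
      B.CreateElementUnorderedAtomicMemSet(Dst, B.getInt8(0), Len,
                                           /*Alignment=*/4, /*ElementSize=*/4);
      // memmove.element.unordered.atomic: copy Len bytes from Src to Dst.
      B.CreateElementUnorderedAtomicMemMove(Dst, /*DstAlign=*/4, Src,
                                            /*SrcAlign=*/4, Len,
                                            /*ElementSize=*/4);
    }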
* [CalledValuePropagation] Just use a sorted vector instead of a set.
  Benjamin Kramer · 2018-05-30 · 1 file changed · -9/+11

  The set properties are never used, so a vector is enough. No functionality
  change intended. While there, add some std::moves to SparseSolver.

  llvm-svn: 333582
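  The idiom described above, as a generic sketch (not the
  CalledValuePropagation code): when set semantics are never queried,
  collecting into a vector and sorting/uniquing once is cheaper than
  maintaining a std::set.

    #include "llvm/ADT/SmallVector.h"
    #include <algorithm>

    // Sort the vector and drop duplicates in place.
    template <typename T>
    void sortAndUnique(llvm::SmallVectorImpl<T> &Vec) {
      std::sort(Vec.begin(), Vec.end());
      Vec.erase(std::unique(Vec.begin(), Vec.end()), Vec.end());
    }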
* [X86][SSE] Pulled out splat detection helper from LowerScalarVariableShift (NFCI)
  Simon Pilgrim · 2018-05-30 · 1 file changed · -37/+40

  Created the IsSplatValue helper from the splat detection code in
  LowerScalarVariableShift as a first NFC step towards improving support for
  splat rotations, which is an extension of PR37426.

  llvm-svn: 333580
* [GlobalISel][Legalizer] LegalizerInfo verifier: check rules cover type indices
  Roman Tereshin · 2018-05-30 · 1 file changed · -0/+52

  This commit adds a simple verifier that tracks type indices being touched by
  legalization rules' builders. Every target will now have an opportunity to
  call LegalizerInfo::verify(...) at the end of its derived LegalizerInfo's
  constructor and check there are no obvious mistakes, like checking only the
  first type for an opcode that has more than one type index and therefore
  implicitly declaring any type for the second (and higher) type index legal.

  The check is only run in assert builds and should have very minor performance
  impact in assert builds and none in release builds.

  This commit does not add LegalizerInfo::verify(...) calls to target-specific
  legalizers; look for separate commits for that. This commit also doesn't make
  the verification errors fatal, only produces an error message; look for a
  later commit that does.

  Reviewers: aemerson, qcolombet
  Reviewed By: aemerson
  Differential Revision: https://reviews.llvm.org/D46338
  llvm-svn: 333576
* [dsymutil] Escape HTML special characters in plist.
  Jonas Devlieghere · 2018-05-30 · 1 file changed · -0/+17

  When printing strings in the plist, we weren't escaping the characters, which
  led to invalid XML. This patch adds the escape logic to StringExtras.

  rdar://39785334
  llvm-svn: 333565
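  A generic sketch of the XML/plist escaping described above; the helper
  actually added to StringExtras may have a different name and signature.

    #include "llvm/ADT/StringRef.h"
    #include "llvm/Support/raw_ostream.h"

    // Write S to OS with the five XML special characters escaped.
    static void writeXMLEscaped(llvm::raw_ostream &OS, llvm::StringRef S) {
      for (char C : S) {
        switch (C) {
        case '&':  OS << "&amp;";  break;
        case '<':  OS << "&lt;";   break;
        case '>':  OS << "&gt;";   break;
        case '\'': OS << "&apos;"; break;
        case '"':  OS << "&quot;"; break;
        default:   OS << C;        break;
        }
      }
    }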
* [AMDGPU][Waitcnt] Fix build error: unused variable 'SWaitInst'
  Mark Searles · 2018-05-30 · 1 file changed · -6/+2

  https://reviews.llvm.org/rL333556 caused a buildbot failure. See
  http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/21876/steps/build_Lld/logs/stdio

    /Users/buildslave/as-bldslv9/lld-x86_64-darwin13/llvm.src/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:2007:10:
    error: unused variable 'SWaitInst' [-Werror,-Wunused-variable]
      auto SWaitInst = BuildMI(EntryBB, EntryBB.getFirstNonPHI(),

  The unused variable was for debugging purposes; removing that piece of code
  to fix the build.

  llvm-svn: 333559
* AMDGPU: Use better alignment for kernarg lowering
  Matt Arsenault · 2018-05-30 · 2 files changed · -17/+22

  This was just emitting loads with the ABI alignment for the raw type. The
  true alignment is often better, especially when an illegal vector type was
  scalarized. The better alignment allows using a scalar load more often.

  llvm-svn: 333558
* [ValueTracking] Fix endless recursion in isKnownNonZero()
  Karl-Johan Karlsson · 2018-05-30 · 1 file changed · -4/+5

  Summary: The isKnownNonZero() function has checks that abort the recursion
  when it reaches the specified max depth. However, one of the recursive calls
  was placed before the max depth check was done, resulting in an endless
  recursion that eventually triggered a segmentation fault. Fixed the problem
  by moving the max depth check above the first recursive call.

  Reviewers: Prazek, nlopes, spatel, craig.topper, hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, bjope, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47531
  llvm-svn: 333557
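  A generic sketch of the fix described above (not the ValueTracking source;
  the node type is a stand-in): the depth limit must be checked before every
  recursive call, otherwise a call placed ahead of the check can recurse
  without bound.

    struct Node { const Node *Operand; };

    static bool knownNonZeroSketch(const Node *V, unsigned Depth) {
      const unsigned MaxDepth = 6;      // analogous to ValueTracking's cutoff
      if (!V || Depth >= MaxDepth)      // guard first ...
        return false;
      // ... then recurse; every recursive call sits below the guard above.
      return knownNonZeroSketch(V->Operand, Depth + 1);
    }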
* [AMDGPU][Waitcnt] Fix handling of loops with many bottom blocks
  Mark Searles · 2018-05-30 · 1 file changed · -19/+23

  When inserting waitcnts, the waitcnt pass forces convergence for a loop if
  necessary. Previously, that kicked in after more than 2 passes over a loop,
  which doesn't account for loops with many bottom blocks. So, increase the
  threshold to (n+1), where n is the number of bottom blocks. This gives the
  pass an opportunity to consider the contribution of each bottom block to the
  overall loop before the forced convergence potentially kicks in.

  Differential Revision: https://reviews.llvm.org/D47488
  llvm-svn: 333556
* [X86] Lowering FMA intrinsics to native IR (LLVM part)
  Gabor Buella · 2018-05-30 · 2 files changed · -20/+101

  Support for Clang lowering of fused intrinsics. This patch:
  1. Removes bindings to clang fma intrinsics.
  2. Introduces new LLVM unmasked intrinsics with rounding mode:
       int_x86_avx512_vfmadd_pd_512
       int_x86_avx512_vfmadd_ps_512
       int_x86_avx512_vfmaddsub_pd_512
       int_x86_avx512_vfmaddsub_ps_512
     supported with a new intrinsic type (INTR_TYPE_3OP_RM).
  3. Introduces new x86 fmaddsub/fmsubadd folding.
  4. Introduces new tests for code emitted by sequences introduced in the
     Clang part.

  Patch by tkrupa

  Reviewers: craig.topper, sroland, spatel, RKSimon
  Reviewed By: craig.topper, RKSimon
  Differential Revision: https://reviews.llvm.org/D47443
  llvm-svn: 333554
* [AliasSet] Teach the alias set how to handle atomic memcpy/memmove/memset
  Daniel Neilson · 2018-05-30 · 1 file changed · -8/+11

  Summary: The atomic variants of the memcpy/memmove/memset intrinsics can be
  treated the same way as the regular forms, with respect to aliasing. Update
  the AliasSetTracker to treat the atomic forms the same way as the regular
  forms.

  llvm-svn: 333551
* [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector
  Alexandros Lamprineas · 2018-05-30 · 1 file changed · -0/+46

  Turning a table lookup intrinsic into a shuffle vector instruction can be
  beneficial. If the mask used for the lookup is the constant vector
  {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse instructions
  instead.

  Differential Revision: https://reviews.llvm.org/D46133
  llvm-svn: 333550
* [ARM] Remove code handling ADDC/ADDE/SUBC/SUBE
  Amaury Sechet · 2018-05-30 · 1 file changed · -30/+0

  Summary: This code is now dead as the ARM backend uses
  ADDCARRY/SUBCARRY/SETCCCARRY.

  Reviewers: rogfer01, efriedma, rengolin, javed.absar
  Subscribers: kristof.beyls, chrib, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47413
  llvm-svn: 333544
* [Hexagon] Use vector align-left when shift amount fits in 3 bits
  Krzysztof Parzyszek · 2018-05-30 · 1 file changed · -6/+11

  This saves an instruction because for align-right the shift amount would
  need to be put in a register first.

  llvm-svn: 333543
* [mips] Correct the definition of CTC2/CFC2
  Simon Dardis · 2018-05-30 · 1 file changed · -8/+6

  llvm-svn: 333542
* [mips] Correct the predicates of microMIPS compact branch instructions
  Simon Dardis · 2018-05-30 · 1 file changed · -6/+4

  llvm-svn: 333541
* [mips] Sink PredicateControl further down the class hierarchy.
  Simon Dardis · 2018-05-30 · 11 files changed · -68/+59

  Previously, PredicateControl was in some cases a member of <X>Inst classes
  for some X (DSP, EVA), or was in a more irregular place in the hierarchy for
  any given instruction. This patch moves PredicateControl down to the root so
  that it is consistently available. It then corrects the base class of
  microMIPS instructions to use EncodingPredicates instead of the general
  Predicates field of Instruction.

  Reviewers: smaksimovic, abeserminji, atanasyan
  Differential Revision: https://reviews.llvm.org/D47526
  llvm-svn: 333536
* [mips] Correct the predicates of arithmetic and logic instructions.
  Simon Dardis · 2018-05-30 · 3 files changed · -45/+79

  As part of this effort, duplicate and correct the predicates of some aliases.
  Also disable code generation of some short form instructions for FastISel,
  as it would otherwise reject them.

  Reviewers: atanasyan, abeserminji, smaksimovic
  Differential Revision: https://reviews.llvm.org/D47075
  llvm-svn: 333530
* AArch64: print correct annotation for ADRP addresses.
  Tim Northover · 2018-05-30 · 1 file changed · -2/+2

  The immediate on an ADRP MCInst needs to be multiplied by 0x1000 to obtain
  the actual PC-offset that will be calculated.

  llvm-svn: 333525
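  Illustrative arithmetic only (not the printer code): ADRP's immediate counts
  4 KiB pages, so the annotated target is the page-aligned PC plus the
  immediate multiplied by 0x1000.

    #include <cstdint>

    static uint64_t adrpTarget(uint64_t PC, int64_t Imm) {
      // Clear the low 12 bits of PC, then add the page-scaled immediate.
      return (PC & ~uint64_t(0xFFF)) + static_cast<uint64_t>(Imm) * 0x1000;
    }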
* [AArch64][AsmParser] Fix segfault on illegal fpimm.
  Sander de Smalen · 2018-05-30 · 1 file changed · -2/+2

  A floating point immediate combining a negative sign and a hexadecimal
  number, e.g. #-0x0, caused the compiler to crash.

  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: javed.absar
  Differential Revision: https://reviews.llvm.org/D47483
  llvm-svn: 333524
* [Sparc] Treat %fxx registers with value type Other as single precision
  Daniel Cederman · 2018-05-30 · 1 file changed · -1/+1

  They get type Other when used in the clobber list in inline assembly. This
  fixes tests fp128.ll and float.ll that failed after r333512.

  llvm-svn: 333523
* Revert commit 333506
  Serge Pavlov · 2018-05-30 · 6 files changed · -14/+40

  It looks like this commit is responsible for the failure:
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/24382.

  llvm-svn: 333518
* [Sparc] Select correct register class for FP register constraints
  Daniel Cederman · 2018-05-30 · 1 file changed · -0/+16

  Summary: The fX version of floating-point registers only supports single
  precision. We need to map the name to dX for doubles and qX for long doubles
  if we want getRegForInlineAsmConstraint() to be able to pick the correct
  register class.

  Reviewers: jyknight, venkatra
  Reviewed By: jyknight
  Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47258
  llvm-svn: 333512
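  A sketch of the register-name mapping described above (the function name and
  shape are hypothetical, not the SparcISelLowering code). Because %dN overlays
  the pair %f(2N)/%f(2N+1) and %qN overlays four singles, a 64-bit or 128-bit
  value constrained to {fN} has to be mapped to the overlapping double/quad
  register name so the right class can be picked.

    #include <string>

    static std::string mapSparcFPRegName(unsigned N, unsigned BitWidth) {
      if (BitWidth == 64)
        return "d" + std::to_string(N / 2);   // f2N/f2N+1 pair
      if (BitWidth == 128)
        return "q" + std::to_string(N / 4);   // four overlapping singles
      return "f" + std::to_string(N);         // 32-bit: single precision
    }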
* [X86] Add unmasked AVX512VNNI intrinsics. Use a select in IR instead.
  Craig Topper · 2018-05-30 · 1 file changed · -0/+14

  A future patch will remove the old masked intrinsics.

  llvm-svn: 333508
* Use uniform mechanism for OOM errors handling
  Serge Pavlov · 2018-05-30 · 6 files changed · -40/+14

  This is a recommit of r333390, which was reverted in r333395 because it
  caused a cyclic dependency when building the shared library `LLVMDemangle.so`.
  In this commit `ItaniumDemangler.cpp` was not changed. The original commit
  message is below.

  In r325551 many calls of malloc/calloc/realloc were replaced with calls of
  their safe counterparts defined in the namespace llvm. These functions crash
  if memory cannot be allocated; such behavior facilitates handling of out of
  memory errors on Windows. If the result of the *alloc function was checked
  for success, the function was not replaced with the safe variant. In these
  cases the calling function did the error handling itself, like:

    T *NewElts = static_cast<T*>(malloc(NewCapacity*sizeof(T)));
    if (NewElts == nullptr)
      report_bad_alloc_error("Allocation of SmallVector element failed.");

  Actually, knowledge about the function where OOM occurred is useless.
  Moreover, having a single entry point for OOM handling is convenient for
  investigation of memory problems. This change removes the custom OOM error
  handling and replaces it with calls to the functions `llvm::safe_*alloc`.
  Declarations of `safe_*alloc` are moved to a separate include file, to avoid
  a cyclic dependency in SmallVector.h.

  Differential Revision: https://reviews.llvm.org/D47440
  llvm-svn: 333506
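  The replacement pattern described above, as a sketch (the header location of
  llvm::safe_malloc is assumed here): the wrapper reports the allocation
  failure itself, so the per-call-site null check disappears.

    #include "llvm/Support/MemAlloc.h" // assumed location of llvm::safe_malloc
    #include <cstddef>

    template <typename T>
    static T *allocateElements(size_t NewCapacity) {
      // safe_malloc reports the OOM condition itself; no null check needed.
      return static_cast<T *>(llvm::safe_malloc(NewCapacity * sizeof(T)));
    }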
* [PowerPC] fix broken JIT-compiled code with tail call optimization
  Hiroshi Inoue · 2018-05-30 · 1 file changed · -2/+3

  The relocation for branch instructions in the dynamic loader of
  ExecutionEngine assumes branch instructions with the R_PPC64_REL24 relocation
  type are only bl. However, with the tail call optimization, b instructions
  can also be used to jump into another function. This patch makes the
  relocation keep the bits in the branch instruction other than the jump
  offset, to avoid the relocation rewriting a b instruction into bl.

  Differential Revision: https://reviews.llvm.org/D47456
  llvm-svn: 333502
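  An illustrative sketch of the fix described above (not the RuntimeDyld
  source): only the 24-bit branch-offset field is rewritten; all other bits,
  including the LK bit that distinguishes "b" from "bl", are preserved.

    #include <cstdint>

    static uint32_t patchRel24(uint32_t OrigInst, int64_t Delta) {
      const uint32_t OffsetMask = 0x03FFFFFCu; // LI field of PPC I-form branches
      return (OrigInst & ~OffsetMask) |
             (static_cast<uint32_t>(Delta) & OffsetMask);
    }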
* MC: Remove redundant substr() call
  Sam Clegg · 2018-05-30 · 2 files changed · -2/+2

  Differential Revision: https://reviews.llvm.org/D47047
  llvm-svn: 333496
* [WebAssembly] MC: Add compile-twice test and fix corresponding bug
  Sam Clegg · 2018-05-30 · 2 files changed · -1/+2

  Differential Revision: https://reviews.llvm.org/D47398
  llvm-svn: 333494
* [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule loop-cleanup passes at the beginning of the loop pass pipeline, and re-enqueue loops after even trivial unswitching
  Chandler Carruth · 2018-05-30 · 3 files changed · -34/+61

  This will allow us to much more consistently avoid simplifying code while
  doing trivial unswitching. I've also added a test case that specifically
  shows effective iteration using this technique.

  I've unconditionally updated the new PM as that is always using the
  SimpleLoopUnswitch pass, and I've made the pipeline changes for the old PM
  conditional on using this new unswitch pass. I added a bunch of comments to
  the loop pass pipeline in the old PM to make it more clear what is going on
  when reviewing.

  Hopefully this will unblock doing *partial* unswitching instead of just full
  unswitching.

  Differential Revision: https://reviews.llvm.org/D47408
  llvm-svn: 333493
* [ORC] Fix an ambiguous make_unique call.
  Lang Hames · 2018-05-30 · 1 file changed · -2/+2

  llvm-svn: 333492
* [ORC] Update JITCompileCallbackManager to support multi-threaded code.
  Lang Hames · 2018-05-30 · 4 files changed · -27/+110

  Previously JITCompileCallbackManager only supported single-threaded code.
  This patch embeds a VSO (see include/llvm/ExecutionEngine/Orc/Core.h) in the
  callback manager. The VSO ensures that the compile callback is only executed
  once and that the resulting address is cached for use by subsequent
  re-entries.

  llvm-svn: 333490
* [RISCV] Support resolving fixup_riscv_call and add to MCFixupKindInfo table
  Shiva Chen · 2018-05-30 · 1 file changed · -1/+12

  Resolve fixup_riscv_call in the assembler when linker relaxation is disabled
  and the function and call site are within the same compile unit. Also add a
  static_assert after the Infos array declaration to avoid missing any new
  fixup in MCFixupKindInfo in the future.

  Differential Revision: https://reviews.llvm.org/D47126
  llvm-svn: 333487
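  A sketch of the guard mentioned above (stand-in enum and table, not the exact
  backend wording): if a new fixup kind is added to the enum without a matching
  row in the Infos table, the build fails instead of silently reading past the
  array.

    #include "llvm/ADT/STLExtras.h"
    #include "llvm/MC/MCFixupKindInfo.h"

    enum SketchFixupKinds { fixup_sketch_call, NumSketchFixupKinds };

    static const llvm::MCFixupKindInfo Infos[] = {
        // name, bit offset, bit size, flags
        {"fixup_sketch_call", 0, 32, llvm::MCFixupKindInfo::FKF_IsPCRel},
    };

    static_assert(llvm::array_lengthof(Infos) == NumSketchFixupKinds,
                  "Not all fixup kinds added to Infos array");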
* [VPlan] Replace LLVM_ATTRIBUTE_USED with ifndef NDEBUG
  Diego Caballero · 2018-05-29 · 1 file changed · -2/+3

  Minor replacement. LLVM_ATTRIBUTE_USED was introduced to silence a warning
  but using #ifndef NDEBUG makes more sense in this case.

  Reviewers: dblaikie, fhahn, hsaito
  Reviewed By: dblaikie
  Differential Revision: https://reviews.llvm.org/D47498
  llvm-svn: 333476
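  A minimal illustration of the replacement described above (hypothetical
  helper, not the actual VPlan code): instead of marking a debug-only function
  LLVM_ATTRIBUTE_USED so release builds keep it even though it is never called,
  compile it only when asserts are enabled.

    #ifndef NDEBUG
    #include <cstdio>
    static void dumpCounterForDebugging(int Counter) { // hypothetical helper
      std::fprintf(stderr, "counter = %d\n", Counter);
    }
    #endif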
* [X86] Remove some of the extractelts from the new MOVSS+FMA patterns.
  Craig Topper · 2018-05-29 · 2 files changed · -46/+55

  We only need the extractelt that corresponds to the register we're trying to
  insert back into. We can't guarantee the others haven't been optimized out,
  depending on how those operands were produced. So instead just look for an
  FR32/FR64 input and emit a COPY_TO_REGCLASS to VR128 in the output pattern.
  This matches what we do for ADD/SUB/MUL/DIV.

  llvm-svn: 333473
* [X86] Use VR128X instead of VR128 in EVEX instruction patterns.
  Craig Topper · 2018-05-29 · 1 file changed · -23/+23

  llvm-svn: 333464
* [X86] Rename the operands in the recently introduced MOVSS+FMA patterns so that the operand names in the output pattern are always in 1, 2, 3 order since those are the operand names in the instruction.
  Craig Topper · 2018-05-29 · 2 files changed · -24/+24

  The order should be controlled in the input pattern.

  llvm-svn: 333463
* Fix build error introduced in rL333459
  Sam Clegg · 2018-05-29 · 1 file changed · -2/+3

  The DEBUG macro was renamed LLVM_DEBUG.

  llvm-svn: 333462
* [LoopInstSimplify] Re-implement the core logic of loop-instsimplify to be both simpler and substantially more efficient
  Chandler Carruth · 2018-05-29 · 2 files changed · -115/+124

  Rather than use a hand-rolled iteration technique that isn't quite the same
  as RPO, use the pre-built RPO loop body traversal utility.

  Once visiting the loop body in RPO, we can assert that we visit defs before
  uses reliably. When this is the case, the only need to iterate is when
  simplifying a def that is used by a PHI node along a back-edge. With this
  patch, the first pass over the loop body is just a complete simplification
  of every instruction across the loop body. When we encounter a use of a
  simplified instruction that stems from a PHI node in the loop body that has
  already been visited (due to some cyclic CFG, potentially the loop itself, or
  a nested loop, or unstructured control flow), we recall that specific PHI
  node for the second iteration. Nothing else needs to be preserved from
  iteration to iteration.

  On the second and later iterations, only instructions known to have
  simplified inputs are considered, each time starting from a set of PHIs that
  had simplified inputs along the backedges. Dead instructions are collected
  along the way, but deleted in a batch at the end of each iteration, making
  the iterations themselves substantially simpler. This uses a new batch API
  for recursively deleting dead instructions.

  This also changes the routine to visit subloops. Because simplification is
  fundamentally transitive, we may need to visit the entire loop body,
  including subloops, to handle knock-on simplification.

  I've added a basic test file that helps demonstrate that all of these
  changes work. It includes both straight-forward loops with simplifications
  as well as interesting PHI-structures, CFG-structures, and a nested loop
  case.

  Differential Revision: https://reviews.llvm.org/D47407
  llvm-svn: 333461
* [X86] Fix a potential crash that could occur after r333419.
  Craig Topper · 2018-05-29 · 1 file changed · -2/+1

  The code could issue a truncate from a small type to a larger type. We need
  to extend in that case instead.

  llvm-svn: 333460
* [WebAssembly] Add more error checking to object file parsing
  Sam Clegg · 2018-05-29 · 1 file changed · -224/+240

  This should address some of the assert failures the fuzzer has been finding,
  such as: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6719

  Differential Revision: https://reviews.llvm.org/D47086
  llvm-svn: 333459
* AMDGPU: Fix typo in option description
  Matt Arsenault · 2018-05-29 · 1 file changed · -1/+1

  llvm-svn: 333457
* AMDGPU: Round up kernel argument allocation size
  Matt Arsenault · 2018-05-29 · 3 files changed · -5/+13

  AFAIK the driver's allocation will actually have to round this up anyway. It
  is useful to track the rounded up size, so that the end of the kernel segment
  is known to be dereferenceable and a wider s_load_dword can be used for a
  short argument at the end of the segment.

  llvm-svn: 333456
* [RISCV] Add peepholes for Global Address lowering patterns
  Sameer AbuAsal · 2018-05-29 · 1 file changed · -0/+208

  Summary: Base and offset are always separated when a GlobalAddress node is
  lowered (rL332641) as an optimization to reduce instruction count. However,
  this optimization is not profitable if the Global Address ends up being used
  in only one instruction. This patch adds peephole optimizations that merge an
  offset of an address calculation into the LUI %hi and ADDI %lo of the
  lowering sequence. The peephole handles three patterns:

  1) ADDI (ADDI (LUI %hi(global)) %lo(global)), offset
       ---> ADDI (LUI %hi(global + offset)) %lo(global + offset)

     This generates:
       lui  a0, hi (global + offset)
       add  a0, a0, lo (global + offset)
     Instead of:
       lui  a0, hi (global)
       addi a0, lo (global)
       addi a0, offset

     This pattern is for cases when the offset is small enough to fit in the
     immediate field of ADDI (less than 12 bits).

  2) ADD ((ADDI (LUI %hi(global)) %lo(global)), (LUI hi_offset))
       ---> offset = hi_offset << 12
            ADDI (LUI %hi(global + offset)) %lo(global + offset)

     Which generates the ASM:
       lui  a0, hi(global + offset)
       addi a0, lo(global + offset)
     Instead of:
       lui  a0, hi(global)
       addi a0, lo(global)
       lui  a1, (offset)
       add  a0, a0, a1

     This pattern is for cases when the offset doesn't fit in an immediate
     field of ADDI but the lower 12 bits are all zeros.

  3) ADD ((ADDI (LUI %hi(global)) %lo(global)), (ADDI lo_offset, (LUI hi_offset)))
       ---> offset = global + offhi20<<12 + offlo12
            ADDI (LUI %hi(global + offset)) %lo(global + offset)

     Which generates the ASM:
       lui  a1, %hi(global + offset)
       addi a1, %lo(global + offset)
     Instead of:
       lui  a0, hi(global)
       addi a0, lo(global)
       lui  a1, (offhi20)
       addi a1, (offlo12)
       add  a0, a0, a1

     This pattern is for cases when the offset doesn't fit in an immediate
     field of ADDI and both the lower 12 bits and the high 20 bits are
     non-zero.

  Reviewers: asb
  Reviewed By: asb
  Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD,
    kito-cheng, shiva0217, zzheng, edward-jones, mgrang
  llvm-svn: 333455