summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Use vpmovq2m/vpmovd2m for truncate to vXi1 when possible.Craig Topper2018-02-191-0/+4
| | | | | | Previously we used vptestmd, but the scheduling data for SKX says vpmovq2m/vpmovd2m is lower latency. We already used vpmovb2m/vpmovw2m for byte/word truncates. So this is more consistent anyway. llvm-svn: 325534
* [InstCombine] allow fdiv with constant dividend folds with less than full ↵Sanjay Patel2018-02-191-2/+3
| | | | | | | | | | | | -ffast-math It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this should be conservatively better than what we have right now. GCC allows this with only -freciprocal-math. The last test is changed to show a case that is expected to fold, but we need D43398. llvm-svn: 325533
* [mem2reg] Use range loops (NFCI)Brian Gesiak2018-02-191-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Several for loops in PromoteMemoryToRegister.cpp leave their increment expression empty, instead incrementing the iterator within the for loop body. I believe this is because these loops were previously implemented as while loops; see https://reviews.llvm.org/rL188327. Incrementing the iterator within the body of the for loop instead of in its increment expression makes it seem like the iterator will be modified or conditionally incremented within the loop, but that is not the case in these loops. Instead, use range loops. Test Plan: `check-llvm` Reviewers: davide, bkramer Reviewed By: davide, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43473 llvm-svn: 325532
* [InstCombine] refactor fdiv with constant dividend folds; NFCSanjay Patel2018-02-191-26/+27
| | | | | | | | | The last fold that used to be here was not necessary. That's a combination of 2 folds (and there's a regression test to show that). The transforms are guarded by isFast(), but that should be loosened. llvm-svn: 325531
* [Coroutines] Move debug statement before assertBrian Gesiak2018-02-191-1/+2
| | | | | | | | | | Summary: Move a debug statement to above where an assertion is hit, so that the debug statement can be inspected before a stack trace. Test Plan: `check-llvm` llvm-svn: 325529
* [X86] Stop swapping the operands of AVX512 setge.Craig Topper2018-02-191-2/+2
| | | | | | We swapped the operands and used setle, but I don't see any reason to do that. I think this is a holdover from SSE where we swap and the invert to use pcmpgt. But with AVX512 we don't want an invert so we won't use pcmpgt. So there's no need to swap. llvm-svn: 325527
* [X86] Reduce the number of isel pattern variations needed for ↵Craig Topper2018-02-192-16/+32
| | | | | | | | | | VPTESTM/VPTESTNM matching. Canonicalize EQ/NE PCMPM to have build vector all zeros on the RHS so we don't have to pattern match it in both locations. This significantly reduces the number of isel patterns needed since we also had to multiply it out with loads being in either operand of the 'and' input node and in the 'and' masking node. This removes over 24000 bytes from the isel table. llvm-svn: 325526
* bitcode support change for fast flags compatibilitySteven Wu2018-02-192-15/+17
| | | | | | | | | | | | | | | | Summary: The discussion and as per need, each vendor needs a way to keep the old fast flags and the new fast flags in the auto upgrade path of the IR upgrader. This revision addresses that issue. Patched by Michael Berg Reviewers: qcolombet, hans, steven_wu Reviewed By: qcolombet, steven_wu Subscribers: dexonsmith, vsk, mehdi_amini, andrewrk, MatzeB, wristow, spatel Differential Revision: https://reviews.llvm.org/D43253 llvm-svn: 325525
* [AMDGPU] Make note of existing waitcnt instrs; this is add-on work related ↵Mark Searles2018-02-191-18/+16
| | | | | | to suppression of redundant waitcnt instrs. It is necessary to make note of these existing waitcnt instrs so that we do not fall into an infinite loop when handling loops. Also, [NFC] some minor code clean-up. llvm-svn: 325524
* [SelectionDAG] ComputeKnownBits - add support for SMIN+SMAX clamp patternsSimon Pilgrim2018-02-191-5/+32
| | | | | | | | | | If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO) then we can deduce that the number of signbits (zeros/ones) will be at least the minimum of the LO and HI constants. ComputeKnownBits equivalent of D43338. Differential Revision: https://reviews.llvm.org/D43463 llvm-svn: 325521
* [AMDGPU] Increased vector length for global/constant loads.Mark Searles2018-02-192-2/+34
| | | | | | | | | | | | | | Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D43275 llvm-svn: 325518
* [CodeGen] Refactor AppleAccelTablePavel Labath2018-02-193-155/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This commit separates the abstract accelerator table data structure from the code for writing out an on-disk representation of a specific accelerator table format. The idea is that former (now called AccelTable<T>) can be reused for the DWARF v5 accelerator tables as-is, without any further customizations. Some bits of the emission code (now living in the EmissionContext class) can be reused for DWARF v5 as well, but the subtle differences in the layout of various subtables mean the sharing is not always possible. (Also, the individual emit*** functions are fairly simple so there's a tradeoff between making a bigger general-purpose function, and two smaller targeted functions.) Another advantage of this setup is that more of the serialization logic can be hidden in the .cpp file -- I have moved declarations of the header and all the emission functions there. Reviewers: JDevlieghere, aprantl, probinson, dblaikie Subscribers: echristo, clayborg, vleschuk, llvm-commits Differential Revision: https://reviews.llvm.org/D43285 llvm-svn: 325516
* Bring back r323297.Rafael Espindola2018-02-191-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | It was reverted because it broke the grub build. The reason the grub build broke is because grub does its own relocation processing and was not handing R_386_PLT32. Since grub has no dynamic linker, the fix is trivial: handle R_386_PLT32 exactly like R_386_PC32. On the report it was noted that they are using -fno-integrated-assembler. The upstream GAS (starting with 451875b4f976a527395e9303224c7881b65e12ed) will already be producing a R_386_PLT32 anyway, so they have to update their code one way or the other Original message: Don't assume a null GV is local for ELF and MachO. This is already a simplification, and should help with avoiding a plt reference when calling an intrinsic with -fno-plt. With this change we return false for null GVs, so the caller only needs to check the new metadata to decide if it should use foo@plt or *foo@got. llvm-svn: 325514
* [CodeGen] Fix tests breaking after r325505Francis Visoiu Mistrih2018-02-191-2/+0
| | | | llvm-svn: 325512
* [ThinLTO] Add GraphTraits for FunctionSummariesCharles Saternos2018-02-193-1/+32
| | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Third attempt - moved function from lambda to static function due to build failures. llvm-svn: 325506
* Revert "[CodeGen] Move printing '\n' from MachineInstr::print to ↵Francis Visoiu Mistrih2018-02-195-41/+24
| | | | | | | | MachineBasicBlock::print" This reverts commit r324681. llvm-svn: 325505
* [X86][SSE] combineTruncateWithSat - use truncateVectorWithPACK down to ↵Simon Pilgrim2018-02-191-13/+26
| | | | | | | | 64-bit subvectors Add support for chaining PACKSS/PACKUS down to 64-bit vectors by using only a single 128-bit input. llvm-svn: 325494
* [Transforms] Propagate new-format TBAA tags on simplification of ↵Ivan A. Kosarev2018-02-191-1/+3
| | | | | | | | | | | | | | memory-transfer intrinsics With this patch in place, when a new-format TBAA tag is available for a memory-transfer intrinsic call, we prefer propagating that new-format tag. Otherwise, we fallback to the old approach where we try to construct a proper TBAA access tag from 'tbaa.struct' metadata. Differential Revision: https://reviews.llvm.org/D41543 llvm-svn: 325488
* [llvm-opt-fuzzer] Add another pack of passes for continuous fuzzingIgor Laevsky2018-02-191-5/+25
| | | | | | Differential Revision: https://reviews.llvm.org/D43384 llvm-svn: 325487
* [AVR] Set the program address space in the data layoutDylan McKay2018-02-191-1/+1
| | | | | | | | | | | | This adds the program memory address space setting to the AVR data layout. This setting was very recently added under r325479. At the moment, there are no uses of this setting. In the future, things such as switch lookup tables should reside there. llvm-svn: 325481
* Add default address space for functions to the data layout (1/3)Dylan McKay2018-02-191-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds initial support for letting targets specify which address spaces their functions should reside in by default. If a function is created by a frontend, it will get the default address space specified in the DataLayout, unless the frontend explicitly uses a more general `llvm::Function` constructor. Function address spaces will become a part of the bitcode and textual IR forms, as we do not have access to a data layout whilst parsing LL. It will be possible to write IR that explicitly has `addrspace(n)` on a function. In this case, the function will reside in the specified space, ignoring the default in the DL. This is the first step towards placing functions into the correct address space for Harvard architectures. Full patchset * Add program address space to data layout D37052 * Require address space to be specified when creating functions D37054 * [clang] Require address space to be specified when creating functions D37057 Reviewers: pcc, arsenm, kparzysz, hfinkel, theraven Reviewed By: theraven Subscribers: arichardson, simoncook, rengolin, wdng, uabelho, bjope, asb, llvm-commits Differential Revision: https://reviews.llvm.org/D37052 llvm-svn: 325479
* [AVR] Fix a lowering bug in AVRISelLowering.cppDylan McKay2018-02-191-4/+6
| | | | | | | | | | | | | | The parseFunctionArgs() method was directly reading the arguments from a Function object, but is should have used the arguments supplied by the SelectionDAGBuilder. This was causing the lowering code to only lower one argument, not two in some cases. Thanks to @brainlag on GitHub for coming up with the working fix! Patch-by: @brainlag on GitHub llvm-svn: 325474
* Add LanaiMCTargetDesc.h to LanaiInstrInfo.h to make it self containedEric Christopher2018-02-191-0/+1
| | | | | | with instruction enum definitions. llvm-svn: 325473
* [X86] Correct a typo I made in combineToExtendCMOV recently.Craig Topper2018-02-181-1/+1
| | | | | | | | We're accidentally checking that the same node is a constant twice instead of checking the other node. This isn't a functional problem since we didn't do anything below that explicitly requires constants. It just means we may have introduced a sign_extend or zero_extend that won't fold out. llvm-svn: 325469
* [PatternMatch, InstSimplify] enhance m_AllOnes() to ignore undef elements in ↵Sanjay Patel2018-02-181-11/+7
| | | | | | | | | | | | | | | | | | | vectors Loosening the matcher definition reveals a subtle bug in InstSimplify (we should not assume that because an operand constant matches that it's safe to return it as a result). So I'm making that change here too (that diff could be independent, but I'm not sure how to reveal it before the matcher change). This also seems like a good reason to *not* include matchers that capture the value. We don't want to encourage the potential misstep of propagating undef values when it's not allowed/intended. I didn't include the capture variant option here or in the related rL325437 (m_One), but it already exists for other constant matchers. llvm-svn: 325466
* Fix unused assertion variable warning.Amara Emerson2018-02-181-0/+1
| | | | llvm-svn: 325464
* [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copiedAmara Emerson2018-02-181-0/+25
| | | | | | | | | | | | to gpr register banks. PR36345. rdar://36478867 Differential Revision: https://reviews.llvm.org/D43310 llvm-svn: 325463
* [AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits.Amara Emerson2018-02-181-3/+19
| | | | | | These are needed for operations on fp16 types in a later patch. llvm-svn: 325462
* [Support] Replace hand-written scope_exit with make_scope_exit.Benjamin Kramer2018-02-181-23/+3
| | | | | | No functionality change intended. llvm-svn: 325460
* [AArch64] Coalesce Copy Zero during instruction selectionHaicheng Wu2018-02-181-1/+29
| | | | | | | | Add special case for copy of zero to avoid a double copy. Differential Revision: https://reviews.llvm.org/D36104 llvm-svn: 325459
* [BPF] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-181-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Yonghong Song llvm-svn: 325457
* [X86] Make masked pcmpeq commutable during isel so we can fold loads in ↵Craig Topper2018-02-182-2/+4
| | | | | | | | | | other operand to the shorter encoding. Previously we used the immediate encoding if the load was in operand 0 and the short encoding if the load was in operand 1. This added an insane number of bytes to the size of the isel table. I'm wondering if we should always use the immediate form during isel and change to the short form during emission. This would remove the need to pattern match every combination for both the immediate form and the short form during isel. We could do the same with vpcmpgt llvm-svn: 325456
* Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries Simon Pilgrim2018-02-183-32/+1
| | | | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). Reverted due to buildbot failures llvm-svn: 325454
* Fix Wparentheses warning. NFCISimon Pilgrim2018-02-171-1/+1
| | | | llvm-svn: 325451
* [SelectionDAG] ComputeNumSignBits - add support for SMIN+SMAX clamp patternsSimon Pilgrim2018-02-171-1/+26
| | | | | | | | | | If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO) then we can deduce that the number of signbits will be at least the minimum of the LO and HI constants. I haven't bothered with the UMIN/UMAX equivalent as (1) we don't have any current use cases and (2) I wonder if we'd be better off immediately falling back for ComputeKnownBits for UMIN/UMAX which already has optimization patterns useful for unsigned cases. Differential Revision: https://reviews.llvm.org/D43338 llvm-svn: 325450
* [SelectionDAG] SimplifyDemandedVectorElts - add support for VECTOR_INSERT_ELTSimon Pilgrim2018-02-171-0/+34
| | | | | | Differential Revision: https://reviews.llvm.org/D43431 llvm-svn: 325449
* [ThinLTO] Add GraphTraits for FunctionSummariesCharles Saternos2018-02-173-1/+32
| | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). llvm-svn: 325448
* [MIPS][MSA] Convert vector integer min/max opcodes to use generic implementationSimon Pilgrim2018-02-174-98/+45
| | | | | | | | | | Found while investigating D43338 Simon^3 - the LLVM project needs more Simons. Differential Revision: https://reviews.llvm.org/D43433 llvm-svn: 325447
* [X86] Add 'sahf' to getHostCPUFeatures so -march=native will pick it up ↵Craig Topper2018-02-171-0/+1
| | | | | | | | | | | | | | | | correctly. Summary: We probably mostly get this right due to family/model/stepping mapping to CPU names. But we should detect it explicitly. Reviewers: RKSimon, echristo, dim, spatel Reviewed By: dim Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43418 llvm-svn: 325439
* [DebugInfo][FastISel] Fix dropping dbg.value()Sander de Smalen2018-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: https://llvm.org/PR36263 shows that when compiling at -O0 a dbg.value() instruction (that remains from an original dbg.declare()) is dropped by FastISel. Since FastISel selects instructions by iterating a basic block backwards, it drops the dbg.value if one of its operands is not yet instantiated by a previously selected instruction. Instead of calling 'lookUpRegForValue()' we can call 'getRegForValue()' instead that will insert a placeholder for the operand to be filled in when continuing the instruction selection. Reviewers: aprantl, dblaikie, probinson Reviewed By: aprantl Subscribers: llvm-commits, dstenb, JDevlieghere Differential Revision: https://reviews.llvm.org/D43386 llvm-svn: 325438
* [InstSimplify] move select undef cond fold with other constant cond folds; NFCISanjay Patel2018-02-171-20/+21
| | | | llvm-svn: 325434
* [AArch64] Implement dynamic stack probing for windowsMartin Storsjo2018-02-176-1/+87
| | | | | | | | | This makes sure that alloca() function calls properly probe the stack as needed. Differential Revision: https://reviews.llvm.org/D42356 llvm-svn: 325433
* Fix unused variable warning. NFCI.Simon Pilgrim2018-02-171-3/+2
| | | | | | We were casting to AArch64InstrInfo but only using it for static methods which some compilers complain about. llvm-svn: 325432
* [dwarfdump] Fix spurious verification errors for DW_AT_location attributesJonas Devlieghere2018-02-171-15/+20
| | | | | | | | | | | Verifying any DWARF file that is optimized and contains at least one tag with a DW_AT_location with a location list offset as a DW_AT_form_dataXXX results in dwarfdump spuriously claiming that the location list is invalid. Differential revision: https://reviews.llvm.org/D40199 llvm-svn: 325430
* [DAGCombiner] Remove simplifyShuffleMask - now handled more generally by ↵Simon Pilgrim2018-02-171-35/+0
| | | | | | SimplifyDemandedVectorElts. llvm-svn: 325429
* [DebugInfo] Removed assert on missing CountVarDIESander de Smalen2018-02-171-7/+2
| | | | | | | | | | | | | | | | | | | | Summary: The assert for a DISubrange's CountVarDIE to be available fails when the dbg.value() has been optimized away for any reason. Having the assert for that is a little heavy, so instead removing it now in favor of not generating the 'count' expression. Addresses http://llvm.org/PR36263 . Reviewers: aprantl, dblaikie, probinson Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits, dstenb Differential Revision: https://reviews.llvm.org/D43387 llvm-svn: 325427
* Report fatal error in the case of out of memorySerge Pavlov2018-02-171-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | This is partial recommit of r325224, reverted in 325227. The relevant part of original comment is below. Analysis of fails in the case of out of memory errors can be tricky on Windows. Such error emerges at the point where memory allocation function fails, but manifests itself when null pointer is used. These two points may be distant from each other. Besides, next runs may not exhibit allocation error. Usual programming practice does not require checking result of 'operator new' because it throws 'std::bad_alloc' in the case of allocation error. However, LLVM is usually built with exceptions turned off, so 'new' can return null pointer. This change installs custom new handler, which causes fatal error in the case of out of memory. The handler is installed automatically prior to call to 'main' during construction of a static object defined in 'lib/Support/ErrorHandling.cpp'. If the application does not use this file, the handler may be installed manually by a call to 'llvm::install_out_of_memory_new_handler', declared in 'include/llvm/Support/ErrorHandling.h". Differential Revision: https://reviews.llvm.org/D43010 llvm-svn: 325426
* [AMDGPU] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-171-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Stanislav Mekhanoshin, Tom Stellard. llvm-svn: 325425
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Quentin Colombet2018-02-172-210/+1
| | | | | | | | | | | | | | | | | This reverts commit r323991. This commit breaks target that don't model all the register constraints in TableGen. So far the workaround was to set the hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the cases. For instance, when mutating an instruction (like in the lowering of COPYs) the isRenamable flag is not properly updated. The same problem will happen when attaching machine operand from one instruction to another. Geoff Berry is working on a fix in https://reviews.llvm.org/D43042. llvm-svn: 325421
* [DAG, X86] Revert r324797, r324491, and r324359.Chandler Carruth2018-02-172-116/+242
| | | | | | | | | | | | Sadly, r324359 caused at least PR36312. There is a patch out for review but it seems to be taking a bit and we've already had these crashers in tree for too long. We're hitting this PR in real code now and are blocked on shipping new compilers as a consequence so I'm reverting us back to green. Sorry for the churn due to the stacked changes that I had to revert. =/ llvm-svn: 325420
OpenPOWER on IntegriCloud