path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [mips] Correct the definition of cvt.d.w | Simon Dardis | 2018-02-20 | 1 | -3/+2
| An upcoming patch, D41434, changes the ordering of the matcher table for assembly. This patch corrects the definition of the normal MIPS cvt.d.w so that it is not available in microMIPS.
| llvm-svn: 325589
* [DEBUGINFO] Add support for emission of the inlined strings. | Alexey Bataev | 2018-02-20 | 3 | -0/+21
| Summary: Patch adds an option for emission of inlined strings rather than the .debug_str section.
| Reviewers: echristo, jlebar
| Subscribers: eraman, llvm-commits, JDevlieghere
| Differential Revision: https://reviews.llvm.org/D43390
| llvm-svn: 325583
* [PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when needed | Lei Huang | 2018-02-20 | 1 | -2/+11
| The current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers.
| Differential Revision: https://reviews.llvm.org/D42602
| llvm-svn: 325581
* [Hexagon] Fix alignment calculation of stack objects in Hexagon bit tracker | Krzysztof Parzyszek | 2018-02-20 | 3 | -6/+6
| llvm-svn: 325580
* [VectorLegalizer] Fix uint64_t typo in ExpandUINT_TO_FLOAT (PR36391) | Simon Pilgrim | 2018-02-20 | 1 | -1/+1
| ExpandUINT_TO_FLOAT can accept vXi32 or vXi64 inputs, so we need to use a uint64_t shift to generate the 2^(BW/2) constant.
| No test case, unfortunately, as no upstream target uses this, but it's affecting a downstream target.
| llvm-svn: 325578
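The shift-width issue described above can be sketched in isolation. This is a minimal illustration, not the code from the patch: `halfWidthPow2` is a hypothetical helper name, and the sketch only assumes that BW is 32 or 64 as the message states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the LLVM sources): builds 2^(BW/2) for a
// scalar bit width BW. With BW == 64 the shift amount is 32, so the shifted
// operand must be 64 bits wide; shifting a 32-bit `1` by 32 is undefined.
inline uint64_t halfWidthPow2(unsigned BW) {
  assert((BW == 32 || BW == 64) && "sketch only covers i32 and i64 elements");
  return UINT64_C(1) << (BW / 2);
}
```

For 32-bit elements the result is 2^16 and for 64-bit elements it is 2^32, which no longer fits in a 32-bit integer.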
* [ARM] Mark -1 as cheap in xor's for thumb1 | David Green | 2018-02-20 | 1 | -0/+7
| We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1 would not otherwise be considered a cheap constant. This prevents the -1's from being pulled out into constants and potentially hoisted.
| Differential Revision: https://reviews.llvm.org/D43451
| llvm-svn: 325573
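The identity behind this change is that xor with all-ones is a bitwise NOT, which is what an MVN-style instruction computes. A small sketch of that equivalence follows; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// xor with an all-ones constant (the "-1" in the commit message).
inline uint32_t xorAllOnes(uint32_t A) { return A ^ UINT32_C(0xFFFFFFFF); }

// Bitwise NOT, i.e. what a move-not instruction produces.
inline uint32_t bitwiseNot(uint32_t A) { return ~A; }

// Self-check over a few sample values.
inline void checkXorIsNot() {
  const uint32_t Samples[] = {0u, 1u, 0x0F0F0F0Fu, 0xFFFFFFFFu};
  for (uint32_t S : Samples)
    assert(xorAllOnes(S) == bitwiseNot(S));
}
```

Because the two forms always agree, the constant never needs to be materialized separately.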
* [llvm-mc] - Produce R_X86_64_PLT32 for "call/jmp foo". | George Rimar | 2018-02-20 | 6 | -2/+39
| For instructions like call foo and jmp foo, this patch changes the relocation produced from R_X86_64_PC32 to R_X86_64_PLT32.
| The relocation can be used as a marker for 32-bit PC-relative branches. The linker will reduce a PLT32 relocation to PC32 if the function is defined locally.
| Differential revision: https://reviews.llvm.org/D43383
| llvm-svn: 325569
* [AMDGPU] stop buffer_store being moved illegally | Tim Renouf | 2018-02-20 | 1 | -6/+2
| Summary: The machine instruction scheduler was illegally moving a buffer store past a buffer load with the same descriptor and offset. Fixed by marking buffer ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit.
| Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
| Differential Revision: https://reviews.llvm.org/D43332
| Change-Id: Iff3173d9e0653e830474546276ab9d30318b8ef7
| llvm-svn: 325567
* [MC] - Don't crash on unclosed frame. | George Rimar | 2018-02-20 | 1 | -3/+4
| llvm-mc can crash when there is a .cfi_startproc without a matching .cfi_endproc:
|   .text
|   .globl foo
|   foo:
|   .cfi_startproc
| The testcase shows the issue; the patch fixes it.
| Differential revision: https://reviews.llvm.org/D43456
| llvm-svn: 325564
* [X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics. | Craig Topper | 2018-02-20 | 2 | -9/+46
| The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select.
| llvm-svn: 325559
* Report fatal error in the case of out of memory | Serge Pavlov | 2018-02-20 | 9 | -18/+16
| This is the second part of the recommit of r325224. The previous part was committed in r325426, which deals with C++ memory allocation. The solution for C memory allocation involved functions `llvm::malloc` and similar. This was a fragile solution because it caused ambiguity errors in some cases. In this commit the new functions have names like `llvm::safe_malloc`.
| The relevant part of the original comment is below, updated for the new function names.
| Analysis of failures in the case of out of memory errors can be tricky on Windows. Such an error emerges at the point where the memory allocation function fails, but manifests itself when the null pointer is used. These two points may be distant from each other. Besides, subsequent runs may not exhibit the allocation error.
| In some cases memory is allocated by a call to one of the C allocation functions: malloc, calloc and realloc. They are used for interoperability with C code, when the allocated object has variable size, and when it is necessary to avoid calling constructors. In many calls the result is not checked for a null pointer. To simplify checks, new functions are defined in the namespace 'llvm': `safe_malloc`, `safe_calloc` and `safe_realloc`. They behave as the corresponding standard functions but produce a fatal error if allocation fails. This change replaces the standard functions like 'malloc' in the cases when the result of the allocation function is not checked for a null pointer.
| Finally, there is plain C code that uses malloc and similar functions. If the result is not checked, an assert statement is added.
| Differential Revision: https://reviews.llvm.org/D43010
| llvm-svn: 325551
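The idea of a checked allocation wrapper can be sketched as follows. This is a standalone illustration, not the `llvm::safe_malloc` implementation: the name `checkedMalloc`, the error text, the use of `std::abort`, and the zero-size handling are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical wrapper in the spirit of the commit message: behaves like
// malloc, but terminates at the point of failure instead of returning a null
// pointer that may only be dereferenced much later.
inline void *checkedMalloc(std::size_t Size) {
  // Assumption of this sketch: request at least one byte, because malloc(0)
  // is allowed to return a null pointer even when memory is available.
  void *Ptr = std::malloc(Size != 0 ? Size : 1);
  if (Ptr == nullptr) {
    std::fputs("fatal error: allocation failed\n", stderr);
    std::abort();
  }
  return Ptr;
}
```

A caller can then use the result without a null check, and release it with `std::free` as usual.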
* [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first. | Amara Emerson | 2018-02-20 | 1 | -4/+31
| This is a follow on commit to r[x] where we fix the other direction of copy. For this case, after converting the source from gpr32 -> fpr32, we use a subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land.
| https://reviews.llvm.org/D43444
| llvm-svn: 325550
* [X86] Make XOP VPCOM instructions commutable to fold loads during isel. | Craig Topper | 2018-02-20 | 3 | -52/+75
| llvm-svn: 325547
* [X86] Make a helper function for commuting AVX512 VPCMP immediates since we do it in two places. | Craig Topper | 2018-02-20 | 3 | -24/+24
| llvm-svn: 325546
* [InstCombine] use CreateWithCopiedFlags to reduce code; NFCI | Sanjay Patel | 2018-02-19 | 1 | -7/+6
| Also, move the folds with constants closer to make it easier to follow.
| llvm-svn: 325541
* Revert "[mem2reg] Use range loops (NFCI)" | Brian Gesiak | 2018-02-19 | 1 | -8/+9
| This reverts commit r325532.
| llvm-svn: 325539
* [X86] Use vpmovq2m/vpmovd2m for truncate to vXi1 when possible. | Craig Topper | 2018-02-19 | 1 | -0/+4
| Previously we used vptestmd, but the scheduling data for SKX says vpmovq2m/vpmovd2m is lower latency. We already used vpmovb2m/vpmovw2m for byte/word truncates. So this is more consistent anyway.
| llvm-svn: 325534
* [InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math | Sanjay Patel | 2018-02-19 | 1 | -2/+3
| It's possible that we could allow this with either 'arcp' or 'reassoc' alone, but this should be conservatively better than what we have right now. GCC allows this with only -freciprocal-math.
| The last test is changed to show a case that is expected to fold, but we need D43398.
| llvm-svn: 325533
* [mem2reg] Use range loops (NFCI) | Brian Gesiak | 2018-02-19 | 1 | -9/+8
| Summary: Several for loops in PromoteMemoryToRegister.cpp leave their increment expression empty, instead incrementing the iterator within the for loop body. I believe this is because these loops were previously implemented as while loops; see https://reviews.llvm.org/rL188327.
| Incrementing the iterator within the body of the for loop instead of in its increment expression makes it seem like the iterator will be modified or conditionally incremented within the loop, but that is not the case in these loops. Instead, use range loops.
| Test Plan: `check-llvm`
| Reviewers: davide, bkramer
| Reviewed By: davide, bkramer
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D43473
| llvm-svn: 325532
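The loop shape described in the summary, and the range-based form that replaces it, can be compared on a plain container. This sketch uses `std::vector<int>` and made-up function names; it is not the code from PromoteMemoryToRegister.cpp.

```cpp
#include <cassert>
#include <vector>

// Before: empty increment expression, iterator advanced inside the body.
// A reader has to scan the body to learn that every element is visited once.
inline int sumManualIncrement(const std::vector<int> &Values) {
  int Sum = 0;
  for (auto I = Values.begin(), E = Values.end(); I != E;) {
    int V = *I;
    ++I; // unconditional, so a range loop expresses the same traversal
    Sum += V;
  }
  return Sum;
}

// After: the range-based loop makes the unconditional traversal explicit.
inline int sumRangeLoop(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```

The rewrite is only behavior-preserving when the body never erases from or otherwise invalidates the container being iterated.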
* [InstCombine] refactor fdiv with constant dividend folds; NFC | Sanjay Patel | 2018-02-19 | 1 | -26/+27
| The last fold that used to be here was not necessary. That's a combination of 2 folds (and there's a regression test to show that).
| The transforms are guarded by isFast(), but that should be loosened.
| llvm-svn: 325531
* [Coroutines] Move debug statement before assert | Brian Gesiak | 2018-02-19 | 1 | -1/+2
| Summary: Move a debug statement to above where an assertion is hit, so that the debug statement can be inspected before a stack trace.
| Test Plan: `check-llvm`
| llvm-svn: 325529
* [X86] Stop swapping the operands of AVX512 setge. | Craig Topper | 2018-02-19 | 1 | -2/+2
| We swapped the operands and used setle, but I don't see any reason to do that. I think this is a holdover from SSE where we swap and then invert to use pcmpgt. But with AVX512 we don't want an invert so we won't use pcmpgt. So there's no need to swap.
| llvm-svn: 325527
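The three formulations mentioned in the message compute the same predicate on scalars, which a short sketch can show. The function names are illustrative, and scalar `int32_t` stands in for the vector lanes.

```cpp
#include <cassert>
#include <cstdint>

// a >= b written directly.
inline bool geDirect(int32_t A, int32_t B) { return A >= B; }

// Operands swapped, "less than or equal" used instead.
inline bool geSwappedLe(int32_t A, int32_t B) { return B <= A; }

// Swap, compare greater-than, then invert the result.
inline bool geSwappedGtInverted(int32_t A, int32_t B) { return !(B > A); }

// Self-check over a small grid of values.
inline void checkGeForms() {
  for (int32_t A = -2; A <= 2; ++A)
    for (int32_t B = -2; B <= 2; ++B) {
      assert(geDirect(A, B) == geSwappedLe(A, B));
      assert(geDirect(A, B) == geSwappedGtInverted(A, B));
    }
}
```

Since all three agree, the choice between them is purely about which one maps to the cheaper instruction sequence on a given target.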
* [X86] Reduce the number of isel pattern variations needed for VPTESTM/VPTESTNM matching. | Craig Topper | 2018-02-19 | 2 | -16/+32
| Canonicalize EQ/NE PCMPM to have build vector all zeros on the RHS so we don't have to pattern match it in both locations. This significantly reduces the number of isel patterns needed since we also had to multiply it out with loads being in either operand of the 'and' input node and in the 'and' masking node.
| This removes over 24000 bytes from the isel table.
| llvm-svn: 325526
* bitcode support change for fast flags compatibility | Steven Wu | 2018-02-19 | 2 | -15/+17
| Summary: As discussed, and as needed, each vendor needs a way to keep the old fast flags and the new fast flags in the auto-upgrade path of the IR upgrader. This revision addresses that issue.
| Patch by Michael Berg
| Reviewers: qcolombet, hans, steven_wu
| Reviewed By: qcolombet, steven_wu
| Subscribers: dexonsmith, vsk, mehdi_amini, andrewrk, MatzeB, wristow, spatel
| Differential Revision: https://reviews.llvm.org/D43253
| llvm-svn: 325525
* [AMDGPU] Make note of existing waitcnt instrs; this is add-on work related to suppression of redundant waitcnt instrs. | Mark Searles | 2018-02-19 | 1 | -18/+16
| It is necessary to make note of these existing waitcnt instrs so that we do not fall into an infinite loop when handling loops. Also, [NFC] some minor code clean-up.
| llvm-svn: 325524
* [SelectionDAG] ComputeKnownBits - add support for SMIN+SMAX clamp patterns | Simon Pilgrim | 2018-02-19 | 1 | -5/+32
| If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO), then we can deduce that the number of signbits (zeros/ones) will be at least the minimum of the LO and HI constants.
| ComputeKnownBits equivalent of D43338.
| Differential Revision: https://reviews.llvm.org/D43463
| llvm-svn: 325521
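The bound stated above can be checked on scalars. The sketch below works on plain `int32_t` values rather than on SelectionDAG nodes, and all three function names are hypothetical; it only illustrates why a clamped value has at least as many sign bits as the weaker of the two bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of leading bits that equal the sign bit (the sign bit included).
inline unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  if (V < 0)
    U = ~U; // turn leading ones into leading zeros
  unsigned N = 0;
  for (uint32_t Mask = UINT32_C(1) << 31; Mask != 0 && (U & Mask) == 0;
       Mask >>= 1)
    ++N;
  return N;
}

// SMIN(SMAX(X, LO), HI) on scalars.
inline int32_t clampValue(int32_t X, int32_t LO, int32_t HI) {
  return std::min(std::max(X, LO), HI);
}

// Lower bound on the sign bits of any value clamped to [LO, HI].
inline unsigned clampMinSignBits(int32_t LO, int32_t HI) {
  assert(LO <= HI && "sketch assumes a non-empty clamp range");
  return std::min(numSignBits(LO), numSignBits(HI));
}
```

For example, clamping to [-256, 255] guarantees at least 24 sign bits in a 32-bit value, whatever X was.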
* [AMDGPU] Increased vector length for global/constant loads. | Mark Searles | 2018-02-19 | 2 | -2/+34
| Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords.
| Author: FarhanaAleen
| Reviewed By: rampitec
| Subscribers: llvm-commits, AMDGPU
| Differential Revision: https://reviews.llvm.org/D43275
| llvm-svn: 325518
* [CodeGen] Refactor AppleAccelTable | Pavel Labath | 2018-02-19 | 3 | -155/+242
| Summary: This commit separates the abstract accelerator table data structure from the code for writing out an on-disk representation of a specific accelerator table format.
| The idea is that the former (now called AccelTable<T>) can be reused for the DWARF v5 accelerator tables as-is, without any further customizations.
| Some bits of the emission code (now living in the EmissionContext class) can be reused for DWARF v5 as well, but the subtle differences in the layout of various subtables mean the sharing is not always possible. (Also, the individual emit*** functions are fairly simple so there's a tradeoff between making a bigger general-purpose function, and two smaller targeted functions.)
| Another advantage of this setup is that more of the serialization logic can be hidden in the .cpp file -- I have moved declarations of the header and all the emission functions there.
| Reviewers: JDevlieghere, aprantl, probinson, dblaikie
| Subscribers: echristo, clayborg, vleschuk, llvm-commits
| Differential Revision: https://reviews.llvm.org/D43285
| llvm-svn: 325516
* Bring back r323297. | Rafael Espindola | 2018-02-19 | 1 | -7/+14
| It was reverted because it broke the grub build. The reason the grub build broke is because grub does its own relocation processing and was not handling R_386_PLT32. Since grub has no dynamic linker, the fix is trivial: handle R_386_PLT32 exactly like R_386_PC32.
| On the report it was noted that they are using -fno-integrated-assembler. The upstream GAS (starting with 451875b4f976a527395e9303224c7881b65e12ed) will already be producing a R_386_PLT32 anyway, so they have to update their code one way or the other.
| Original message:
| Don't assume a null GV is local for ELF and MachO.
| This is already a simplification, and should help with avoiding a plt reference when calling an intrinsic with -fno-plt.
| With this change we return false for null GVs, so the caller only needs to check the new metadata to decide if it should use foo@plt or *foo@got.
| llvm-svn: 325514
* [CodeGen] Fix tests breaking after r325505 | Francis Visoiu Mistrih | 2018-02-19 | 1 | -2/+0
| llvm-svn: 325512
* [ThinLTO] Add GraphTraits for FunctionSummaries | Charles Saternos | 2018-02-19 | 3 | -1/+32
| Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.
| Third attempt - moved function from lambda to static function due to build failures.
| llvm-svn: 325506
* Revert "[CodeGen] Move printing '\n' from MachineInstr::print to MachineBasicBlock::print" | Francis Visoiu Mistrih | 2018-02-19 | 5 | -41/+24
| This reverts commit r324681.
| llvm-svn: 325505
* [X86][SSE] combineTruncateWithSat - use truncateVectorWithPACK down to 64-bit subvectors | Simon Pilgrim | 2018-02-19 | 1 | -13/+26
| Add support for chaining PACKSS/PACKUS down to 64-bit vectors by using only a single 128-bit input.
| llvm-svn: 325494
* [Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics | Ivan A. Kosarev | 2018-02-19 | 1 | -1/+3
| With this patch in place, when a new-format TBAA tag is available for a memory-transfer intrinsic call, we prefer propagating that new-format tag. Otherwise, we fall back to the old approach where we try to construct a proper TBAA access tag from 'tbaa.struct' metadata.
| Differential Revision: https://reviews.llvm.org/D41543
| llvm-svn: 325488
* [llvm-opt-fuzzer] Add another pack of passes for continuous fuzzing | Igor Laevsky | 2018-02-19 | 1 | -5/+25
| Differential Revision: https://reviews.llvm.org/D43384
| llvm-svn: 325487
* [AVR] Set the program address space in the data layout | Dylan McKay | 2018-02-19 | 1 | -1/+1
| This adds the program memory address space setting to the AVR data layout. This setting was very recently added under r325479.
| At the moment, there are no uses of this setting. In the future, things such as switch lookup tables should reside there.
| llvm-svn: 325481
* Add default address space for functions to the data layout (1/3) | Dylan McKay | 2018-02-19 | 1 | -3/+14
| Summary: This adds initial support for letting targets specify which address spaces their functions should reside in by default.
| If a function is created by a frontend, it will get the default address space specified in the DataLayout, unless the frontend explicitly uses a more general `llvm::Function` constructor. Function address spaces will become a part of the bitcode and textual IR forms, as we do not have access to a data layout whilst parsing LL.
| It will be possible to write IR that explicitly has `addrspace(n)` on a function. In this case, the function will reside in the specified space, ignoring the default in the DL.
| This is the first step towards placing functions into the correct address space for Harvard architectures.
| Full patchset
|   * Add program address space to data layout D37052
|   * Require address space to be specified when creating functions D37054
|   * [clang] Require address space to be specified when creating functions D37057
| Reviewers: pcc, arsenm, kparzysz, hfinkel, theraven
| Reviewed By: theraven
| Subscribers: arichardson, simoncook, rengolin, wdng, uabelho, bjope, asb, llvm-commits
| Differential Revision: https://reviews.llvm.org/D37052
| llvm-svn: 325479
* [AVR] Fix a lowering bug in AVRISelLowering.cpp | Dylan McKay | 2018-02-19 | 1 | -4/+6
| The parseFunctionArgs() method was directly reading the arguments from a Function object, but it should have used the arguments supplied by the SelectionDAGBuilder.
| This was causing the lowering code to only lower one argument, not two in some cases.
| Thanks to @brainlag on GitHub for coming up with the working fix!
| Patch-by: @brainlag on GitHub
| llvm-svn: 325474
* Add LanaiMCTargetDesc.h to LanaiInstrInfo.h to make it self contained | Eric Christopher | 2018-02-19 | 1 | -0/+1
| with instruction enum definitions.
| llvm-svn: 325473
* [X86] Correct a typo I made in combineToExtendCMOV recently. | Craig Topper | 2018-02-18 | 1 | -1/+1
| We're accidentally checking that the same node is a constant twice instead of checking the other node. This isn't a functional problem since we didn't do anything below that explicitly requires constants. It just means we may have introduced a sign_extend or zero_extend that won't fold out.
| llvm-svn: 325469
* [PatternMatch, InstSimplify] enhance m_AllOnes() to ignore undef elements in vectors | Sanjay Patel | 2018-02-18 | 1 | -11/+7
| Loosening the matcher definition reveals a subtle bug in InstSimplify (we should not assume that because an operand constant matches that it's safe to return it as a result). So I'm making that change here too (that diff could be independent, but I'm not sure how to reveal it before the matcher change).
| This also seems like a good reason to *not* include matchers that capture the value. We don't want to encourage the potential misstep of propagating undef values when it's not allowed/intended.
| I didn't include the capture variant option here or in the related rL325437 (m_One), but it already exists for other constant matchers.
| llvm-svn: 325466
* Fix unused assertion variable warning. | Amara Emerson | 2018-02-18 | 1 | -0/+1
| llvm-svn: 325464
* [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied | Amara Emerson | 2018-02-18 | 1 | -0/+25
| to gpr register banks.
| PR36345.
| rdar://36478867
| Differential Revision: https://reviews.llvm.org/D43310
| llvm-svn: 325463
* [AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits. | Amara Emerson | 2018-02-18 | 1 | -3/+19
| These are needed for operations on fp16 types in a later patch.
| llvm-svn: 325462
* [Support] Replace hand-written scope_exit with make_scope_exit. | Benjamin Kramer | 2018-02-18 | 1 | -23/+3
| No functionality change intended.
| llvm-svn: 325460
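For readers unfamiliar with the scope-exit idiom, a minimal standalone guard looks roughly like this. It is a sketch of the general technique with hypothetical names (`ScopeExit`, `makeScopeExit`), not the LLVM `make_scope_exit` implementation.

```cpp
#include <cassert>
#include <utility>

// Runs the stored callable when the guard is destroyed, unless it was
// moved from.
template <typename Callable> class ScopeExit {
  Callable ExitFn;
  bool Engaged = true;

public:
  explicit ScopeExit(Callable Fn) : ExitFn(std::move(Fn)) {}
  ScopeExit(ScopeExit &&Other)
      : ExitFn(std::move(Other.ExitFn)), Engaged(Other.Engaged) {
    Other.Engaged = false; // the moved-from guard must not fire
  }
  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;
  ~ScopeExit() {
    if (Engaged)
      ExitFn();
  }
};

// Factory so the callable type is deduced at the call site.
template <typename Callable> ScopeExit<Callable> makeScopeExit(Callable Fn) {
  return ScopeExit<Callable>(std::move(Fn));
}

// Usage: the counter is bumped only when the guard leaves scope.
inline int runGuardedBlock() {
  int Count = 0;
  {
    auto Guard = makeScopeExit([&Count] { ++Count; });
    assert(Count == 0); // cleanup has not run yet
  }
  return Count;
}
```

Replacing a hand-written guard class with a shared factory of this kind is what lets a change like this one delete far more lines than it adds.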
* [AArch64] Coalesce Copy Zero during instruction selection | Haicheng Wu | 2018-02-18 | 1 | -1/+29
| Add special case for copy of zero to avoid a double copy.
| Differential Revision: https://reviews.llvm.org/D36104
| llvm-svn: 325459
* [BPF] Return true in enableMultipleCopyHints(). | Jonas Paulsson | 2018-02-18 | 1 | -0/+2
| Enable multiple COPY hints to eliminate more COPYs during register allocation.
| Note that this is something all targets should do, see https://reviews.llvm.org/D38128.
| Review: Yonghong Song
| llvm-svn: 325457
* [X86] Make masked pcmpeq commutable during isel so we can fold loads in other operand to the shorter encoding. | Craig Topper | 2018-02-18 | 2 | -2/+4
| Previously we used the immediate encoding if the load was in operand 0 and the short encoding if the load was in operand 1. This added an insane number of bytes to the size of the isel table.
| I'm wondering if we should always use the immediate form during isel and change to the short form during emission. This would remove the need to pattern match every combination for both the immediate form and the short form during isel. We could do the same with vpcmpgt.
| llvm-svn: 325456
* Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries | Simon Pilgrim | 2018-02-18 | 3 | -32/+1
| Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.
| Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).
| Reverted due to buildbot failures
| llvm-svn: 325454
* Fix Wparentheses warning. NFCI | Simon Pilgrim | 2018-02-17 | 1 | -1/+1
| llvm-svn: 325451