summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* GlobalISel: Partially implement widenScalar for MERGE_VALUESMatt Arsenault2019-01-291-7/+8
| | | | llvm-svn: 352560
* [AArch64][GlobalISel] Unmerge into scalars from a vector should use FPR bank.Amara Emerson2019-01-291-1/+5
| | | | | | | | | This currently shows up as a selection fallback since the dest regs were given GPR banks but the source was a vector FPR reg. Differential Revision: https://reviews.llvm.org/D57408 llvm-svn: 352545
* GlobalISel: Fix narrowScalar for load/store with different mem sizeMatt Arsenault2019-01-291-2/+22
| | | | | | | | | | This was ignoring the memory size, and producing multiple loads/stores if the operand size was different from the memory size. I assume this is the intent of not having an explicit G_ANYEXTLOAD (although I think that would probably be better). llvm-svn: 352523
* [X86][Btver2] Improved latency/throughput model for scalar int-to-float ↵Andrea Di Biagio2019-01-293-14/+16
| | | | | | | | | | | | | | conversions. Account for bypass delays when computing the latency of scalar int-to-float conversions. On Jaguar we need to account for an extra 6cy latency (see AMD fam16h SOG). This patch also fixes the number of micropcodes for the register-memory variants of scalar int-to-float conversions. Differential Revision: https://reviews.llvm.org/D57148 llvm-svn: 352518
* Adjust documentation for git migration.James Y Knight2019-01-291-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes most references to the paths: llvm.org/svn/ llvm.org/git/ llvm.org/viewvc/ github.com/llvm-mirror/ github.com/llvm-project/ reviews.llvm.org/diffusion/ to instead point to https://github.com/llvm/llvm-project. This is *not* a trivial substitution, because additionally, all the checkout instructions had to be migrated to instruct users on how to use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of checking out various projects into various subdirectories. I've attempted to not change any scripts here, only documentation. The scripts will have to be addressed separately. Additionally, I've deleted one document which appeared to be outdated and unneeded: lldb/docs/building-with-debug-llvm.txt Differential Revision: https://reviews.llvm.org/D57330 llvm-svn: 352514
* [AMDGPU] Fix a weird WWM intrinsic issue.Neil Henning2019-01-292-17/+17
| | | | | | | | | | | I found a really strange WWM issue through a very convoluted shader that essentially boils down to a bug in SIInstrInfo where canReadVGPR did not correctly identify that WWM is like a copy and can have a VGPR as its source. Differential Revision: https://reviews.llvm.org/D56002 llvm-svn: 352500
* [WebAssembly] Re-enable main-function signature rewritingDan Gohman2019-01-291-12/+24
| | | | | | | | | | | | | | | | Re-enable the code to rewrite main-function signatures into "int main(int argc, char *argv[])", but limited to only handling the case of "int main(void)", so that it doesn't silently strip an argument in the "int main(int argc, char *argv[], char *envp[])" case. This allows main to be called by C startup code, since WebAssembly requires caller and callee signatures to match, so it can't rely on passing main a different number of arguments than it expects. Differential Revision: https://reviews.llvm.org/D57323 llvm-svn: 352479
* [ARM] Use sub for negative offset load/store in thumb1David Green2019-01-292-6/+47
| | | | | | | | | | | This attempts to optimise negative values used in load/store operands a little. We currently try to selct them as rr, materialising the negative constant using a MOV/MVN pair. This instead selects ri with an immediate of 0, forcing the add node to become a simpler sub. Differential Revision: https://reviews.llvm.org/D57121 llvm-svn: 352475
* [COFF, ARM64] Don't put jump table into a separate COFF section for ↵Martin Storsjo2019-01-292-4/+13
| | | | | | | | | | | | | | | | | | | | | | EK_LabelDifference32 Windows ARM64 has PIC relocation model and uses jump table kind EK_LabelDifference32. This produces jump table entry as ".word LBB123 - LJTI1_2" which represents the distance between the block and jump table. A new relocation type (IMAGE_REL_ARM64_REL32) is needed to do the fixup correctly if they are in different COFF section. This change saves the jump table to the same COFF section as the associated code. An ideal fix could be utilizing IMAGE_REL_ARM64_REL32 relocation type. Patch by Tom Tan! Differential Revision: https://reviews.llvm.org/D57277 llvm-svn: 352465
* Fix compiler warning when using clang 3.6.0Mikael Holmen2019-01-291-1/+1
| | | | | | | | | | | | Without the fix we get the following (with -Werror): ../lib/Target/X86/X86ISelLowering.cpp:14181:58: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] SmallVector<std::array<int, 2>, 2> LaneSrcs(NumLanes, {-1, -1}); ^~~~~~ { } 1 error generated. llvm-svn: 352455
* [WebAssembly] Handle more types of uses in WebAssemblyAddMissingPrototypesSam Clegg2019-01-291-37/+10
| | | | | | | | | | | | | | Previously we were only handling bitcast operations, however prototypeless functions can also appear in other places such as comparisons and as function params. Switch to using replaceAllUsesWith() to replace the prototype-less function uses. This new approach results in some redundant bitcasting but is much simpler and handles all cases. Differential Revision: https://reviews.llvm.org/D56938 llvm-svn: 352445
* [PPC] Include tablegenerated PPCGenCallingConv.inc onceReid Kleckner2019-01-297-147/+137
| | | | | | | | | Move the CC analysis implementation to its own .cpp file instead of duplicating it and artificually using functions in PPCISelLowering.cpp and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and AArch64. llvm-svn: 352444
* [WebAssembly] Expand BUILD_PAIR nodesThomas Lively2019-01-281-0/+3
| | | | | | | | | | Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D57276 llvm-svn: 352442
* Recommit r352255 "[SelectionDAG][X86] Don't use SEXTLOAD for promoting ↵Craig Topper2019-01-281-3/+15
| | | | | | | | | | | | | | | | masked loads in the type legalizer" This did not cause the buildbot failure it was previously reverted for. Original commit message: I'm not sure why we were using SEXTLOAD. EXTLOAD seems more appropriate since we don't care about the upper bits. This patch changes this and then modifies the X86 post legalization combine to emit a extending shuffle instead of a sign_extend_vector_inreg. Could maybe use an any_extend_vector_inre On AVX512 targets I think we might be able to use a masked vpmovzx and not have to expand this at all. llvm-svn: 352433
* [ARM] Deduplicate table generated CC analysis codeReid Kleckner2019-01-286-275/+324
| | | | | | | Create ARMCallingConv.cpp and emit code for calling convention analysis from there. llvm-svn: 352431
* [AArch64] Include AArch64GenCallingConv.inc onceReid Kleckner2019-01-286-129/+171
| | | | | | | | | | | | | | | | | | | | Summary: Avoids duplicating generated static helpers for calling convention analysis. This also means you can modify AArch64CallingConv.td without recompiling the AArch64ISelLowering.cpp monolith, so it provides faster incremental rebuilds. Saves 12K in llc.exe, but adds a new object file, which is large. Reviewers: efriedma, t.p.northover Subscribers: mgorny, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56948 llvm-svn: 352430
* [GlobalISel][AArch64] Add legalization for G_FLOGJessica Paquette2019-01-282-1/+2
| | | | | | | | | | This adds support for legalizing G_FLOG into a RTLib call. It adds a legalizer test, and updates the existing floating point tests. https://reviews.llvm.org/D57347 llvm-svn: 352429
* AMDGPU: Add DS append/consume intrinsicsMatt Arsenault2019-01-284-17/+94
| | | | | | | | | | | | | | | Since these pass the pointer in m0 unlike other DS instructions, these need to worry about whether the address is uniform or not. This assumes the address is dynamically uniform, and just uses readfirstlane to get a copy into an SGPR. I don't know if these have the same 16-bit add for the addressing mode offset problem on SI or not, but I've just assumed they do. Also includes some misc. changes to avoid test differences between the LDS and GDS versions. llvm-svn: 352422
* [GlobalISel][AArch64] Add instruction selection support for @llvm.log10Jessica Paquette2019-01-282-1/+2
| | | | | | | | | | This adds instruction selection support for @llvm.log10 in AArch64. It teaches GISel to lower it to a library call, updates the relevant tests, and adds a legalizer test for log10. https://reviews.llvm.org/D57341 llvm-svn: 352418
* [AArch64] Add 'apple-latest' CPU aliasFrancis Visoiu Mistrih2019-01-281-0/+3
| | | | | | | | | | | | | | The 'apple-latest' alias is supposed to provide a CPU that contains the latest Apple processor model supported by LLVM. This is supposed to be used by tools like lldb to provide a target that supports most of the CPU features. For now, this is mapped to Cyclone. Differential Revision: https://reviews.llvm.org/D56384 llvm-svn: 352412
* [CodeGen][X86] Expand UADDSAT to NOT+UMIN+ADDNikita Popov2019-01-281-0/+7
| | | | | | | | | Followup to D56636, this time handling the UADDSAT case by expanding uadd.sat(a, b) to umin(a, ~b) + b. Differential Revision: https://reviews.llvm.org/D56869 llvm-svn: 352409
* [GlobalISel][AArch64] Add instruction selection support for G_FCOS and G_FSINJessica Paquette2019-01-282-1/+14
| | | | | | | | | | | | This contains all of the legalizer changes from D57197 necessary to select G_FCOS and G_FSIN. It also updates several existing IR tests in test/CodeGen/AArch64 that verify that we correctly lower the G_FCOS and G_FSIN instructions. https://reviews.llvm.org/D57197 3/3 llvm-svn: 352402
* [X86][AVX] Remove lowerShuffleByMerging128BitLanes 2-lane restrictionSimon Pilgrim2019-01-281-7/+10
| | | | | | | | First step towards adding support for 64-bit unary "sublane" handling (a bit like lowerShuffleAsRepeatedMaskAndLanePermute). This allows us to add lowerV64I8Shuffle handling. llvm-svn: 352389
* [x86] allow more shuffle splitting to avoid vpermps (PR40434)Sanjay Patel2019-01-281-1/+3
| | | | | | | | | | | | | | | This is tricky to make optimal: sometimes we're better off using a single wider op, but other times it makes more sense to combine a narrow ops to achieve the same result. This solves the case from: https://bugs.llvm.org/show_bug.cgi?id=40434 There's potentially a similar change for vectors with 64-bit elements, but it needs adjustments similar to rL352333 to avoid creating infinite loops. llvm-svn: 352380
* Remove no longer needed Arm specific LICENSE.TXT file.Arnaud A. de Grandmaison2019-01-281-47/+0
| | | | | | | | | As the codebase is now under the Apache 2.0 license with LLVM Exceptions, and all Arm's contributions, past or future, are under that new license, this Arm specific LICENSE.TXT is no longer needed, thus removing it. llvm-svn: 352376
* [mips] Support for +abs2008 attributeAleksandar Beserminji2019-01-287-5/+91
| | | | | | | | | | | | | | | | Instruction abs.[ds] is not generating correct result when working with NaNs for revisions prior mips32r6 and mips64r6. To generate a sequence which always produce a correct result, but also to allow user more control on how his code is compiled, attribute +abs2008 is added, so user can choose legacy or 2008. By default legacy mode is used on revisions prior R6. Mips32r6 and mips64r6 use abs2008 mode by default. Differential Revision: https://reviews.llvm.org/D35983 llvm-svn: 352370
* [AMDGPU] Add intrinsics for 16 bit interpolationTim Corringham2019-01-286-3/+96
| | | | | | | | | | | | | | | | | | | Summary: Added the intrinsics llvm.amdgcn.interp.p1.f16() and llvm.amdgcn.interp.p2.f16() and related LIT test. The p1 intrinsic generates code appropriate for both 16 and 32 bank LDS. Reviewers: #amdgpu, dstuttard, arsenm, tpr Reviewed By: #amdgpu, arsenm Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46754 llvm-svn: 352357
* [MIPS GlobalISel] Select subPetar Avramovic2019-01-282-2/+4
| | | | | | | | | Lower G_USUBO and G_USUBE. Add narrowScalar for G_SUB. Legalize and select G_SUB for MIPS 32. Differential Revision: https://reviews.llvm.org/D53416 llvm-svn: 352351
* [ARM GlobalISel] Support integer division for Thumb2Diana Picus2019-01-281-19/+21
| | | | | | | | | Support G_SDIV, G_UDIV, G_SREM and G_UREM. The only significant difference between arm and thumb mode is that we need to check a different subtarget feature. llvm-svn: 352346
* [X86] Add new variadic avx512 compress/expand intrinsics that use vXi1 types ↵Craig Topper2019-01-282-74/+3
| | | | | | | | for the mask argument. Remove and autoupgrade the old intrinsics llvm-svn: 352343
* [AArch64][GlobalISel] Teach RBS about G_FNEG default mapping.Amara Emerson2019-01-281-0/+1
| | | | llvm-svn: 352340
* [AArch64][GlobalISel] Add some missing vector support for FP arithmetic ops.Amara Emerson2019-01-281-2/+2
| | | | | | | Moved the fneg lowering legalization test from AArch64 to X86, as we want to specify that it's already legal. llvm-svn: 352338
* [AArch64][GlobalISel] Add some vector support for fp <-> int conversions.Amara Emerson2019-01-282-2/+6
| | | | | | Some unrelated, but benign, test changes as well due to the test update script. llvm-svn: 352337
* [x86] add restriction for lowering to vpermpsSanjay Patel2019-01-271-2/+19
| | | | | | | | | This transform was added with rL351346, and we had an escape for shufps, but we also want one for unpckps vs. vpermps because vpermps doesn't take an immediate shuffle index operand. llvm-svn: 352333
* [X86][SSE] Add UNDEF handling to combineSelect ISD::USUBSAT matching (PR40083)Simon Pilgrim2019-01-271-5/+7
| | | | llvm-svn: 352330
* [X86][SSE] Permit UNDEFs in combineAddToSUBUS matching (PR40083)Simon Pilgrim2019-01-271-3/+4
| | | | llvm-svn: 352328
* [x86] refactor logic in lowerShuffleWithUndefHalfSanjay Patel2019-01-271-28/+49
| | | | | | | | Although this is longer code, this is no-functional-change-intended. The goal is to untangle the conditions under which we bail out, so that's easier to adjust. llvm-svn: 352320
* [X86] Add some missing blsr patternsGabor Buella2019-01-271-2/+10
| | | | | | | | | | | | | | | | | | The add+and sequence followed by a branch can happen e.g. when looping over the set bits of an integer: ``` while (x != 0) { func(x & ~x); x &= x - 1; } ``` Reviewed By: ctopper Differential Revision: https://reviews.llvm.org/D57296 llvm-svn: 352306
* [X86] Add a pattern for (i64 (and (anyext def32:), 0x00000000FFFFFFFF)) to ↵Craig Topper2019-01-271-0/+2
| | | | | | | | | | produce SUBREG_TO_REG def32 here means the producing instruction zeroed bits 63:32. We already do this for zext, but it looks like we can get an and+anyext sometimes. Spotted in the diffs from D33587. llvm-svn: 352303
* GlobalISel: Implement narrowScalar for mulMatt Arsenault2019-01-271-0/+1
| | | | llvm-svn: 352300
* GlobalISel: fewerElementsVector for intrinsic_trunc/intrinsic_roundMatt Arsenault2019-01-271-1/+2
| | | | llvm-svn: 352298
* AMDGPU/GlobalISel: Use scalarize instead of clampMaxNumElementsMatt Arsenault2019-01-261-2/+1
| | | | llvm-svn: 352297
* AMDGPU/GlobalISel: Legalize more bit opsMatt Arsenault2019-01-261-4/+7
| | | | llvm-svn: 352295
* AMDGPU/GlobalISel: Widen small uaddo/usuboMatt Arsenault2019-01-261-1/+2
| | | | llvm-svn: 352294
* [X86] combineAddOrSubToADCOrSBB/combineCarryThroughADD - use oneuse for ↵Simon Pilgrim2019-01-261-2/+3
| | | | | | | | | | entire SDNode Fix issue noted in D57281 that only tested the one use for the SDValue (the result flag), not the entire SUB. I've added the getNode() to make it clearer what is intended than just the -> redirection. llvm-svn: 352291
* [X86] combineCarryThroughADD - add support for X86::COND_A commutations ↵Simon Pilgrim2019-01-261-6/+25
| | | | | | | | | | (PR24545) As discussed on PR24545, we should try to commute X86::COND_A 'icmp ugt' cases to X86::COND_B 'icmp ult' to more optimally bind the carry flag output to a SBB instruction. Differential Revision: https://reviews.llvm.org/D57281 llvm-svn: 352289
* [X86] Fold X86ISD::SBB(ISD::SUB(X,Y),0) -> X86ISD::SBB(X,Y) (PR25858)Simon Pilgrim2019-01-261-0/+9
| | | | | | | | | | We often generate X86ISD::SBB(X, 0) for carry flag arithmetic. I had tried to create test cases for the ADC equivalent (which often uses the same pattern) but haven't managed to find anything yet. Differential Revision: https://reviews.llvm.org/D57169 llvm-svn: 352288
* [X86][SSE] Generalized unsigned compares to support nonsplat constant ↵Simon Pilgrim2019-01-261-7/+10
| | | | | | vectors (PR39859) llvm-svn: 352283
* [x86] add helper for creating a half-width shuffle; NFCSanjay Patel2019-01-261-28/+39
| | | | | | | | | | This reduces a bit of duplication between the combining and lowering places that use it, but the primary motivation is to make it easier to rearrange the lowering logic and solve PR40434: https://bugs.llvm.org/show_bug.cgi?id=40434 llvm-svn: 352280
* [X86] Remove and autoupgrade vpconflict intrinsics that take a mask and ↵Craig Topper2019-01-261-12/+0
| | | | | | | | passthru argument. We have unmasked versions as of r352172 llvm-svn: 352270
OpenPOWER on IntegriCloud