path: root/llvm/lib/Target
Columns: commit message, author, age, files, lines
* AMDGPU: Add fract intrinsicMatt Arsenault2016-05-283-36/+21
| | | | | | | | | Remove broken patterns matching it. This was matching the unsafe math pattern and expanding the fix for the buggy instruction from the pattern. The problems are also on CI. Remove the workarounds and only use fract with unsafe math or from the intrinsic. llvm-svn: 271078
* Start using shouldAssumeDSOLocal on ARM.Rafael Espindola2016-05-271-29/+9
| | | | | | Given where this is used it should be a nop. llvm-svn: 271066
* AArch64Subtarget: Use default member initializersMatthias Braun2016-05-272-19/+15
| | | | llvm-svn: 271057
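As a rough illustration of what "default member initializers" means here, the sketch below uses a hypothetical `ToySubtarget` class. The class and its field names are assumptions made for this example and are not taken from `AArch64Subtarget`. Defaults sit next to the member declarations, so the constructor only initializes what actually varies.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a subtarget class; names are illustrative only.
class ToySubtarget {
public:
  // Only the CPU name differs per instance, so only it appears in the
  // constructor's initializer list.
  explicit ToySubtarget(std::string CPU) : CPUName(std::move(CPU)) {}

  bool hasNEON() const { return HasNEON; }
  unsigned maxInterleave() const { return MaxInterleaveFactor; }
  const std::string &cpu() const { return CPUName; }

private:
  // Default member initializers: every constructor picks these up
  // without having to repeat them.
  bool HasNEON = false;
  unsigned MaxInterleaveFactor = 2;
  std::string CPUName;
};
```

Constructing `ToySubtarget ST("generic");` leaves `HasNEON` false and `MaxInterleaveFactor` at 2 without either being mentioned in the constructor.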
* Map DynamicNoPIC to Static on non-darwin.Rafael Espindola2016-05-271-0/+5
| | | | | | | DynamicNoPIC was only ever used on darwin. This maps it to static on ELF. It matches what is done on X86. llvm-svn: 271052
* [Hexagon] Use standard macros to initialize HexagonExpandCondsets passKrzysztof Parzyszek2016-05-271-12/+7
| | | | llvm-svn: 271045
* [Hexagon] Do not create passes in the constructor of HexagonPassConfigKrzysztof Parzyszek2016-05-271-9/+5
| | | | | | | When running mir tests, a pass created in that constructor would not be freed, leading to memory leaks. llvm-svn: 271043
* [X86] Detect SAD patterns and emit psadbw instructions.Michael Kuperstein2016-05-271-0/+140
| | | | | | | | This recommits r267649 with a fix for PR27539. Differential Revision: http://reviews.llvm.org/D20598 llvm-svn: 271033
* [X86] Clarify PSHUFB+blend lowering function name. NFC.Ahmed Bougacha2016-05-271-9/+11
| | | | | | Also guard against v32i8 users. llvm-svn: 271024
* [ARM] Remove tBLXr Pat made redundant by r269101. NFCI.Ahmed Bougacha2016-05-272-10/+0
| | | | llvm-svn: 271023
* Use StringRef::startswith instead of find(...) == 0.Benjamin Kramer2016-05-271-1/+1
| | | | | | It's faster and easier to read. llvm-svn: 271018
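The difference between the two idioms can be sketched with `std::string` instead of `llvm::StringRef`, so the example stays self-contained. Both helper names below are made up for illustration. `find(Prefix) == 0` keeps scanning the rest of the string when the prefix is not at the front, while a prefix comparison looks at no more than `Prefix.size()` characters.

```cpp
#include <cassert>
#include <string>

// The old idiom: correct, but on a mismatch find() searches the whole
// string before reporting npos.
inline bool startsWithViaFind(const std::string &S, const std::string &Prefix) {
  return S.find(Prefix) == 0;
}

// The prefix check: compares only the first Prefix.size() characters.
inline bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.size() >= Prefix.size() &&
         S.compare(0, Prefix.size(), Prefix) == 0;
}
```

Both helpers return the same answer for every input; only the amount of work on a mismatch differs.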
* [sparc] Simplify a slow and verbose way of checking if a string starts with ↵Benjamin Kramer2016-05-271-6/+4
| | | | | | | | "ld". PR27904. llvm-svn: 271016
* Apply clang-tidy's misc-move-constructor-init throughout LLVM.Benjamin Kramer2016-05-272-3/+5
| | | | | | No functionality change intended, maybe a tiny performance improvement. llvm-svn: 270997
* [mips] Weaken asm predicate for memory offsetsSimon Dardis2016-05-271-3/+7
| | | | | | | | | | | | The isMemWithSimmOffset predicate rejects relocations, which is incorrect behaviour. Linkers and other tools should handle|warn|error when the field overflows. Reviewers: dsanders, vkalintiris Differential Revision: http://reviews.llvm.org/D20727 llvm-svn: 270995
* [AMDGPU][llvm-mc] Square-braced-syntax for registers - make ":expr2" optional.Artem Tamazov2016-05-271-6/+10
| | | | | | | | | | | | | | | Register numbers may be specified as assembly-time expressions. This feature can be useful in macros and the like. However, expressions are supported within square braces only. Square braces were initially intended to support specifying multiple (pairs/quads...) registers. Syntax like v[8:8], which specifies a single register, is also supported. That allows expressions but looks a bit unnatural. This change supports the syntax REG[EXPR]. Tests added. Differential Revision: http://reviews.llvm.org/D20588 llvm-svn: 270990
* Avoid some copies by using const references.Benjamin Kramer2016-05-276-8/+5
| | | | | | | clang-tidy's performance-unnecessary-copy-initialization with some manual fixes. No functional changes intended. llvm-svn: 270988
* Apply clang-tidy's misc-static-assert where it makes sense.Benjamin Kramer2016-05-277-29/+26
| | | | | | | Also fold conditions into assert(0) where it makes sense. No functional change intended. llvm-svn: 270982
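A minimal sketch of the two rewrites this kind of cleanup performs, using a hypothetical opcode table (`ToyOpcode` and `ToyOpcodeNames` are invented for the example). Reading "fold conditions into assert(0)" as turning `if (cond) assert(0 && "msg")` into a single `assert(!cond && "msg")` is an assumption about the commit, not something the log spells out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical opcode enum and name table, for illustration only.
enum ToyOpcode { OpAdd, OpSub, OpMul, NumToyOpcodes };

static const char *const ToyOpcodeNames[] = {"add", "sub", "mul"};

// Before: a runtime assert on a compile-time constant, checked only in
// builds with assertions enabled.
// After: the same condition as a static_assert, checked on every build.
static_assert(sizeof(ToyOpcodeNames) / sizeof(ToyOpcodeNames[0]) ==
                  static_cast<std::size_t>(NumToyOpcodes),
              "opcode name table is out of sync with the enum");

inline const char *toyOpcodeName(ToyOpcode Op) {
  // Before: if (Op >= NumToyOpcodes) assert(0 && "bad opcode");
  // After: the condition is folded into the assert itself.
  assert(Op < NumToyOpcodes && "bad opcode");
  return ToyOpcodeNames[Op];
}
```

Adding a fourth enumerator without extending the table now fails at compile time instead of at run time.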
* [sparc] Remove some unused (and undefined) declarations.Benjamin Kramer2016-05-274-14/+2
| | | | | | No functionality change. llvm-svn: 270981
* [hexagon] Move BlockRanges and RDF stuff into the llvm namespace.Benjamin Kramer2016-05-2710-28/+28
| | | | | | No functional change intended. llvm-svn: 270980
* [sparc] Move LEON passes into llvm namespace.Benjamin Kramer2016-05-272-4/+6
| | | | | | Also give them library visibility while there. llvm-svn: 270979
* Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer ↵Simon Pilgrim2016-05-271-0/+18
| | | | | | extension intrinsics with generic IR (llvm) llvm-svn: 270976
* [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with ↵Simon Pilgrim2016-05-271-18/+0
| | | | | | | | | | generic IR (llvm) This patch removes the llvm VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX some time ago, so much of that implementation can be reused. A companion patch (D20684) removes/auto-upgrades the clang intrinsics. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 270973
* [Hexagon] Enable the post-RA schedulerKrzysztof Parzyszek2016-05-263-7/+91
| | | | | | | The aggressive anti-dependency breaker can rename the restored callee-saved registers. To prevent this, mark these registers as live on all paths to the return/tail-call instructions, and add implicit use operands for them to these instructions. llvm-svn: 270898
* [AArch64] Generate rev16/rev32 from bswap + srl when upper bits are known zero.Chad Rosier2016-05-261-1/+31
Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high 16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to (rotr (bswap i64 x), 32), if the high 32-bits of x are zero.

test_rev_w_srl16, before:
    and w8, w0, #0xffff
    rev w8, w8
    lsr w0, w8, #16
test_rev_w_srl16, after:
    and w8, w0, #0xffff
    rev16 w0, w8

test_rev_x_srl32, before:
    rev x8, x8
    lsr x0, x8, #32
test_rev_x_srl32, after:
    rev32 x0, x8

llvm-svn: 270896
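The 32-bit case of this canonicalization can be checked with plain integer arithmetic. The sketch below is not the DAG combine itself. The helper names are assumptions, and `rev16_32` models the AArch64 `rev16` instruction on a 32-bit register as a byte swap within each 16-bit halfword.

```cpp
#include <cassert>
#include <cstdint>

// Full 32-bit byte swap, i.e. what "rev w" computes.
constexpr uint32_t bswap32(uint32_t X) {
  return (X >> 24) | ((X >> 8) & 0x0000ff00u) |
         ((X << 8) & 0x00ff0000u) | (X << 24);
}

// Rotate right; N must be in 1..31 for this sketch.
constexpr uint32_t rotr32(uint32_t X, unsigned N) {
  return (X >> N) | (X << (32 - N));
}

// Byte swap within each 16-bit halfword, modelling "rev16 w".
constexpr uint32_t rev16_32(uint32_t X) {
  return ((X & 0x00ff00ffu) << 8) | ((X >> 8) & 0x00ff00ffu);
}

// rotr(bswap(x), 16) equals rev16(x) for any x.
static_assert(rotr32(bswap32(0x12345678u), 16) == rev16_32(0x12345678u),
              "rotr of bswap matches rev16");

// With the high 16 bits of x known zero, the logical shift and the
// rotate agree, so srl can be rewritten as rotr.
static_assert((bswap32(0x00001234u) >> 16) == rotr32(bswap32(0x00001234u), 16),
              "srl and rotr agree when the high half is zero");

// Without that precondition they differ, which is why it is required.
static_assert((bswap32(0x12345678u) >> 16) != rotr32(bswap32(0x12345678u), 16),
              "srl and rotr differ when the high half is nonzero");
```

The 64-bit case follows the same argument with a 32-bit shift and `rev32`.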
* AMDGPU/SI: Enable load-store-opt by default.Changpeng Fang2016-05-261-1/+1
| | | | | | | | | | Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 llvm-svn: 270894
* Init member structs in constructor.Artem Belevich2016-05-261-3/+9
| | | | | | | Fixes build error on windows where MSVC does not support list initialization inside member initializer list. llvm-svn: 270877
* [NVPTX] Added NVVMIntrRange passArtem Belevich2016-05-264-0/+159
| | | | | | | | | | | | NVVMIntrRange adds !range metadata to calls of NVVM intrinsics that return values within known limited range. This allows LLVM to generate optimal code for indexing arrays based on tid/ctaid which is a frequently used pattern in CUDA code. Differential Revision: http://reviews.llvm.org/D20644 llvm-svn: 270872
* [AMDGPU][llvm-mc] s_getreg/setreg* - hwreg - factor out strings/literals etc.Artem Tamazov2016-05-269-159/+196
| | | | | | | | | Hwreg(...) syntax implementation unified with sendmsg(...). Common strings moved to Utils. MathExtras.h functionality utilized. Added missing build dependency in Disassembler. Differential Revision: http://reviews.llvm.org/D20381 llvm-svn: 270871
* Fix build warning introduced in r270552 "[AMDGPU][llvm-mc] Disassembler: ↵Artem Tamazov2016-05-261-1/+2
| | | | | | support for TTMP/TBA/TMA registers." llvm-svn: 270859
* [X86][SSE] When lowering a 256-bit shuffle as PMOVZX, reduce the input ↵Simon Pilgrim2016-05-261-1/+7
| | | | | | vector to the lower 128-bit subvector. More often than not this is what it started out as; the extraction is zero-cost on AVX and the PMOVZX/PMOVSX folding logic is based around 128-bit loads. llvm-svn: 270858
* [Hexagon] Select the aggressive anti-dependency breakerKrzysztof Parzyszek2016-05-261-0/+2
| | | | llvm-svn: 270857
* [AMDGPU] Remove exit-on-error flag from test (PR27762)Diana Picus2016-05-261-1/+1
| | | | | | | | | | Similar to r269948, but for argument lowering. Fixes PR27762 Differential Revision: http://reviews.llvm.org/D20430 llvm-svn: 270856
* [BPF] Remove exit-on-error flag in test (PR27767)Diana Picus2016-05-261-0/+1
| | | | | | | | | | | | The exit-on-error flag is needed to avoid an assert where llvm::SelectionDAGISel::LowerArguments doesn't create enough arguments. Fill up with zeroes to reach the right number of args. Fixes PR27767. Differential Revision: http://reviews.llvm.org/D20571 llvm-svn: 270855
* [AArch64] Generate a BFI/BFXIL from 'or (and X, MaskImm), OrImm'.Chad Rosier2016-05-261-1/+95
If and only if the value being inserted sets only known zero bits. This combine transforms things like

    and w8, w0, #0xfffffff0
    movz w9, #5
    orr w0, w8, w9

into

    movz w8, #5
    bfxil w0, w8, #0, #4

The combine is tuned to make sure we always reduce the number of instructions. We avoid churning code for what are expected to be performance-neutral changes (e.g., converting AND+OR to OR+BFI).

Differential Revision: http://reviews.llvm.org/D20387
llvm-svn: 270846
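The equivalence behind this combine can be sketched in plain C++ for the example in the commit message. The `bfxil` helper below is a simplified model written for this illustration: it copies `Width` bits of `Src`, starting at bit `Lsb`, into the low bits of `Dst` and leaves the remaining bits of `Dst` alone. It is not LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a 32-bit BFXIL; Width must be in 1..31 here.
constexpr uint32_t bfxil(uint32_t Dst, uint32_t Src, unsigned Lsb,
                         unsigned Width) {
  return (Dst & ~((1u << Width) - 1u)) |
         ((Src >> Lsb) & ((1u << Width) - 1u));
}

// The original three-instruction sequence: and, movz, orr.
constexpr uint32_t andThenOr(uint32_t X) {
  return (X & 0xfffffff0u) | 5u;
}

// OrImm (5) only sets bits that MaskImm (0xfffffff0) is known to have
// cleared, so the OR behaves as an insertion into the low four bits.
static_assert((5u & 0xfffffff0u) == 0u, "OrImm fits in the cleared bits");
static_assert(andThenOr(0xdeadbeefu) == bfxil(0xdeadbeefu, 5u, 0, 4),
              "and+or matches bfxil for the sample value");
```

If the OR immediate set a bit outside the cleared field, that bit would be lost by the insertion, which is why the combine requires the inserted value to touch only known-zero bits.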
* Use shouldAssumeDSOLocal on AArch64.Rafael Espindola2016-05-262-62/+7
| | | | | | This reduces code duplication and now AArch64 also handles PIE. llvm-svn: 270844
* [AVX512] Fix intrinsic cmp{sd|ss} lowering.Igor Breger2016-05-261-3/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D20615 llvm-svn: 270843
* [Sparc] Extend the assembler printing support for Sparc back-end.Chris Dewhurst2016-05-261-0/+4
| | | | | | Allows display of floating-point registers and display of assembler meta-data output. llvm-svn: 270829
* [NVPTX] Don't (incorrectly) say that the NVVMReflect pass preserves all ↵Justin Lebar2016-05-251-3/+0
| | | | | | | | | | | | analyses. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D20585 llvm-svn: 270790
* Don't repeat name in comment and git-clang-format.Rafael Espindola2016-05-251-5/+5
| | | | llvm-svn: 270785
* Sort includes.Rafael Espindola2016-05-251-1/+1
| | | | llvm-svn: 270769
* Simplify std::all_of/any_of predicates by using llvm::all_of/any_of. NFCI.Simon Pilgrim2016-05-251-7/+5
| | | | llvm-svn: 270753
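To show why the range-based helpers read more simply, here is a sketch of wrappers in the spirit of `llvm::all_of` and `llvm::any_of`. They live in a made-up `toy` namespace so the example compiles without LLVM headers, and the real LLVM helpers may differ in signature details.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

namespace toy {
// Hypothetical range-based wrappers: the caller passes the container
// once instead of spelling out begin() and end().
template <typename Range, typename Pred>
bool all_of(const Range &R, Pred P) {
  return std::all_of(std::begin(R), std::end(R), P);
}

template <typename Range, typename Pred>
bool any_of(const Range &R, Pred P) {
  return std::any_of(std::begin(R), std::end(R), P);
}
} // namespace toy

// Example predicate user: true when no element of a shuffle-mask-like
// vector is negative.
inline bool allNonNegative(const std::vector<int> &Mask) {
  return toy::all_of(Mask, [](int M) { return M >= 0; });
}
```

A call such as `std::all_of(Mask.begin(), Mask.end(), Pred)` shrinks to `toy::all_of(Mask, Pred)`, with no change in behavior.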
* Fix shouldAssumeDSOLocal for private linkage.Rafael Espindola2016-05-251-1/+1
| | | | llvm-svn: 270746
* AMDGPU: Fix v2i64/v2f64 bitcastsMatt Arsenault2016-05-251-0/+2
| | | | | | | These operations tend to get promoted away to v4i32 so this doesn't happen often. llvm-svn: 270740
* AMDGPU: Fix inconsistent lowering of select of vectorsMatt Arsenault2016-05-251-1/+9
| | | | | | | | | f32 vectors would use a sequence of BFI instructions instead of unrolled cmp + select. This was better in the case of a VALU select with SGPR inputs, but we don't have a way of dealing with that in the DAG. llvm-svn: 270731
* [x86] avoid code explosion from LoopVectorizer for gather loop (PR27826) Sanjay Patel2016-05-251-2/+10
| | | | | | | | | | | | | | By making pointer extraction from a vector more expensive in the cost model, we avoid the vectorization of a loop that is very likely to be memory-bound: https://llvm.org/bugs/show_bug.cgi?id=27826 There are still bugs related to this, so we may need a more general solution to avoid vectorizing obviously memory-bound loops when we don't have HW gather support. Differential Revision: http://reviews.llvm.org/D20601 llvm-svn: 270729
* [x86, AVX] allow explicit calls to VZERO* to modify state in ↵Sanjay Patel2016-05-251-6/+7
| | | | | | | | VZeroUpperInserter pass (PR27823) As noted in the review, there are still problems, so this doesn't fix the bug completely. Differential Revision: http://reviews.llvm.org/D20529 llvm-svn: 270718
* [X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion ↵Simon Pilgrim2016-05-251-29/+15
| | | | | | | | | | intrinsics with generic IR Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead. Differential Revision: http://reviews.llvm.org/D20568 llvm-svn: 270678
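The word "lossless" matters here: these conversions can be expressed as generic `sitofp` and `fpext` because every 32-bit integer and every `float` is exactly representable as a `double`. The scalar sketch below checks that by round-tripping. The function names mirror the IR instructions, but they are ordinary C++ casts written for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// A double's significand must cover all 32 bits of an int32_t for the
// integer conversion to be exact.
static_assert(std::numeric_limits<double>::digits >= 32,
              "double can represent every int32_t exactly");

// Scalar model of sitofp i32 -> double (one lane of CVTDQ2PD).
inline double sitofp(int32_t X) { return static_cast<double>(X); }

// Scalar model of fpext float -> double (one lane of CVTPS2PD).
inline double fpext(float X) { return static_cast<double>(X); }
```

Converting back with `static_cast<int32_t>` or `static_cast<float>` recovers the original value, which is what makes the generic IR form a safe replacement.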
* [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a ↵Craig Topper2016-05-251-7/+0
| | | | | | long time. llvm-svn: 270677
* Soften assertion in AMDGPU emitPrologue.Nirav Dave2016-05-251-2/+3
| | | | | | | | | | | | | | [AMDGPU] emitPrologue looks for an unused unallocated SGPR that is not the scratch descriptor. Continue search if unused register found fails other requirements. Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20526 llvm-svn: 270646
* [WebAssembly] Put __stack_pointer in the offset field of loads and stores.Dan Gohman2016-05-241-10/+10
Instead of this:

    i32.const $push10=, __stack_pointer
    i32.load $push11=, 0($pop10)

Emit this:

    i32.const $push10=, 0
    i32.load $push11=, __stack_pointer($pop10)

It's not currently clear which is better, though there's a chance the second form may be better at overall compression. We can revisit this when we have more data; for now it makes sense to make PEI consistent with isel.

Differential Revision: http://reviews.llvm.org/D20411
llvm-svn: 270635
* [AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegsKonstantin Zhuravlyov2016-05-247-23/+25
| | | | | | Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594