summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [GlobalISel] Translate insertelement and extractelementVolkan Keles2017-03-102-0/+70
| | | | | | | | | | | | Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30761 llvm-svn: 297495
* NewGVN: Rename InitialClass to TOP, which is what most people would expect ↵Daniel Berlin2017-03-101-25/+25
| | | | | | it to be called llvm-svn: 297494
* [SLP] Revert everything that has to do with memory access sorting.Michael Kuperstein2017-03-102-173/+57
| | | | | | | | | | | | This reverts r293386, r294027, r294029 and r296411. Turns out the SLP tree isn't actually a "tree" and we don't handle accessing the same packet of loads in several different orders well, causing miscompiles. Revert until we can fix this properly. llvm-svn: 297493
* [SelectionDAG] Add support for BUILD_VECTOR to ComputeNumSignBitsSimon Pilgrim2017-03-101-0/+6
| | | | llvm-svn: 297492
* [GlobalISel] Make LegalizerInfo accessible in LegalizerHelperVolkan Keles2017-03-102-11/+8
| | | | | | | | | | | | | | | | | | | | Summary: We don’t actually use LegalizerInfo in Legalizer pass, it’s just passed as an argument. In order to check if an instruction is legal or not, we need to get LegalizerInfo by calling `MI.getParent()->getParent()->getSubtarget().getLegalizerInfo()`. Instead, make LegalizerInfo accessible in LegalizerHelper. Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, kristof.beyls Reviewed By: qcolombet Subscribers: dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D30838 llvm-svn: 297491
* [Support] Don't return an error if realPath fails.Zachary Turner2017-03-101-3/+2
| | | | | | | | | | | In openFileForRead, we would not previously return an error if real_path resolution failed. After a recent patch, we started propagating this error up. This caused a failure in clang when trying to call openFileForRead("nul"). This patch restores the previous behavior of not propagating this error up. llvm-svn: 297488
* Add llvm::sys::fs::real_path.Zachary Turner2017-03-102-16/+161
| | | | | | | | | | | | | | | | | | | | | LLVM already has real_path like functionality, but it is cumbersome to use and involves clean up after (e.g. you have to call openFileForRead, then close the resulting FD). Furthermore, on Windows it doesn't work for directories since opening a directory and opening a file require slightly different flags. So I add a simple function `real_path` which works for all paths on all platforms and has a simple to use interface. In doing so, I add the ability to opt in to resolving tilde expressions (e.g. ~/foo), which are normally handled by the shell. Differential Revision: https://reviews.llvm.org/D30668 llvm-svn: 297483
* [SelectionDAG] Make SelectionDAG aware of the known bits in USUBO and SSUBO ↵Amaury Sechet2017-03-101-4/+13
| | | | | | | | | | | | | | | | | and SUBC. Summary: Depends on D30379 This improves the state of things for the sub class of operation. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30436 llvm-svn: 297482
* [SelectionDAG] Make SelectionDAG aware of the known bits in UADDO and SADDO.Amaury Sechet2017-03-101-13/+37
| | | | | | | | | | | | Summary: As per title. This is extracted from D29872 and I threw SADDO in. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30379 llvm-svn: 297479
* [APInt] Add APInt::insertBits() method to insert an APInt into a larger APIntSimon Pilgrim2017-03-104-10/+67
| | | | | | | | | | | | We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64). This is another of the compile time issues identified in PR32037 (see also D30265). This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target. Differential Revision: https://reviews.llvm.org/D30780 llvm-svn: 297458
* [mips][msa] Accept more values for constant splatsSimon Dardis2017-03-102-11/+233
| | | | | | | | | | | | | | | | | This patches teaches the MIPS backend to accept more values for constant splats. Previously, only 10 bit signed immediates or values that could be loaded using an ldi.[bhwd] instruction would be acceptted. This patch relaxes that constraint so that any constant value that be splatted is accepted. As a result, the constant pool is used less for vector operations, and the suite of bit manipulation instructions b(clr|set|neg)i can now be used with the full range of their immediate operand. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D30640 llvm-svn: 297457
* imm_comp_XFORM (defined in ARMInstrThumb.td) duplicates imm_not_XFORM ↵Artyom Skrobov2017-03-102-7/+2
| | | | | | | | | | | | | | (defined in ARMInstrInfo.td) Reviewers: grosbach, rengolin, jmolloy Reviewed By: jmolloy Subscribers: aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D30782 llvm-svn: 297456
* [Assembler] Add location info to unary expressions.Sanne Wouda2017-03-102-6/+6
| | | | | | | | | | | | | | | | | Summary: This is a continuation of D28861. Add an SMLoc to MCUnaryExpr such that a better diagnostic can be given in case of an error in later stages of assembling. Reviewers: rengolin, grosbach, javed.absar, olista01 Reviewed By: olista01 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30581 llvm-svn: 297454
* Refactor the multiply-accumulate combines to act onArtyom Skrobov2017-03-102-108/+64
| | | | | | | | | | | | | | | | | ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE]. Summary: This allows for some simplification because the combines are no longer limited to just one go at the node before it gets legalized into an ARM target-specific one. Reviewers: jmolloy, rogfer01 Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30401 llvm-svn: 297453
* WholeProgramDevirt: Fixed compilation error under MSVS2015.George Rimar2017-03-101-9/+18
| | | | | | | | | | | | | | | | | | | | | | It was introduced in: r296945 WholeProgramDevirt: Implement exporting for single-impl devirtualization. --------------------- r296939 WholeProgramDevirt: Add any unsuccessful llvm.type.checked.load devirtualizations to the list of llvm.type.test users. --------------------- Microsoft Visual Studio Community 2015 Version 14.0.23107.0 D14REL Does not compile that code without additional brackets, showing multiple error like below: WholeProgramDevirt.cpp(1216): error C2958: the left bracket '[' found at 'c:\access_softek\llvm\lib\transforms\ipo\wholeprogramdevirt.cpp(1216)' was not matched correctly WholeProgramDevirt.cpp(1216): error C2143: syntax error: missing ']' before '}' WholeProgramDevirt.cpp(1216): error C2143: syntax error: missing ';' before '}' WholeProgramDevirt.cpp(1216): error C2059: syntax error: ']' llvm-svn: 297451
* [MC] Set SHT_MIPS_DWARF section type for all .debug_* sections on MIPSSimon Atanasyan2017-03-102-23/+36
| | | | | | | | | | | | All MIPS .debug_* sections should be marked with ELF type SHT_MIPS_DWARF accordingly the specification [1]. Also the same section type is assigned to these sections by GNU tools. [1] ftp.software.ibm.com/software/os390/czos/dwarf/mips_extensions.pdf Differential Revision: https://reviews.llvm.org/D29789 llvm-svn: 297447
* [MC] Accept a numeric value as an ELF section header's typeSimon Atanasyan2017-03-102-2/+9
| | | | | | | | | | | | | | | | | | | | | | GAS supports specification of section header's type using a numeric value [1]. This patch brings the same functionality to LLVM. That allows to setup some target-specific section types belong to the SHT_LOPROC - SHT_HIPROC range. If we attempt to print unknown section type, MCSectionELF class shows an error message. It's better than print sole '@' sign without any section type name. In case of MIPS, example of such section's type is SHT_MIPS_DWARF. Without the patch we will have to implement some workarounds in probably not-MIPS-specific part of code base to convert SHT_MIPS_DWARF to the @progbits while printing assembly and to assign SHT_MIPS_DWARF for @progbits sections named .debug_* if we encounter such section in an input assembly. [1] https://sourceware.org/binutils/docs/as/Section.html Differential Revision: https://reviews.llvm.org/D29719 llvm-svn: 297446
* For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,Artyom Skrobov2017-03-103-28/+151
| | | | | | | | | | | | same as already done for ARM and Thumb2. Reviewers: jmolloy, rogfer01, efriedma Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30400 llvm-svn: 297443
* Implement getPassName() for IR printing passes.Yaron Keren2017-03-104-0/+12
| | | | llvm-svn: 297442
* AMDGPU: Fix insertion point when reducing load intrinsicsMatt Arsenault2017-03-101-0/+3
| | | | | | | The insertion point may be later than the next instruction, so it is necessary to set it when replacing the call. llvm-svn: 297439
* Move memory coercion functions from GVN.cpp to VNCoercion.cpp so they can be ↵Daniel Berlin2017-03-103-447/+460
| | | | | | | | | | | | | | | | | | | shared between GVN and NewGVN. Summary: These are the functions used to determine when values of loads can be extracted from stores, etc, and to perform the necessary insertions to do this. There are no changes to the functions themselves except reformatting, and one case where memdep was informed of a removed load (which was pushed into the caller). Reviewers: davide Subscribers: mgorny, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D30478 llvm-svn: 297438
* Do not use branch metadata to check if a basic block is hot.Dehao Chen2017-03-101-12/+1
| | | | | | | | | | | | | | Summary: We should not use that to check basic block hotness as optimization may mess it up. Reviewers: eraman Reviewed By: eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30800 llvm-svn: 297437
* NewGVN: Rewrite DCE during elimination so we do it as well as old GVN did.Daniel Berlin2017-03-102-53/+109
| | | | llvm-svn: 297428
* NewGVN: Rename a few things for clarityDaniel Berlin2017-03-101-40/+45
| | | | llvm-svn: 297427
* [GlobalISel] Use ImmutableCallSite instead of templates. NFC.Ahmed Bougacha2017-03-102-21/+10
| | | | | | ImmutableCallSite abstracts away CallInst and InvokeInst. Use it! llvm-svn: 297426
* [GlobalISel] Fallback when failing to translate invoke.Ahmed Bougacha2017-03-101-3/+4
| | | | | | | | We unintentionally stopped falling back in r293670. While there, change an unusual construct. llvm-svn: 297425
* Add support for DenseMap/DenseSet count and find using const pointersDaniel Berlin2017-03-101-3/+3
| | | | | | | | | | | | | | Summary: Similar to SmallPtrSet, this makes find and count work with both const referneces and const pointers. Reviewers: dblaikie Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30713 llvm-svn: 297424
* GlobalISel: support trivial inlineasm calls.Tim Northover2017-03-091-1/+20
| | | | | | They're used for nefarious purposes by ObjC. llvm-svn: 297422
* Refactor alias check from MISched into common helper. NFC.Eli Friedman2017-03-092-61/+60
| | | | | | Differential Revision: https://reviews.llvm.org/D30598 llvm-svn: 297421
* [WebAssembly] Fix the opcode numbers for floating-point le and gt.Dan Gohman2017-03-091-2/+2
| | | | llvm-svn: 297420
* [DAGCombiner] Do various combine on uaddo.Amaury Sechet2017-03-091-0/+35
| | | | | | | | | | | | Summary: This essentially does the same transform as for ADC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30417 llvm-svn: 297416
* [Hexagon] Fixes to the bitsplit generationKrzysztof Parzyszek2017-03-091-11/+46
| | | | | | | | - Fix the insertion point, which occasionally could have been incorrect. - Avoid creating multiple bitsplits with the same operands, if an old one could be reused. llvm-svn: 297414
* GlobalISel: inform FrameLowering when we emit a function call.Tim Northover2017-03-092-0/+2
| | | | | | | Amongst other things (I expect) this is necessary to ensure decent backtraces when an "unreachable" is involved. llvm-svn: 297413
* [InstSimplify] allow folds for bool vector div/remSanjay Patel2017-03-091-3/+3
| | | | llvm-svn: 297411
* GlobalISel: put debug info for static allocas in the MachineFunction.Tim Northover2017-03-092-8/+9
| | | | | | | | | | | | | | The good reason to do this is that static allocas are pretty simple to handle (especially at -O0) and avoiding tracking DBG_VALUEs throughout the pipeline should give some kind of performance benefit. The bad reason is that the debug pipeline is an unholy mess of implicit contracts, where determining whether "DBG_VALUE %reg, imm" actually implies a load or not involves the services of at least 3 soothsayers and the sacrifice of at least one chicken. And it still gets it wrong if the variable is at SP directly. llvm-svn: 297410
* [ConstantFold] vector div/rem with any zero element in divisor is undefSanjay Patel2017-03-091-4/+9
| | | | | | | | Follow-up for: https://reviews.llvm.org/D30665 https://reviews.llvm.org/rL297390 llvm-svn: 297409
* AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsicsMatt Arsenault2017-03-091-0/+41
| | | | llvm-svn: 297408
* [DAGCombiner] Do various combine on usubo.Amaury Sechet2017-03-091-0/+34
| | | | | | | | | | | | Summary: This essentially does the same transform as for SUBC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30437 llvm-svn: 297404
* [Hexagon] Refactor the DAG preprocessing code, NFCKrzysztof Parzyszek2017-03-091-32/+88
| | | | | | Extract individual transformations into their own functions. llvm-svn: 297401
* Minor format change. nfc.Rong Xu2017-03-091-5/+5
| | | | llvm-svn: 297400
* [Hexagon] Add -mhvx option to the Hexagon backendKrzysztof Parzyszek2017-03-091-2/+8
| | | | llvm-svn: 297393
* [Hexagon] Propagate zext of i1 into arithmetic code in selection DAGKrzysztof Parzyszek2017-03-091-0/+96
| | | | | | | (op ... (zext i1 c) ...) -> (select c (op ... 1 ...), (op ... 0 ...)) llvm-svn: 297391
* [InstSimplify] vector div/rem with any zero element in divisor is undefSanjay Patel2017-03-091-0/+11
| | | | | | | | | | | This was suggested as a DAG simplification in the review for rL297026 : http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435253.html ...but let's start with IR since we have actual docs for IR (LangRef). Differential Revision: https://reviews.llvm.org/D30665 llvm-svn: 297390
* [DAG] recognize div/rem by 0 as undef before trying constant foldingSanjay Patel2017-03-092-11/+17
| | | | | | | | | | | | | | | | | | | | As discussed in the review thread for rL297026, this is actually 2 changes that would independently fix all of the test cases in the patch: 1. Return undef in FoldConstantArithmetic for div/rem by 0. 2. Move basic undef simplifications for div/rem (simplifyDivRem()) before foldBinopIntoSelect() as a matter of efficiency. I will handle the case of vectors with any zero element as a follow-up. That change is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic(). I'm deleting the test for PR30693 because it does not test for the actual bug any more (dangers of using bugpoint). Differential Revision: https://reviews.llvm.org/D30741 llvm-svn: 297384
* [X86][SSE] Speed up constant pool shuffle mask decoding with direct copy ↵Simon Pilgrim2017-03-091-7/+27
| | | | | | | | (PR32037). If the constants are already the correct size, we can copy them directly into the shuffle mask. llvm-svn: 297381
* [mips] Revert fixes for PR32020.Simon Dardis2017-03-097-400/+162
| | | | | | | | | | | | | | | The fix introduces segfaults and clobbers the value to be stored when the atomic sequence loops. Revert "[Target/MIPS] Kill dead code, no functional change intended." This reverts commit r296153. Revert "Recommit "[mips] Fix atomic compare and swap at O0."" This reverts commit r296134. llvm-svn: 297380
* Fixed typos in comments. NFCI.Simon Pilgrim2017-03-091-6/+6
| | | | llvm-svn: 297379
* fix build on CygwinNuno Lopes2017-03-091-0/+3
| | | | llvm-svn: 297378
* [ARM] remove FIXMEs and add vcmp MC testSjoerd Meijer2017-03-091-5/+0
| | | | | | | | | Minor cleanup in ARMInstrVFP.td: removed some FIXMEs and added a MC test for vcmp that was actually missing. Differential Revision: https://reviews.llvm.org/D30745 llvm-svn: 297376
* [PM/Inliner] Make the new PM's inliner process call edges across anChandler Carruth2017-03-091-29/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | entire SCC before iterating on newly-introduced call edges resulting from any inlined function bodies. This more closely matches the behavior of the old PM's inliner. While it wasn't really clear to me initially, this behavior is actually essential to the inliner behaving reasonably in its current design. Because the inliner is fundamentally a bottom-up inliner and all of its cost modeling is designed around that it often runs into trouble within an SCC where we don't have any meaningful bottom-up ordering to use. In addition to potentially cyclic, infinite inlining that we block with the inline history mechanism, it can also take seemingly simple call graph patterns within an SCC and turn them into *insanely* large functions by accidentally working top-down across the SCC without any of the threshold limitations that traditional top-down inliners use. Consider this diabolical monster.cpp file that Richard Smith came up with to help demonstrate this issue: ``` template <int N> extern const char *str; void g(const char *); template <bool K, int N> void f(bool *B, bool *E) { if (K) g(str<N>); if (B == E) return; if (*B) f<true, N + 1>(B + 1, E); else f<false, N + 1>(B + 1, E); } template <> void f<false, MAX>(bool *B, bool *E) { return f<false, 0>(B, E); } template <> void f<true, MAX>(bool *B, bool *E) { return f<true, 0>(B, E); } extern bool *arr, *end; void test() { f<false, 0>(arr, end); } ``` When compiled with '-DMAX=N' for various values of N, this will create an SCC with a reasonably large number of functions. Previously, the inliner would try to exhaust the inlining candidates in a single function before moving on. This, unfortunately, turns it into a top-down inliner within the SCC. Because our thresholds were never built for that, we will incrementally decide that it is always worth inlining and proceed to flatten the entire SCC into that one function. What's worse, we'll then proceed to the next function, and do the exact same thing except we'll skip the first function, and so on. And at each step, we'll also make some of the constant factors larger, which is awesome. The fix in this patch is the obvious one which makes the new PM's inliner use the same technique used by the old PM: consider all the call edges across the entire SCC before beginning to process call edges introduced by inlining. The result of this is essentially to distribute the inlining across the SCC so that every function incrementally grows toward the inline thresholds rather than allowing the inliner to grow one of the functions vastly beyond the threshold. The code for this is a bit awkward, but it works out OK. We could consider in the future doing something more powerful here such as prioritized order (via lowest cost and/or profile info) and/or a code-growth budget per SCC. However, both of those would require really substantial work both to design the system in a way that wouldn't break really useful abstraction decomposition properties of the current inliner and to be tuned across a reasonably diverse set of code and workloads. It also seems really risky in many ways. I have only found a single real-world file that triggers the bad behavior here and it is generated code that has a pretty pathological pattern. I'm not worried about the inliner not doing an *awesome* job here as long as it does *ok*. On the other hand, the cases that will be tricky to get right in a prioritized scheme with a budget will be more common and idiomatic for at least some frontends (C++ and Rust at least). So while these approaches are still really interesting, I'm not in a huge rush to go after them. Staying even closer to the existing PM's behavior, especially when this easy to do, seems like the right short to medium term approach. I don't really have a test case that makes sense yet... I'll try to find a variant of the IR produced by the monster template metaprogram that is both small enough to be sane and large enough to clearly show when we get this wrong in the future. But I'm not confident this exists. And the behavior change here *should* be unobservable without snooping on debug logging. So there isn't really much to test. The test case updates come from two incidental changes: 1) We now visit functions in an SCC in the opposite order. I don't think there really is a "right" order here, so I just update the test cases. 2) We no longer compute some analyses when an SCC has no call instructions that we consider for inlining. llvm-svn: 297374
OpenPOWER on IntegriCloud