path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [PGO][CHR] A bug fix. | Hiroshi Yamauchi | 2019-05-01 | 1 | -6/+21
  Summary: Fix a transformation bug where two scopes share a common instruction to hoist.
  Reviewers: davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61405
  llvm-svn: 359736
* [ORC] Pass object buffer ownership back in NotifyEmitted. | Lang Hames | 2019-05-01 | 1 | -19/+18
  Clients who want to regain ownership of object buffers after they have been linked may now use the NotifyEmitted callback for this purpose.
  Note: Currently NotifyEmitted is only called if linking succeeds. If linking fails the buffer is always discarded.
  llvm-svn: 359735
* [GlobalISel][AArch64] Use fmov for G_FCONSTANT when possible | Jessica Paquette | 2019-05-01 | 1 | -2/+46
  This adds support for using fmov rather than a standard mov to materialize G_FCONSTANT when it's safe to do so.
  Update arm64-fast-isel-materialize.ll and select-constant.mir to show that the selection is correct.
  llvm-svn: 359734
* [X86][SSE] Fold scalar horizontal add/sub for non-0/1 element extractions | Simon Pilgrim | 2019-05-01 | 1 | -6/+11
  We already perform horizontal add/sub if we extract from elements 0 and 1; this patch extends it to non-0/1 element extraction indices (as long as they are from the lowest 128-bit vector).
  Differential Revision: https://reviews.llvm.org/D61263
  llvm-svn: 359707
* [AMDGPU] gfx1010 GCNRegBankReassign pass | Stanislav Mekhanoshin | 2019-05-01 | 4 | -0/+803
  Reassign registers to reduce register bank conflicts.
  Differential Revision: https://reviews.llvm.org/D61344
  llvm-svn: 359704
* Option spell checking: Penalize delimiter flags if input has no argument | Nico Weber | 2019-05-01 | 1 | -0/+9
  If the user passes a flag like `-version` to a program, it's more likely they mean `--version` than `-version:`, since there's no parameter passed. Hence, give delimited arguments a penalty of 1 if the user input doesn't contain the delimiter or has no data after it.
  The motivation is that with this, lld-link can suggest "--version" instead of "-version:" for "-version" and "-nodefaultlib" instead of "-nodefaultlib:" for "-nodefaultlibs".
  Differential Revision: https://reviews.llvm.org/D61382
  llvm-svn: 359701
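  The penalty rule can be illustrated with a small standalone sketch. This is not the OptTable code: `editDistance` and `penalizedDistance` are hypothetical helpers written for illustration, and the sketch assumes candidates carry their trailing `=` or `:` delimiter as part of the name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Plain Levenshtein distance (insert, delete, substitute each cost 1).
std::size_t editDistance(const std::string &A, const std::string &B) {
  std::vector<std::size_t> Row(B.size() + 1);
  for (std::size_t J = 0; J <= B.size(); ++J)
    Row[J] = J;
  for (std::size_t I = 1; I <= A.size(); ++I) {
    std::size_t Diag = Row[0]; // previous row, column J-1
    Row[0] = I;
    for (std::size_t J = 1; J <= B.size(); ++J) {
      std::size_t Above = Row[J]; // previous row, column J
      std::size_t Subst = Diag + (A[I - 1] == B[J - 1] ? 0 : 1);
      Row[J] = std::min({Above + 1, Row[J - 1] + 1, Subst});
      Diag = Above;
    }
  }
  return Row[B.size()];
}

// Hypothetical scoring: compare the input (up to and including the
// delimiter, if present) against the candidate, then add a penalty of 1
// when the candidate ends in a delimiter but the input supplies no data.
std::size_t penalizedDistance(const std::string &Input,
                              const std::string &Candidate) {
  char Last = Candidate.empty() ? '\0' : Candidate.back();
  bool HasDelimiter = Last == '=' || Last == ':';
  std::string Normalized = Input;
  bool HasData = false;
  if (HasDelimiter) {
    std::size_t Pos = Input.find(Last);
    if (Pos != std::string::npos) {
      Normalized = Input.substr(0, Pos + 1); // keep delimiter, drop data
      HasData = Pos + 1 < Input.size();
    }
  }
  std::size_t Distance = editDistance(Normalized, Candidate);
  if (HasDelimiter && !HasData)
    ++Distance;
  return Distance;
}
```

  Under these assumptions `-version` scores 1 against `--version` and 2 against `-version:`, so the flag without a delimiter wins, which matches the motivation given in the commit message.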
* [AMDGPU] gfx1010 GCNNSAReassign pass | Stanislav Mekhanoshin | 2019-05-01 | 4 | -0/+362
  Convert NSA into non-NSA images.
  Differential Revision: https://reviews.llvm.org/D61341
  llvm-svn: 359700
* [AMDGPU] gfx1010 MIMG implementation | Stanislav Mekhanoshin | 2019-05-01 | 12 | -161/+922
  Differential Revision: https://reviews.llvm.org/D61339
  llvm-svn: 359698
* [ThinLTO] Fix unreachable code when parsing summary entries. | Teresa Johnson | 2019-05-01 | 1 | -5/+9
  Summary: Early returns were causing some code to be skipped. This was missed since the summary entries are typically at the end of the llvm assembly file.
  Fixes PR41663.
  Reviewers: RKSimon, wristow
  Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61355
  llvm-svn: 359697
* [AMDGPU] gfx1010 DS implementation | Stanislav Mekhanoshin | 2019-05-01 | 3 | -165/+221
  Differential Revision: https://reviews.llvm.org/D61332
  llvm-svn: 359696
* Revert "[DAGCombiner] try repeated fdiv divisor transform before building estimate" | Sanjay Patel | 2019-05-01 | 1 | -3/+3
  This reverts commit fb9a5307a94e6f1f850e4d89f79103b123f16279 (rL359398) because it can cause an infinite loop due to opposing combines.
  llvm-svn: 359695
* Fix 80 column violation. NFCI. | Simon Pilgrim | 2019-05-01 | 1 | -5/+6
  llvm-svn: 359694
* [SCEV] Use isKnownViaNonRecursiveReasoning for smax simplification | Keno Fischer | 2019-05-01 | 1 | -3/+4
  Summary: Commit rL331949 ("[SCEV] Do not use induction in isKnownPredicate for simplification umax") changed the codepath for umax from isKnownPredicate to isKnownViaNonRecursiveReasoning to avoid compile time blow up (and as I found out also stack overflows). However, there is an exact copy of the code for umax that was lacking this change. In D50167 I want to unify these codepaths, but to avoid that being a behavior change for the smax case, pull this independent bit out of it.
  Reviewed By: sanjoy
  Differential Revision: https://reviews.llvm.org/D61166
  llvm-svn: 359693
* [X86][SSE] Add demanded elts support for X86ISD::PMULDQ\PMULUDQ | Simon Pilgrim | 2019-05-01 | 1 | -3/+24
  Add to SimplifyDemandedVectorEltsForTargetNode and SimplifyDemandedBitsForTargetNode.
  llvm-svn: 359686
* Fix OptTable::findNearest() adding delimiter for free | Nico Weber | 2019-05-01 | 1 | -9/+8
  Prior to this, OptTable::findNearest() thought that the input `--foo` had an editing distance of 0 from an existing flag `--foo=`, which made it suggest flags with delimiters more often than flags without one. After this, it correctly assigns this case an editing distance of 1.
  Differential Revision: https://reviews.llvm.org/D61373
  llvm-svn: 359685
* [LoopInfo] Faster implementation of setLoopID. NFC. | Keno Fischer | 2019-05-01 | 1 | -10/+4
  Summary: This change was part of D46460. However, in the meantime rL341926 fixed the correctness issue here. What remained was the performance issue in setLoopID where it would iterate through all blocks in the loop and their successors, rather than just the predecessors of the header (the latter presumably being much faster). We already have `getLoopLatches` to compute precisely these basic blocks in an efficient manner, so just use it (as the original commit did for `getLoopID`).
  Reviewed By: jdoerfert
  Differential Revision: https://reviews.llvm.org/D61215
  llvm-svn: 359684
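  A toy model of the idea, under stated assumptions: blocks are plain integer IDs, `PredList` and both functions are hypothetical stand-ins rather than the LoopInfo API, and the loop ID is modelled as an integer stored per block terminator. The point is that only the header's predecessor list is scanned.

```cpp
#include <set>
#include <vector>

// Toy CFG: block IDs are indices; Preds[B] lists the predecessors of B.
using PredList = std::vector<std::vector<int>>;

// Latches are the predecessors of the header that lie inside the loop,
// so the scan touches one predecessor list, not every block in the loop.
std::vector<int> loopLatches(const PredList &Preds,
                             const std::set<int> &LoopBlocks, int Header) {
  std::vector<int> Latches;
  for (int Pred : Preds[Header])
    if (LoopBlocks.count(Pred))
      Latches.push_back(Pred);
  return Latches;
}

// Hypothetical setLoopID analogue: record the ID on each latch terminator.
void setLoopID(std::vector<int> &TerminatorLoopID, const PredList &Preds,
               const std::set<int> &LoopBlocks, int Header, int ID) {
  for (int Latch : loopLatches(Preds, LoopBlocks, Header))
    TerminatorLoopID[Latch] = ID;
}
```

  For a CFG 0 -> 1 -> 2 -> {1, 3} with loop blocks {1, 2} and header 1, the only latch is block 2, so only its terminator receives the ID.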
* [X86][SSE] Add SSE vector shift support to SimplifyDemandedVectorEltsForTargetNode vector splitting | Simon Pilgrim | 2019-05-01 | 1 | -0/+21
  llvm-svn: 359680
* Wrap to 80 columns, no behavior change | Nico Weber | 2019-05-01 | 1 | -1/+2
  llvm-svn: 359679
* [X86][SSE] Split 512-bit -> 128-bit vector directly in SimplifyDemandedVectorEltsForTargetNode | Simon Pilgrim | 2019-05-01 | 1 | -1/+4
  llvm-svn: 359678
* [X86][SSE] Add 512-bit vector support to SimplifyDemandedVectorEltsForTargetNode vector splitting | Simon Pilgrim | 2019-05-01 | 1 | -8/+15
  llvm-svn: 359677
* DAG: allow DAG pointer size different from memory representation. | Tim Northover | 2019-05-01 | 4 | -47/+134
  In preparation for supporting ILP32 on AArch64, this modifies the SelectionDAG builder code so that pointers are allowed to have a larger type when "live" in the DAG compared to memory. Pointers get zero-extended whenever they are loaded, and truncated prior to stores. In addition, a few not quite so obvious locations need updating:
    * A GEP that has not been marked inbounds needs to enforce the IR-documented 2s-complement wrapping at the memory pointer size. Inbounds GEPs are undefined if they overflow the address space, so no additional operations are needed.
    * Signed comparisons would give incorrect results if performed on the zero-extended values.
  This shouldn't affect CodeGen for now, but will become active when the AArch64 ILP32 support is committed.
  llvm-svn: 359676
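  The two subtle cases can be shown with plain integers. This is a sketch under assumptions, not SelectionDAG code: it models a 32-bit in-memory pointer held in a 64-bit "live" value, all function names are hypothetical, and the unsigned-to-signed casts assume two's-complement conversion (guaranteed since C++20, universal in practice before that).

```cpp
#include <cstdint>

// Model: 32-bit pointers in memory, 64-bit values while live.
std::uint64_t loadPointer(std::uint32_t InMemory) {
  return static_cast<std::uint64_t>(InMemory); // zero-extend on load
}

std::uint32_t storePointer(std::uint64_t Live) {
  return static_cast<std::uint32_t>(Live); // truncate before store
}

// Non-inbounds address arithmetic must wrap at the memory pointer size,
// so the 64-bit result is truncated and zero-extended again.
std::uint64_t addWrapped(std::uint64_t Base, std::int64_t Offset) {
  return loadPointer(storePointer(Base + static_cast<std::uint64_t>(Offset)));
}

// Signed less-than performed at the memory width (the intended result).
bool signedLessAtMemoryWidth(std::uint64_t A, std::uint64_t B) {
  return static_cast<std::int32_t>(storePointer(A)) <
         static_cast<std::int32_t>(storePointer(B));
}

// Signed less-than performed directly on the zero-extended values; this
// disagrees with the version above whenever bit 31 of an operand is set.
bool signedLessOnLiveValues(std::uint64_t A, std::uint64_t B) {
  return static_cast<std::int64_t>(A) < static_cast<std::int64_t>(B);
}
```

  With A = 0x80000000 and B = 1, the memory-width comparison treats A as negative and reports A < B, while the comparison on the zero-extended values reports the opposite.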
* [X86][SSE] Add X86ISD::PACKSS\PACKUS to SimplifyDemandedVectorEltsForTargetNode vector splitting | Simon Pilgrim | 2019-05-01 | 1 | -1/+7
  llvm-svn: 359673
* [X86][SSE] Add X86ISD::UNPCKL\UNPCK to SimplifyDemandedVectorEltsForTargetNode vector splitting | Simon Pilgrim | 2019-05-01 | 1 | -2/+4
  llvm-svn: 359670
* [X86][SSE] Move extract_subvector(pshufb) fold to SimplifyDemandedVectorEltsForTargetNode | Simon Pilgrim | 2019-05-01 | 1 | -12/+3
  This lets us hit more cases than combineExtractSubvector and allows us to reuse more code.
  llvm-svn: 359669
* [X86] SimplifyDemandedVectorEltsForTargetNode - pull out vector halving code. NFCI. | Simon Pilgrim | 2019-05-01 | 1 | -10/+13
  Pull out the HADD/HSUB code to halve vector widths if the upper half isn't used - prep work for adding support for other opcodes.
  llvm-svn: 359667
* [X86][SSE] Extract i1 elements from vXi1 bool vectors | Simon Pilgrim | 2019-05-01 | 1 | -0/+33
  This is an alternative to D59669 which more aggressively extracts i1 elements from vXi1 bool vectors using a MOVMSK.
  Differential Revision: https://reviews.llvm.org/D61189
  llvm-svn: 359666
* [X86FixupLEAs] Hoist the calls to isLEA out of the 3 separate functions and put it in the basic block instruction loop. NFC | Craig Topper | 2019-05-01 | 1 | -14/+9
  No need to check it 3 different times. Just do it once at the top of the loop.
  llvm-svn: 359658
* Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element" | David L. Jones | 2019-05-01 | 1 | -29/+0
  This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.
  llvm-svn: 359648
* [JITLink] Make sure we explicitly deallocate memory on failure. | Lang Hames | 2019-05-01 | 2 | -4/+11
  JITLinkGeneric phases 2 and 3 (focused on applying fixups and finalizing memory, respectively) may fail for various reasons. If this happens, we need to explicitly de-allocate the memory allocated in phase 1 (explicitly, because deallocation may also fail and so is implemented as a method returning error).
  No testcase yet: I am still trying to decide on the right way to test totally platform agnostic code like this.
  llvm-svn: 359643
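  The control flow described above can be sketched as follows. Everything here is a hypothetical stand-in: `Error` is modelled as `std::optional<std::string>` (empty means success) rather than llvm::Error, and `Allocation` and `linkPhases` are illustration-only names, not JITLink types.

```cpp
#include <optional>
#include <string>
#include <utility>

// Stand-in for an error type: std::nullopt means success.
using Error = std::optional<std::string>;

// Memory obtained in phase 1. Deallocation can itself fail, so it is a
// method returning Error rather than something done in a destructor.
struct Allocation {
  bool Live = true;
  Error deallocate() {
    Live = false;
    return std::nullopt;
  }
};

// Phases 2 and 3: on any failure, deallocate explicitly and fold a
// deallocation error (if any) into the error that is reported.
Error linkPhases(Allocation &Alloc, bool FixupsOk, bool FinalizeOk) {
  auto Fail = [&Alloc](std::string Msg) -> Error {
    if (Error DeallocErr = Alloc.deallocate())
      Msg += "; deallocation also failed: " + *DeallocErr;
    return Msg;
  };
  if (!FixupsOk)
    return Fail("phase 2: applying fixups failed");
  if (!FinalizeOk)
    return Fail("phase 3: finalizing memory failed");
  return std::nullopt; // success: the allocation stays live
}
```

  On success the allocation remains live; on a failure in either phase it is released before the error propagates.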
* [WebAssembly] Update expectations for gcc torture tests | Sam Clegg | 2019-04-30 | 1 | -0/+12
  This is needed to make the wasm waterfall green again after we land the update to WASI: https://github.com/WebAssembly/waterfall/pull/492
  Differential Revision: https://reviews.llvm.org/D61351
  llvm-svn: 359634
* [InstCombine] Limit a vector demanded elts rule which was producing invalid IR. | Philip Reames | 2019-04-30 | 1 | -0/+12
  The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the LangRef requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes.
  This should fix https://bugs.llvm.org/show_bug.cgi?id=41624.
  llvm-svn: 359633
* [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. | Alina Sbirlea | 2019-04-30 | 1 | -0/+9
  Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA.
  Reviewers: george.burgess.iv, chandlerc
  Subscribers: jlebar, Prazek, llvm-commits
  Tags: LLVM
  Differential Revision: https://reviews.llvm.org/D61043
  llvm-svn: 359627
* [ORC] Move SimpleCompiler/ConcurrentIRCompiler definitions into a .cpp file. | Lang Hames | 2019-04-30 | 2 | -0/+87
  SimpleCompiler is no longer templated, so there's no reason for this code to be in a header any more.
  llvm-svn: 359626
* [AliasAnalysis/NewPassManager] Invalidate AAManager less often. | Alina Sbirlea | 2019-04-30 | 1 | -4/+8
  Summary: This is a redo of D60914. The objective is to not invalidate AAManager, which is stateless, unless there is an explicit invalidate in one of the AAResults. To achieve this, this patch adds an API to PAC, to check precisely this: is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone.
  Reviewers: chandlerc
  Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61284
  llvm-svn: 359622
* [AMDGPU] gfx1010 VMEM and SMEM implementation | Stanislav Mekhanoshin | 2019-04-30 | 16 | -317/+1071
  Differential Revision: https://reviews.llvm.org/D61330
  llvm-svn: 359621
* Fix a few -Werror warnings: | Eric Christopher | 2019-04-30 | 1 | -4/+3
    - Remove a variable only used in an assert
    - Fix pessimizing move warning around copy elision
  llvm-svn: 359617
* [PassManagerBuilder] Add option for interleaved loops, for loop vectorize. | Alina Sbirlea | 2019-04-30 | 1 | -4/+2
  Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults.
  Reviewers: chandlerc
  Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61030
  llvm-svn: 359615
* [JITLink] Add debugging output to print resolved external atoms. | Lang Hames | 2019-04-30 | 1 | -0/+6
  llvm-svn: 359614
* [ORC][JITLink] Name in-memory compiled objects after their source modules. | Lang Hames | 2019-04-30 | 1 | -1/+2
  In-memory compiled object buffer identifiers will now be derived from the identifiers of their source IR modules. This makes it easier to connect in-memory objects with their source modules in debugging output.
  llvm-svn: 359613
* [llvm-profdata] Add overlap command to compute similarity b/w two profile files | Rong Xu | 2019-04-30 | 3 | -0/+283
  Add overlap functionality to llvm-profdata tool to compute the similarity between two profile files.
  Differential Revision: https://reviews.llvm.org/D60977
  llvm-svn: 359612
* [NFC][InlineCost] cleanup - comments, overflow handling. | Fedor Sergeev | 2019-04-30 | 1 | -52/+61
  Reviewed By: apilipenko
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60751
  llvm-svn: 359609
* [X86][SSE] Fold extract_subvector(extend(x)) -> extend_vector_inreg(x) | Simon Pilgrim | 2019-04-30 | 1 | -5/+7
  This adds any extend support - folding to zero_extend_vector_inreg (PMOVZX) for legality.
  Minor improvement for PR39709.
  llvm-svn: 359608
* Fix stack-use-after free after r359580 | Nico Weber | 2019-04-30 | 1 | -3/+4
  `Candidate` was a StringRef referring to a temporary string. Instead, create a local variable for the string and use a StringRef referring to that.
  llvm-svn: 359604
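  The same lifetime pattern, sketched with `std::string_view` standing in for StringRef (an assumption for illustration; both are non-owning views, and the sketch needs C++17). `withPrefix` and `matches` are hypothetical helpers, not the code touched by this commit.

```cpp
#include <string>
#include <string_view>

// Builds a new string; the result is a temporary at the call site.
std::string withPrefix(std::string_view Prefix, std::string_view Name) {
  return std::string(Prefix) + std::string(Name);
}

bool matches(std::string_view Prefix, std::string_view Name,
             std::string_view Wanted) {
  // Buggy form (left as a comment on purpose): the temporary std::string
  // is destroyed at the end of the full expression, so the view dangles.
  //   std::string_view Candidate = withPrefix(Prefix, Name);
  //
  // Fixed form: keep the string alive in a local, then take a view of it.
  std::string CandidateStorage = withPrefix(Prefix, Name);
  std::string_view Candidate = CandidateStorage;
  return Candidate == Wanted;
}
```

  The view is only valid for as long as `CandidateStorage` is in scope, which is why the owning local has to be declared before the view.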
* [WebAssembly] Support EXPLICIT_NAME symbols in llvm-readobj | Dan Gohman | 2019-04-30 | 1 | -0/+1
  Teach llvm-readobj about WASM_SYMBOL_EXPLICIT_NAME.
  Differential Revision: https://reviews.llvm.org/D61323
  Reviewer: sbc100
  llvm-svn: 359602
* [WebAssembly] Support f16 libcalls | Dan Gohman | 2019-04-30 | 2 | -1/+23
  Add support for f16 libcalls in WebAssembly. This entails adding signatures for the remaining F16 libcalls, and renaming gnu_f2h_ieee/gnu_h2f_ieee to truncsfhf2/extendhfsf2 for consistency between f32 and f64/f128 (compiler-rt already supports this).
  Differential Revision: https://reviews.llvm.org/D61287
  Reviewer: dschuff
  llvm-svn: 359600
* [X86] Remove if that's always true | Craig Topper | 2019-04-30 | 1 | -2/+1
  It's been like this since it was added in a refactor of this code.
  Fixes PR41659
  llvm-svn: 359597
* [SimplifyLibCalls] Clean up code (NFC) | Evandro Menezes | 2019-04-30 | 1 | -6/+8
  Fix pointer check after dereferencing (PR41665).
  llvm-svn: 359595
* [X86] If PreprocessISelDAG reorders a load before a call, make sure we remove dead nodes from the graph | Craig Topper | 2019-04-30 | 1 | -0/+5
  The reordering can leave at least a dead TokenFactor in the graph. This causes the linearize scheduler to fail with something like the assert seen in PR22614. This is only one of many ways we can break the linearize scheduler today so I can't say for sure that any of the other failures in that bug were caused by this issue.
  This takes the heavy hammer approach of just running RemoveDeadNodes unconditionally at the end of the PreprocessISelDAG. If this turns out to be a compile time hit, we can try to refine it.
  Differential Revision: https://reviews.llvm.org/D61164
  llvm-svn: 359582
* [X86] Initial cleanups on the FixupLEAs pass. Separate Atom LEA creation from other LEA optimizations. | Craig Topper | 2019-04-30 | 1 | -91/+75
  This removes some of the class variables. Merge basic block processing into runOnMachineFunction to keep the flags local.
  Pass MachineBasicBlock around instead of an iterator. We can get the iterator in the few places that need it. Allows a range-based outer for loop.
  Separate the Atom optimization from the rest of the optimizations. This allows fixupIncDec to create INC/DEC and still allows Atom to turn it back into LEA when profitable by its heuristics.
  I'd like to improve fixupIncDec to turn LEAs into ADD any time the base or index register is equal to the destination register. This is profitable regardless of the various slow flags. But again we would want Atom to be able to undo that.
  Differential Revision: https://reviews.llvm.org/D60993
  llvm-svn: 359581
* Re-reland "[Option] Fix PR37006 prefix choice in findNearest" | Nico Weber | 2019-04-30 | 1 | -24/+24
  This was first reviewed in https://reviews.llvm.org/D46776 and landed in r332299, but got reverted because it broke the PS4 bots.
  https://reviews.llvm.org/D50410 fixed this, and then this change was re-reviewed at https://reviews.llvm.org/D50515 and relanded in r341329. It got reverted due to causing MSan issues. However, nobody wrote down the error message and the bot link is dead, so I'm relanding this to capture the MSan error. I'll then either fix it, or copy it somewhere and revert if fixing looks difficult.
  llvm-svn: 359580