Commit message | Author | Age | Files | Lines
* [LLVMSymbolize] Properly propagate object parsing errors from the library. | Alexey Samsonov | 2015-11-04 | 3 | -106/+138
  llvm-svn: 252021
* [llvm-symbolizer] Improve the test for missing input file. | Alexey Samsonov | 2015-11-04 | 1 | -1/+3
  llvm-svn: 252020
* Fix unused variable warning from r252017 | Adam Nemet | 2015-11-04 | 1 | -3/+2
  llvm-svn: 252019
* Fix an issue where LLDB would truncate summaries for string types without producing any evidence thereof | Enrico Granata | 2015-11-04 | 7 | -22/+99
  llvm-svn: 252018
* LLE 6/6: Add LoopLoadElimination pass | Adam Nemet | 2015-11-03 | 13 | -0/+857
  Summary:
  The goal of this pass is to perform store-to-load forwarding across the
  backedge of a loop. E.g.:

    for (i)
      A[i + 1] = A[i] + B[i]

  =>

    T = A[0]
    for (i)
      T = T + B[i]
      A[i + 1] = T

  The pass relies on loop dependence analysis via LoopAccessAnalysis to find
  opportunities of loop-carried dependences with a distance of one between a
  store and a load. Since it's using LoopAccessAnalysis, it was easy to also
  add support for versioning away may-aliasing intervening stores that would
  otherwise prevent this transformation.

  This optimization is also performed by Load-PRE in GVN without the option
  of multi-versioning. As was discussed with Daniel Berlin in
  http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
  solution applied here. Hopefully, we will be able to remove some
  complexity from GVN/MemorySSA as a consequence.

  In the long run, we may want to extend this pass (or create a new one if
  there is little overlap) to also eliminate loop-independent redundant
  loads and stores that *require* versioning due to may-aliasing intervening
  stores/loads. I have some motivating cases for store elimination. My plan
  right now is to wait for MemorySSA to come online first rather than using
  memdep for this.

  The main motivation for this pass is the 456.hmmer loop in SPECint2006
  where after distributing the original loop and vectorizing the top part,
  we are left with the critical path exposed in the bottom loop. Being able
  to promote the memory dependence into a register dependence (even though
  the HW does perform store-to-load forwarding as well) results in a major
  gain (~20%). This gain also transfers over to x86: it's around 8-10%.

  Right now the pass is off by default and can be enabled with
  -enable-loop-load-elim. On the LNT testsuite, there are two performance
  changes (negative number -> improvement):

    1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
       critical paths is reduced
    2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce
       this outside of LNT

  The pass is scheduled after the loop vectorizer (which is after loop
  distribution). The rationale is to try to reuse LAA state, rather than
  recomputing it. The order between LV and LLE is not critical because
  normally LV does not touch scalar st->ld forwarding cases where
  vectorizing would keep the CPU's st->ld forwarding from kicking in.

  LoopLoadElimination requires LAA to provide the full set of dependences
  (including forward dependences). LAA is known to omit loop-independent
  dependences in certain situations. The big comment before
  removeDependencesFromMultipleStores explains why this should not occur for
  the cases that we're interested in.

  Reviewers: dberlin, hfinkel
  Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13259
  llvm-svn: 252017
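
The store-to-load forwarding described in the LoopLoadElimination summary can be illustrated at the source level. The following is only a sketch of the effect, not the pass itself, which operates on LLVM IR: the function names are hypothetical, and using two distinct std::vector objects sidesteps the may-alias case that the pass handles by versioning.

```cpp
#include <cstddef>
#include <vector>

// Original form: each iteration stores A[i + 1], and the next iteration
// reloads that same value as A[i].
// Precondition (for both functions): A.size() >= B.size() + 1.
void recurrenceThroughMemory(std::vector<int> &A, const std::vector<int> &B) {
  for (std::size_t i = 0; i < B.size(); ++i)
    A[i + 1] = A[i] + B[i];
}

// Forwarded form: the value stored in one iteration is carried to the next
// iteration in the scalar T, so the loop body no longer loads from A.
void recurrenceThroughRegister(std::vector<int> &A, const std::vector<int> &B) {
  if (B.empty())
    return; // zero iterations: nothing to load or store
  int T = A[0]; // the only load from A, hoisted out of the loop
  for (std::size_t i = 0; i < B.size(); ++i) {
    T = T + B[i];
    A[i + 1] = T;
  }
}
```

Both functions leave the same contents in A; the second one exposes the loop-carried value as a register dependence instead of a memory dependence.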
* [LAA] LLE 5/6: Add predicate functions Dependence::isForward/isBackward, NFC | Adam Nemet | 2015-11-03 | 2 | -3/+28
  Summary: Will be used by the LoopLoadElimination pass.

  Reviewers: hfinkel
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D13258
  llvm-svn: 252016
* [LAA] LLE 4/6: APIs to access the dependent instructions for a dependence, NFC | Adam Nemet | 2015-11-03 | 1 | -0/+17
  Summary: The functions use LAI and MemoryDepChecker classes so they need
  to be defined after those definitions outside of the Dependence class.
  Will be used by the LoopLoadElimination pass.

  Reviewers: hfinkel
  Subscribers: rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13257
  llvm-svn: 252015
* CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering. | Peter Collingbourne | 2015-11-03 | 5 | -27/+37
  A profile of an LTO link of Chrome revealed that we were spending some
  ~30-50% of execution time in the function Constant::getRelocationInfo(),
  which is called from TargetLoweringObjectFile::getKindForGlobal() and in
  turn from TargetMachine::getNameWithPrefix().

  It turns out that we only need the result of getKindForGlobal() when
  targeting Mach-O, so this change moves the relevant part of the logic to
  TargetLoweringObjectFileMachO. NFCI.

  Differential Revision: http://reviews.llvm.org/D14168
  llvm-svn: 252014
* All instance variables start with "m_". Fix "options" to be "m_options". | Greg Clayton | 2015-11-03 | 2 | -57/+57
  llvm-svn: 252013
* Fix __fp16 types so we can display them and use them in expressions. | Greg Clayton | 2015-11-03 | 1 | -3/+8
  I am not adding a test case for this since I don't know how portable the
  __fp16 type is between compilers and I don't want to break the test suite.

  <rdar://problem/22375079>
  llvm-svn: 252012
* Simplify the logic to avoid the Closed set. | Rafael Espindola | 2015-11-03 | 1 | -31/+28
  IMHO this makes the code easier to read: at each iteration we add a
  section to a PT_LOAD and increase its size.
  llvm-svn: 252011
* AMDGPU: Make flat_scratch name consistent | Matt Arsenault | 2015-11-03 | 1 | -3/+3
  The printed name and the parsed assembler names weren't the same. I'm not
  sure which name SC prints these as, but I think it's this one.
  llvm-svn: 252010
* AMDGPU: Fix asserts on invalid register ranges | Matt Arsenault | 2015-11-03 | 4 | -5/+73
  If the requested SGPR was not actually aligned, it was accepted and
  rounded down instead of rejected. Also fix an assert if the range is an
  invalid size.
  llvm-svn: 252009
* AMDGPU: Fix off by one error in register parsing | Matt Arsenault | 2015-11-03 | 2 | -4/+19
  If trying to use one past the end, this would assert.
  llvm-svn: 252008
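
The two register-parsing fixes above share one idea: a malformed range should be rejected with an error result, not rounded down or allowed to trip an assert. Below is a minimal sketch of that kind of validation. It is not the AMDGPU assembler code: RegRange and isValidSGPRRange are hypothetical names, and both the accepted widths and the "align to min(width, 4)" rule are assumptions made for illustration.

```cpp
#include <algorithm>

// Hypothetical description of a parsed register tuple such as s[Lo:Hi].
struct RegRange {
  unsigned Lo;
  unsigned Hi;
};

// Returns true only if the range is well formed. The set of accepted widths
// and the alignment rule are assumptions of this sketch.
bool isValidSGPRRange(RegRange R, unsigned NumSGPRs) {
  if (R.Hi < R.Lo || R.Hi >= NumSGPRs)
    return false; // reversed range, or an index at or past the end
  unsigned Width = R.Hi - R.Lo + 1;
  bool KnownWidth =
      Width == 1 || Width == 2 || Width == 4 || Width == 8 || Width == 16;
  if (!KnownWidth)
    return false; // invalid size: report failure instead of asserting
  unsigned Align = std::min(Width, 4u);
  return R.Lo % Align == 0; // misaligned: reject instead of rounding down
}
```

Returning a plain success flag lets the caller turn each failure into a diagnostic, which is the behavior the commit messages describe.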
* Fix build for go parser unittest. | Ryan Brown | 2015-11-03 | 2 | -1/+13
  llvm-svn: 252007
* [elf2] Use value-initialization instead of memset. | Michael J. Spencer | 2015-11-03 | 1 | -2/+1
  llvm-svn: 252006
* Fix a deadlock when connecting to a remote GDB server that might not support all packets that lldb-server or debugserver supports. | Greg Clayton | 2015-11-03 | 1 | -16/+20
  The issue was the m_last_stop_packet_mutex mutex was being held by another
  thread and it was deadlocking getting the thread list. We now try to lock
  the m_last_stop_packet_mutex, and only continue if we successfully lock
  it. Else we fall back to qfThreadInfo/qsThreadInfo.

  <rdar://problem/22140023>
  llvm-svn: 252005
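
The "try to lock, else fall back" approach in the deadlock fix above can be sketched with the standard library. This is only an illustration of the pattern, not the LLDB code: StopInfoCache and readThreadInfo are hypothetical names, and the fallback is passed in as a string where the real code would issue qfThreadInfo/qsThreadInfo packets.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for state guarded by a mutex that another thread
// may hold for a long time.
struct StopInfoCache {
  std::mutex Mutex;
  std::string LastStopPacket;
};

// Returns the cached packet if the lock can be taken without blocking;
// otherwise returns the fallback value instead of waiting on the mutex.
std::string readThreadInfo(StopInfoCache &Cache, const std::string &Fallback) {
  std::unique_lock<std::mutex> Lock(Cache.Mutex, std::try_to_lock);
  if (!Lock.owns_lock())
    return Fallback; // lock not acquired: use the slower, lock-free path
  return Cache.LastStopPacket;
}
```

Because the caller never blocks on the mutex, it cannot take part in a deadlock cycle involving that mutex; the cost is that it sometimes takes the fallback path even when the cached data would have been available shortly.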
* Address nit | Derek Schuff | 2015-11-03 | 1 | -32/+32
  llvm-svn: 252004
* Align whitespace | Derek Schuff | 2015-11-03 | 2 | -4/+4
  llvm-svn: 252003
* [WebAssembly] Support wasm select operator | Derek Schuff | 2015-11-03 | 3 | -0/+73
  Summary: Add support for wasm's select operator, and lower LLVM's select
  DAG node to it.

  Reviewers: sunfish
  Subscribers: dschuff, llvm-commits, jfb
  Differential Revision: http://reviews.llvm.org/D14295
  llvm-svn: 252002
* With the new modules debugging, we have seen cases where clang is not emitting full definitions for types that are member variables of classes. | Greg Clayton | 2015-11-03 | 1 | -0/+17
  If we try to make a class with a member where the type of the class is a
  forward declaration, clang will assert and crash and bring down the IDE.
  This is not acceptable so we need to work around it. We work around it by
  making sure that if we have a member that is an instance (not a pointer or
  reference) of a class/struct/union, that it is a complete type. If it
  isn't then we emit an error to let the user know to file a bug against the
  compiler, and then we make the class complete, but empty. We also do this
  for base classes elsewhere. We use the DWARF to help lay out the type, so
  we will get all instance variables correct, but we just won't have
  visibility into this instance variable.
  llvm-svn: 252001
* AMDGPU: s[102:103] is unavailable on VI | Matt Arsenault | 2015-11-03 | 1 | -1/+10
  llvm-svn: 252000
* AMDGPU: Define correct number of SGPRs | Matt Arsenault | 2015-11-03 | 3 | -6/+13
  There are actually 104 so 2 were missing. More assembler tests with high
  register number tuples will be included in later patches.
  llvm-svn: 251999
* [elf2] Implement R_X86_64_TPOFF32. | Michael J. Spencer | 2015-11-03 | 5 | -4/+73
  This does not support TPOFF32 relocations to local symbols as the address
  calculations are separate. Support for this will be a separate patch.
  llvm-svn: 251998
* Revert change committed accidentally as r251992 | Tamas Berghammer | 2015-11-03 | 1 | -3/+0
  llvm-svn: 251997
* AMDGPU: Make findUsedSGPR more readable | Matt Arsenault | 2015-11-03 | 1 | -7/+18
  Add more comments etc.
  llvm-svn: 251996
* AMDGPU: Initialize SIFixSGPRCopies so -print-after works | Matt Arsenault | 2015-11-03 | 3 | -8/+15
  llvm-svn: 251995
* AMDGPU: Alphabetize includes | Matt Arsenault | 2015-11-03 | 1 | -1/+1
  llvm-svn: 251994
* Use std::list::splice in TaskPool to avoid an allocation | Tamas Berghammer | 2015-11-03 | 1 | -2/+1
  Using std::list::splice to move an element from one list to another
  avoids the allocation of a new element and a move of the data.
  llvm-svn: 251993
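
The effect of std::list::splice described above can be seen in a few lines. This is a sketch with a hypothetical helper name, moveFrontToBack, not the TaskPool code: splice relinks the existing list node into the destination list, so no element is allocated, copied, or moved.

```cpp
#include <list>
#include <string>

// Moves the first element of From to the back of To by relinking its list
// node. Returns false (and does nothing) if From is empty.
bool moveFrontToBack(std::list<std::string> &From, std::list<std::string> &To) {
  if (From.empty())
    return false;
  // Single-element splice: transfers the node pointed to by From.begin().
  To.splice(To.end(), From, From.begin());
  return true;
}
```

Since the node itself is transferred, pointers and references to the element stay valid after the call, which would not hold if the element were popped from one list and pushed onto the other.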
* wip | Tamas Berghammer | 2015-11-03 | 1 | -0/+3
  llvm-svn: 251992
* InstCombine: fix sinking of convergent calls | Fiona Glaser | 2015-11-03 | 1 | -0/+6
  llvm-svn: 251991
* [SelectionDAG] Use existing constant nodes instead of recreating them. NFC. | Simon Pilgrim | 2015-11-03 | 1 | -9/+6
  llvm-svn: 251990
* [LLVMSymbolize] Factor out the logic for printing structs from DIContext. NFC. | Alexey Samsonov | 2015-11-03 | 6 | -80/+135
  Introduce DIPrinter which takes care of rendering DILineInfo and friends.
  This allows the LLVMSymbolizer class to return structured data instead of
  plain std::strings.
  llvm-svn: 251989
* Handle 0 sized sections like any other section. | Rafael Espindola | 2015-11-03 | 13 | -82/+105
  This is a case where there is inconsistency among ELF linkers:

  * The spec says nothing special about empty sections.
  * BFD ld removes them.
  * Gold handles them like regular sections.

  We were outputting them but sometimes ignoring them. This would create odd
  looking outputs where a rw section could be in a ro segment for example.

  The bfd way of doing things is also strange for the case where a symbol
  points to the empty section.

  Now we match gold and what seems to be the intention of the spec.
  llvm-svn: 251988
* Remove redundant = nullptr. | Rafael Espindola | 2015-11-03 | 1 | -1/+1
  llvm-svn: 251987
* [X86][AVX] Tweaked shuffle stack folding tests | Simon Pilgrim | 2015-11-03 | 2 | -5/+5
  To avoid alternative lowerings.
  llvm-svn: 251986
* [LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC | Adam Nemet | 2015-11-03 | 11 | -74/+63
  Summary: We now collect all types of dependences including lexically
  forward deps not just "interesting" ones.

  Reviewers: hfinkel
  Subscribers: rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13256
  llvm-svn: 251985
* [X86][AVX512] Fixed shuffle test name to match shuffle | Simon Pilgrim | 2015-11-03 | 1 | -2/+2
  llvm-svn: 251984
* Python 3 - Fix checking of string types in unittest2 module. | Zachary Turner | 2015-11-03 | 5 | -11/+17
  This patch actually introduces a dependency from unittest2 to six. This
  should be ok since both packages are in our own repo, and we assume a
  sys.path of the top-level script that can find the third party packages.
  So unittest2 should be able to find six.
  llvm-svn: 251983
* Introduce seven.cmp_ and use it instead of cmp | Zachary Turner | 2015-11-03 | 2 | -1/+5
  llvm-svn: 251982
* [LLVMSymbolize] Move demangling away from printing routines. NFC. | Alexey Samsonov | 2015-11-03 | 2 | -34/+36
  Make printDILineInfo and friends responsible for just rendering the
  contents of the structures; demangling should actually be performed
  earlier, when we have the information about the originating
  SymbolizableModule at hand.
  llvm-svn: 251981
* Create .bss only when needed. | Rafael Espindola | 2015-11-03 | 9 | -122/+66
  This is a small complication, but produces nicer output and is a step to
  handling zero size sections uniformly.
  llvm-svn: 251980
* Squelch a silly warning regarding an extra 'default' in 'case' | Ramkumar Ramachandra | 2015-11-03 | 1 | -39/+38
  Let the editor also clean up whitespace for that file.

  Reviewers: clayborg
  Subscribers: lldb-commits
  Differential Revision: http://reviews.llvm.org/D13816
  llvm-svn: 251979
* Python 3 - Fix some issues in unittest2. | Zachary Turner | 2015-11-03 | 4 | -12/+22
  unittest2 was using print statements in a few places, and also using the
  `cmp` function (which is removed in Python 3). Again, we need to stop
  using unittest2 and use unittest instead, but this seems like an easier
  route for now.
  llvm-svn: 251978
* Python 3: Modernize exception raising syntax. | Zachary Turner | 2015-11-03 | 2 | -3/+3
  Old-style: `raise foo, bar`
  New-style: `raise foo(bar)`

  These two statements are equivalent, but the former is an error in
  Python 3.
  llvm-svn: 251977
* [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y) | Davide Italiano | 2015-11-03 | 4 | -0/+81
  This one is enabled only under -ffast-math (due to rounding/overflows) but
  allows us to emit shorter code.

  Before (on FreeBSD x86-64):
    4007f0: 50                      push   %rax
    4007f1: f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
    4007f6: e8 75 fd ff ff          callq  400570 <exp2@plt>
    4007fb: f2 0f 10 0c 24          movsd  (%rsp),%xmm1
    400800: 58                      pop    %rax
    400801: e9 7a fd ff ff          jmpq   400580 <pow@plt>
    400806: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    40080d: 00 00 00

  After:
    4007b0: f2 0f 59 c1             mulsd  %xmm1,%xmm0
    4007b4: e9 87 fd ff ff          jmpq   400540 <exp2@plt>
    4007b9: 0f 1f 80 00 00 00 00    nopl   0x0(%rax)

  Differential Revision: http://reviews.llvm.org/D14045
  llvm-svn: 251976
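
The reason this rewrite is restricted to -ffast-math can be checked numerically. The sketch below uses hypothetical helper names and plain libm calls; it is not the SimplifyLibCalls code. The two forms agree closely for moderate inputs, but the original form can overflow in the intermediate exp(x) where the rewritten form does not.

```cpp
#include <cmath>

// Left-hand side of the rewrite: two libm calls.
double powOfExp(double X, double Y) { return std::pow(std::exp(X), Y); }

// Right-hand side of the rewrite: one multiply and one libm call.
double expOfProduct(double X, double Y) { return std::exp(X * Y); }

// True if A and B agree to within the relative tolerance RelTol.
bool nearlyEqual(double A, double B, double RelTol) {
  return std::fabs(A - B) <= RelTol * std::fmax(std::fabs(A), std::fabs(B));
}
```

For example, with x = 800 and y = 0.5 the intermediate exp(800) exceeds the range of double, so the original form yields infinity while exp(400) is still finite. The results can therefore differ, which is why the transformation is not valid under strict floating-point semantics.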
* [X86][XOP] Add support for the matching of the VPCMOV bit select instruction | Simon Pilgrim | 2015-11-03 | 4 | -3/+185
  XOP has the VPCMOV instruction that performs the common vector bit select
  operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )

  This patch adds tablegen pattern matching for this instruction.

  Differential Revision: http://reviews.llvm.org/D8841
  llvm-svn: 251975
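
The bit select expression OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ) can be modeled on a single 64-bit word. This is a scalar sketch with a hypothetical function name, bitSelect; the instruction applies the same per-bit rule across a whole vector register.

```cpp
#include <cstdint>

// Scalar model of the bit select: each result bit comes from Src1 where the
// corresponding Mask bit is 1, and from Src2 where it is 0.
constexpr std::uint64_t bitSelect(std::uint64_t Src1, std::uint64_t Src2,
                                  std::uint64_t Mask) {
  return (Src1 & Mask) | (Src2 & ~Mask);
}
```

An all-ones mask returns Src1 unchanged and an all-zeros mask returns Src2, which makes the operation a per-bit conditional move.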
* llvm-pdbdump: Make BuiltinDumper shorter. NFC. | Rui Ueyama | 2015-11-03 | 2 | -41/+27
  llvm-svn: 251974
* [LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence | Adam Nemet | 2015-11-03 | 3 | -1/+73
  Summary: When the dependence distance is zero then we have a
  loop-independent dependence from the earlier to the later access.

  No current client of LAA uses forward dependences so other than
  potentially hitting the MaxDependences threshold earlier, this change
  shouldn't affect anything right now.

  This and the previous patch were tested together for compile-time
  regression. None found in LNT/SPEC.

  Reviewers: hfinkel
  Subscribers: rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13255
  llvm-svn: 251973
* [LAA] LLE 1/6: Expose Forward dependences | Adam Nemet | 2015-11-03 | 3 | -13/+53
  Summary: Before this change, we did not collect forward dependences since
  none of the current clients (LV, LDist) required them.

  The motivation to also collect forward dependences is a new pass
  LoopLoadElimination (LLE) which discovers store-to-load forwarding
  opportunities across the loop's backedge. The pass uses both lexically
  forward and backward loop-carried dependences to detect these
  opportunities.

  The new pass also analyzes loop-independent (forward) dependences since
  they can conflict with the loop-carried dependences in terms of how the
  data flows through memory.

  The newly added test only covers loop-carried forward dependences because
  loop-independent ones are currently categorized as NoDep. The next patch
  will fix this.

  The two patches were tested together for compile-time regression. None
  found in LNT/SPEC.

  Note that with this change LAA provides all dependences rather than just
  "interesting" ones. A subsequent NFC patch will remove the now trivial
  isInterestingDependence and rename the APIs.

  Reviewers: hfinkel
  Subscribers: jmolloy, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13254
  llvm-svn: 251972