summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [FastISel][SelectionDAG]Teach fastISel about GC intrinsicsAnna Thomas2017-07-041-1/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: We are crashing in LLC at O0 when gc intrinsics are present in the block. The reason being FastISel performs basic block ISel by modifying GC.relocates to be the first instruction in the block. This can cause us to visit the GC relocate before it's corresponding GC.statepoint is visited, which is incorrect. When we lower the statepoint, we record the base and derived pointers, along with the gc.relocates. After this we can visit the gc.relocate. This patch avoids fastISel from incorrectly creating the block with gc.relocate as the first instruction. Reviewers: qcolombet, skatkov, qikon, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34421 llvm-svn: 307084
* [AMDGPU] Fix latency of MIMG instructionsMarek Olsak2017-07-041-0/+1
| | | | | | Patch by cwabbott (Connor Abbott). llvm-svn: 307081
* [globalisel][tablegen] Partially fix compile-time regressions by converting ↵Daniel Sanders2017-07-044-0/+9
| | | | | | | | | | | | | | | | | | | | | | matcher to state-machine(s) Summary: Replace the matcher if-statements for each rule with a state-machine. This significantly reduces compile time, memory allocations, and cumulative memory allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is recommitted. The following patches will expand on this further to fully fix the regressions. Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33758 llvm-svn: 307079
* [LoopDeletion] NFC: Add debug statements to the optimizationAnna Thomas2017-07-041-13/+24
| | | | | | | We have a DEBUG option for loop deletion, but no related debug messages. Added some debug messages to state why loop deletion failed. llvm-svn: 307078
* [InstCombine] Add TODOs for a couple things that should maybe be in ↵Craig Topper2017-07-041-1/+3
| | | | | | InstSimplify instead. NFC llvm-svn: 307065
* [X86] Add comment string for broadcast loads from the constant pool.Craig Topper2017-07-041-37/+156
| | | | | | | | | | | | | | | | | Summary: When broadcasting from the constant pool its useful to print out the final vector similar to what we do for normal moves from the constant pool. I changed only a couple tests that were broadcast focused. One of them had been previously hand tweaked after running the script so that it could check the constant pool declaration. But I think this patch makes that unnecessary now since we can check the comment instead. Reviewers: spatel, RKSimon, zvi Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34923 llvm-svn: 307062
* [X86] Add RDRAND feature to GLM CPUCraig Topper2017-07-041-0/+1
| | | | | | | | | | | | | | Summary: I believe this should be supported on GLM since RDSEED is. Reviewers: m_zuckerman, zvi, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34828 llvm-svn: 307060
* [Orc] Remove the memory manager argument to addModule, and de-templatize theLang Hames2017-07-042-12/+18
| | | | | | | | | | | | | | | | symbol resolver argument. De-templatizing the symbol resolver is part of the ongoing simplification of ORC layer API. Removing the memory management argument (and delegating construction of memory managers for RTDyldObjectLinkingLayer to a functor passed in to the constructor) allows us to build JITs whose base object layers need not be compatible with RTDyldObjectLinkingLayer's memory mangement scheme. For example, a 'remote object layer' that sends fully relocatable objects directly to the remote does not need a memory management scheme at all (that will be handled by the remote). llvm-svn: 307058
* [AVR] Fix bug which caused assertion errors for some FRMIDX instructionsDylan McKay2017-07-041-3/+8
| | | | | | | | | | | | | Previously, if a basic block ended with a FRMIDX instruction, we would end up doing something like this. *std::next(MBB.end()) Which would hit an error: "Assertion `!NodePtr->isKnownSentinel()' failed." llvm-svn: 307057
* [AVR] Add a missing clobber declaration to LPMWDylan McKay2017-07-041-6/+6
| | | | llvm-svn: 307056
* [DAG] Fixed predicate for determining when two frame indicesNirav Dave2017-07-041-5/+5
| | | | | | addresses are comparable. NFCI. llvm-svn: 307055
* Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"NAKAMURA Takumi2017-07-041-1/+1
| | | | | | | | | It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054
* [legalize-types] Clean up softening machinery.Anton Yartsev2017-07-044-42/+89
| | | | | | | | The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265. Differential Revision: https://reviews.llvm.org/D31946 llvm-svn: 307053
* [X86][SSE4A] Add support for combining from EXTRQI/INSERTQI shufflesSimon Pilgrim2017-07-031-0/+22
| | | | llvm-svn: 307048
* DAGCombine: Combine BUILD_VECTOR to TRUNCATEZvi Rackover2017-07-032-0/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series. Example: v8i32 V = ... v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6)) --> v4i32 truncate (bitcast V to v4i64) Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html. Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena Reviewed By: delena Subscribers: guyblank, delena, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34077 llvm-svn: 307036
* [LoopInterchange] Add more debug messages to currentLimitations(). Florian Hahn2017-07-031-10/+34
| | | | | | | | | | | | | | Summary: This makes it easier to find out which limitation prevented this pass from doing its work. Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34940 llvm-svn: 307035
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-031-1/+1
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
* Revert "[GVN] Recommit the patch "Add phi-translate support in scalarpre"."Benjamin Kramer2017-07-031-158/+28
| | | | | | | This reverts commit r306313. This breaks selfhost at -O3 and PR33652. Let me know if you need additional information on reproducing the issue. llvm-svn: 307021
* [GlobalISel][X86] fix %ptr(p0) = G_CONSTANT selection.Igor Breger2017-07-031-1/+2
| | | | llvm-svn: 307019
* fix trivial typos in comments; NFCHiroshi Inoue2017-07-035-5/+5
| | | | llvm-svn: 307004
* [InstCombine] Add a TODO for a probable missing single use check. NFCCraig Topper2017-07-031-0/+2
| | | | | | Will try to fix it soon, but in case I forget. llvm-svn: 307003
* [InstCombine] Support BITWISE_OP( BSWAP(x), CONSTANT ) -> BSWAP( ↵Craig Topper2017-07-031-18/+12
| | | | | | BITWISE_OP(x, BSWAP(CONSTANT) ) ) for splat vectors. llvm-svn: 307002
* [InstCombine] Remove support for BITWISE_OP(CONSTANT, BSWAP(x)) -> ↵Craig Topper2017-07-031-7/+2
| | | | | | | | BSWAP(OP(BSWAP(CONSTANT), x)). Constants were already canonicalized to the right hand side before we got here. llvm-svn: 307000
* [InstCombine] Support BITWISE_OP(BSWAP(A),BSWAP(B))->BSWAP(BITWISE_OP(A, B)) ↵Craig Topper2017-07-031-7/+3
| | | | | | for vectors. llvm-svn: 306999
* [InstCombine] Remove an if that should have been guaranteed by the caller. ↵Craig Topper2017-07-031-4/+2
| | | | | | Replace with an assert. NFC llvm-svn: 306997
* AMDGPU: Add operand target flags serializationMatt Arsenault2017-07-022-0/+26
| | | | llvm-svn: 306995
* [X86][AVX512VPOPCNTDQ] Improve support for v16i8/v8i16/v16i16/ CTPOPSimon Pilgrim2017-07-021-0/+14
| | | | | | Zero extend to v16i32/v8i64, use VPOPCNTDQ instructions and truncate back. llvm-svn: 306990
* [InstCombine] Use m_BitReverse pattern match helper. NFCI.Simon Pilgrim2017-07-021-2/+2
| | | | llvm-svn: 306986
* [InstCombine] fix crash when folding cmp+bswap vectorSanjay Patel2017-07-021-5/+9
| | | | | | | | | We assumed the constant was a scalar when creating the replacement operand. Also, improve tests for this fold and move the tests for this fold to their own file. I'll move the related and missing tests to this file as a follow-up. llvm-svn: 306985
* [InstCombine] look through bswap/bitreverse for equality comparisonsSanjay Patel2017-07-021-0/+9
| | | | | | | | | I noticed this missed bswap optimization in the CGP memcmp() expansion, and then I saw that we don't have the fold in InstCombine. Differential Revision: https://reviews.llvm.org/D34763 llvm-svn: 306980
* [X86][SSE] Attempt to combine 64-bit and 32-bit shuffles to unary shuffles ↵Simon Pilgrim2017-07-021-32/+21
| | | | | | | | before bit shifts We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive llvm-svn: 306978
* [X86][SSE] Attempt to combine 64-bit and 16-bit shuffles to unary shuffles ↵Simon Pilgrim2017-07-021-64/+56
| | | | | | | | | | before bit shifts We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive The 32-bit shuffles are a bit tricky and will be dealt with in a later patch llvm-svn: 306977
* [X86][CM] update add\sub costs of vectors of 64 in X86\SLM archMohammed Agabaria2017-07-021-4/+9
| | | | | | | | | this patch updates the cost of addq\subq (add\subtract of vectors of 64bits) based on the performance numbers of SLM arch. Differential Revision: https://reviews.llvm.org/D33983 llvm-svn: 306974
* [GlobalISel][X86] Support G_GLOBAL_VALUE operation.Igor Breger2017-07-022-8/+64
| | | | | | | | | | | | | | | | | Summary: Support G_GLOBAL_VALUE operation. For now most of the PIC configurations not implemented yet. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34738 Conflicts: test/CodeGen/X86/GlobalISel/regbankselect-X86_64.mir llvm-svn: 306972
* [GlobalISel][X86] Support vector type G_UNMERGE_VALUES selection.Igor Breger2017-07-021-0/+31
| | | | | | | | | | | | | | | | Summary: Support vector type G_UNMERGE_VALUES selection. For now G_UNMERGE_VALUES marked as legal for any type, so nothing to do in legalizer. Reviewers: t.p.northover, qcolombet, zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D33665 llvm-svn: 306971
* fix trivial typos; NFCHiroshi Inoue2017-07-024-4/+4
| | | | | | suport -> support llvm-svn: 306968
* [InstCombine] Fold (a | b) ^ (~a | ~b) --> ~(a ^ b) and (a & b) ^ (~a & ~b) ↵Craig Topper2017-07-021-2/+18
| | | | | | | | | | | | | | | | | | | --> ~(a ^ b) Summary: I came across this while thinking about what would happen if one of the operands in this xor pattern was itself a inverted (A & ~B) ^ (~A & B)-> (A^B). The patterns here assume that the (~a | ~b) will be demorganed to ~(a & b) first. Though I wonder if there's a multiple use case that would prevent the demorgan. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34870 llvm-svn: 306967
* [CodeExtractor] Remove unneded and commented out debugging stmts.Davide Italiano2017-07-021-6/+0
| | | | llvm-svn: 306966
* fix trivial typos, NFCHiroshi Inoue2017-07-012-3/+3
| | | | llvm-svn: 306952
* [SelectionDAGBuilder] Use EVT::getVectorVT instead of MVT::getVectorVT to ↵Craig Topper2017-07-011-1/+1
| | | | | | prevent a crash if the type isn't a simple VT. llvm-svn: 306950
* Revert "Revert "Replace trivial use of external rc.exe by writing our own ↵Eric Beckmann2017-07-012-18/+10
| | | | | | | | | | | | | | | | | | .res file."" Summary: This reverts commit 51931072a7c9a52540baf76fc30ef391d2529a2f. This revert was originally done because the integrations of the new WindowsResource library into LLD was causing error in chromium, due to bugs in how resource sections were handled. These bugs were fixed, meaning that the features may be reintegrated. Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D34922 llvm-svn: 306941
* Remove the default ARMSubtarget from the ARM TargetMachine.Eric Christopher2017-07-013-11/+20
| | | | | | | This enables us to ensure better LTO and code generation in the face of module linking. Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing. llvm-svn: 306939
* [Cloner] Re-map simplfied cloned instructions.Davide Italiano2017-07-011-5/+4
| | | | | | | | | | | | | | | This commit pretty much rolls back the logic added in r306495 as in the testcase provided we simplify an `icmp` looking through a PHI that hasn't been mapped yet. I think instsimplify shouldn't do threading over select/phis or just looking through phis in general, but this is what we have now. Also, add a test to prevent this from happening in case somebody wants to modify this code again. Briefly discussed with Kyle Butt (thanks Kyle!). llvm-svn: 306938
* Recommit "r306541 - Add zero-length check to memcpy/memset load store loop ↵Teresa Johnson2017-07-011-5/+14
| | | | | | | | | | expansion"" With fix for use-after-free errors. We can't add the new branch and remove the old one until we are done with the Builder constructed for the block. llvm-svn: 306937
* Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by ↵Teresa Johnson2017-07-011-1/+1
| | | | | | | | | default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306936
* re-commit r306336: Enable vectorizer-maximize-bandwidth by default.Teresa Johnson2017-07-011-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306935
* revert r306336 for breaking ppc test.Teresa Johnson2017-07-011-1/+1
| | | | llvm-svn: 306934
* Enable vectorizer-maximize-bandwidth by default.Teresa Johnson2017-07-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306933
* Rewrite ARM execute only support to avoid the use of a command line flag and ↵Eric Christopher2017-07-014-29/+21
| | | | | | | | unqualified ARMSubtarget lookup. Paired with a clang commit to use the new behavior. llvm-svn: 306927
* [ObjectYAML] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-07-0111-132/+227
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 306925
OpenPOWER on IntegriCloud