summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Allow lowering call sites with both funclets and deopt stateSanjoy Das2016-03-221-5/+1
| | | | | | | Lowering funclets is a no-op, so we can just go ahead and lower the deopt state. llvm-svn: 264078
* [WebAssembly] Implement the rotate instructions.Dan Gohman2016-03-222-1/+9
| | | | llvm-svn: 264076
* Add a hasOperandBundlesOtherThan helper, and use it; NFCSanjoy Das2016-03-221-12/+6
| | | | llvm-svn: 264072
* [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardwareSimon Pilgrim2016-03-223-1/+105
| | | | | | | | | | | | | | Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062
* [mips] Make simm6 consistent with the rest. NFC.Daniel Sanders2016-03-221-5/+1
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18147 llvm-svn: 264057
* [mips] Range check simm7.Daniel Sanders2016-03-225-21/+53
| | | | | | | | | | | | | | Summary: Also renamed li_simm7 to li16_imm since it's not a simm7 and has an unusual encoding (it's a uimm7 except that 0x7f represents -1). Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18145 llvm-svn: 264056
* [mips] Range check simm5.Daniel Sanders2016-03-223-3/+7
| | | | | | | | | | | | | | | Summary: We can't check the error message for this one because there's another lw/sw available that covers a larger range. We therefore check the transition between the two sizes. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18144 llvm-svn: 264054
* [mips] Range check vsplat_uimm[1234568].Daniel Sanders2016-03-222-48/+53
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18143 llvm-svn: 264053
* [mips] Range check uimm4_ptr, remove uimm6_ptr, and use correctly sized ↵Daniel Sanders2016-03-222-42/+52
| | | | | | | | | | | | immediates in MSA copy/insert. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18142 llvm-svn: 264052
* [PATCH] Force LoopReroll to reset the loop trip count value after reroll.Zinovy Nis2016-03-221-5/+8
| | | | | | | | | | | It's a bug fix. For rerolled loops SE trip count remains unchanged. It leads to incorrect work of the next passes. My patch just resets SE info for rerolled loop forcing SE to re-evaluate it next time it requested. I also added a verifier call in the exisitng test to be sure no invalid SE data remain. Without my fix this test would fail with -verify-scev. Differential Revision: http://reviews.llvm.org/D18316 llvm-svn: 264051
* [ELF][gcc compatibility]: support section names with special characters ↵Marina Yatsina2016-03-221-8/+9
| | | | | | | | | | | | (e.g. "/") Adding support for section names with special characters in them (e.g. "/"). GCC successfully compiles such section names. This also fixes PR24520. Differential Revision: http://reviews.llvm.org/D15678 llvm-svn: 264038
* Rename DenseMap::resize() into DenseMap::reserve() (NFC)Mehdi Amini2016-03-221-1/+1
| | | | | | | This is more coherent with usual containers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264026
* Minor code cleanup. NFC.Junmo Park2016-03-221-1/+1
| | | | llvm-svn: 264024
* Add "first class" lowering for deopt operand bundlesSanjoy Das2016-03-224-24/+99
| | | | | | | | | | | | | | | | | Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry0. This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015
* [sancov] do not instrument nodes that are full pre-dominatorsMike Aizatsky2016-03-211-10/+17
| | | | | | | | | | | | | Summary: Without tree pruning clang has 2,667,552 points. Wiht only dominators pruning: 1,515,586. With both dominators & predominators pruning: 1,340,534. Resubmit of r262103. Differential Revision: http://reviews.llvm.org/D18341 llvm-svn: 264003
* AMDGPU: Fix dangling references introduced by r263982Nicolai Haehnle2016-03-211-3/+5
| | | | | | | Fixes Valgrind errors on the test cases that were reported as failing by buildbots. llvm-svn: 264000
* [InstCombine] Ensure all undef operands are handled before binary ↵Simon Pilgrim2016-03-211-2/+16
| | | | | | | | | | instruction constant folding As noted in PR18355, this patch makes it clear that all cases with undef operands have been handled before further constant folding is attempted. Differential Revision: http://reviews.llvm.org/D18305 llvm-svn: 263994
* Fix -Wdocumentation warnings from r263853Duncan P. N. Exon Smith2016-03-211-4/+7
| | | | | | Thanks to chapuni for catching this. llvm-svn: 263993
* [MemorySSA] Consider def-only BBs for live-in calculations.George Burgess IV2016-03-211-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | If we have a BB with only MemoryDefs, live-in calculations will ignore it. This means we get results like this: define void @foo(i8* %p) { ; 1 = MemoryDef(liveOnEntry) store i8 0, i8* %p br i1 undef, label %if.then, label %if.end if.then: ; 2 = MemoryDef(1) store i8 1, i8* %p br label %if.end if.end: ; 3 = MemoryDef(1) store i8 2, i8* %p ret void } ...When there should be a MemoryPhi in the `if.end` BB. This patch fixes that behavior. llvm-svn: 263991
* AMDGPU: Coding style fixesNicolai Haehnle2016-03-211-4/+2
| | | | | | | I meant to add these before committing r263982 as per the review, but I forgot to squash. llvm-svn: 263983
* AMDGPU: Add SIWholeQuadMode passNicolai Haehnle2016-03-219-15/+515
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 llvm-svn: 263982
* [Hexagon] Add handling fixups and instruction relaxationKrzysztof Parzyszek2016-03-211-112/+451
| | | | llvm-svn: 263981
* [Hexagon] Properly encode registers in duplex instructionsKrzysztof Parzyszek2016-03-213-6/+126
| | | | llvm-svn: 263980
* [Hexagon] Fix reserving emergency spill slots for register scavengerKrzysztof Parzyszek2016-03-213-35/+11
| | | | | | | - R10 and R11 are not reserved registers. - Check for reserved registers when finding unused caller-saved registers. llvm-svn: 263977
* [WebAssembly] Implement the eqz instructions.Dan Gohman2016-03-211-0/+7
| | | | llvm-svn: 263976
* [SLP] Remove unnecessary member variables by using container APIs.Chad Rosier2016-03-211-13/+6
| | | | | | | This changes the debug output, but still retains its usefulness. Differential Revision: http://reviews.llvm.org/D18324 llvm-svn: 263975
* AMDGPU/SI: Fix threshold calculation for branching when exec is zeroTom Stellard2016-03-211-3/+5
| | | | | | | | | | | | | | | | | | | Summary: When control flow is implemented using the exec mask, the compiler will insert branch instructions to skip over the masked section when exec is zero if the section contains more than a certain number of instructions. The previous code would only count instructions in successor blocks, and this patch modifies the code to start counting instructions in all blocks between the start and end of the branch. Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18282 llvm-svn: 263969
* [AArch64] Add a helpful assert. NFC.Chad Rosier2016-03-211-0/+1
| | | | llvm-svn: 263965
* AMDGPU: Remove SignBitIsZero for mubuf scratch offsetsMatt Arsenault2016-03-211-1/+1
| | | | | | | These instructions do not have the same negative base address problem that DS instructions do on SI. llvm-svn: 263964
* ARM: Better codegen for 64-bit compares.Peter Collingbourne2016-03-212-0/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares. Before: push {r7, lr} cmp r0, r2 mov.w r0, #0 mov.w r12, #0 it hs movhs r0, #1 cmp r1, r3 it ge movge.w r12, #1 it eq moveq r12, r0 cmp.w r12, #0 bne .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} After: push {r7, lr} subs r0, r0, r2 sbcs.w r0, r1, r3 bge .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} Saves around 80KB in Chromium's libchrome.so. Some notes on this patch: - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)). - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1. Differential Revision: http://reviews.llvm.org/D15256 llvm-svn: 263962
* [ARM] Add Cortex-A32 supportRenato Golin2016-03-212-2/+10
| | | | | | | | Adding Cortex-A32 as an available target in the ARM backend. Patch by Sam Parker. llvm-svn: 263956
* APFloat: Add frexpMatt Arsenault2016-03-211-1/+25
| | | | llvm-svn: 263950
* AMDGPU: Add frexp_mant intrinsicMatt Arsenault2016-03-211-2/+2
| | | | llvm-svn: 263948
* Implement constant folding for bitreverseMatt Arsenault2016-03-212-0/+33
| | | | llvm-svn: 263945
* [AArch64] Fix a -Wdocumentation warning. NFC.Chad Rosier2016-03-211-2/+2
| | | | llvm-svn: 263942
* [IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSASilviu Baranga2016-03-211-0/+1
| | | | | | | | | | | | | | | | | | | Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941
* [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext ↵Silviu Baranga2016-03-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935
* [NVPTX] Adds a new address space inference pass.Jingyue Wu2016-03-205-9/+609
| | | | | | | | | | | | | | | | | | | Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916
* [X86][SSE] Tidyup setTargetShuffleZeroElements to match ↵Simon Pilgrim2016-03-201-4/+4
| | | | | | | | computeZeroableShuffleElements Based on feedback for D14261 llvm-svn: 263911
* [X86][SSE] Detect zeroable shuffle elements from different value typesSimon Pilgrim2016-03-201-8/+42
| | | | | | | | Improve computeZeroableShuffleElements to be able to peek through bitcasts to extract zero/undef values from BUILD_VECTOR nodes of different element sizes to the shuffle mask. Differential Revision: http://reviews.llvm.org/D14261 llvm-svn: 263906
* AVX512BW: Enable v32i1/v64i1 BUILD_VECTORIgor Breger2016-03-201-0/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D18211 llvm-svn: 263898
* Suppress a -Wunused-variable warning in release builds.Craig Topper2016-03-201-0/+1
| | | | llvm-svn: 263892
* Use a range-based for loop. NFC.Michael Kuperstein2016-03-201-4/+4
| | | | llvm-svn: 263889
* Expose IRBuilder::CreateAtomicCmpXchg as LLVMBuildAtomicCmpXchg in the C API.Mehdi Amini2016-03-191-0/+55
| | | | | | | | | | | | | Summary: Also expose getters and setters in the C API, so that the change can be tested. Reviewers: nhaehnle, axw, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18260 From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> llvm-svn: 263886
* CodeGen: use range based for loopSaleem Abdulrasool2016-03-191-5/+4
| | | | | | Convert a loop to use a range based style loop. NFC. llvm-svn: 263884
* [SimplifyLibCalls] Only consider sinpi/cospi functions within the same functionDavid Majnemer2016-03-191-7/+11
| | | | | | | | | | The sinpi/cospi can be replaced with sincospi to remove unnecessary computations. However, we need to make sure that the calls are within the same function! This fixes PR26993. llvm-svn: 263875
* [InstCombine] Don't insert instructions before a catch switchDavid Majnemer2016-03-191-0/+3
| | | | | | | | | CatchSwitches are not splittable, we cannot insert casts, etc. before them. This fixes PR26992. llvm-svn: 263874
* Add a comment on partial hashing of MetadataMehdi Amini2016-03-191-0/+12
| | | | | | | Following r263866, on D. Blaikie suggestion. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263869
* Hash Metadata using pointer for MDString argument instead of value (NFC)Mehdi Amini2016-03-192-140/+127
| | | | | | | | | | | MDString are uniqued in the Context on creation, hashing the pointer is less expensive than hashing the String itself. Reviewers: dexonsmith Differential Revision: http://reviews.llvm.org/D16560 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263867
* Compute some Debug Info Metadata hash key partially (NFC)Mehdi Amini2016-03-191-9/+4
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch changes the computation of the hash key for DISubprogram to be computed on a small subset of the fields. The hash is computed a lot faster, but there might be more collision in the table. However by carefully selecting the fields, colisions should be rare. Using `opt` to load the IR for FastISelEmitter.cpp.o, with this patch: - DISubprogram::getImpl() goes from 28ms to 15ms. - DICompositeType::getImpl() goes from 6ms to 2ms - DIDerivedType::getImpl() goes from 18 to 12ms Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16571 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263866
OpenPOWER on IntegriCloud