path: root/llvm
Commit message | Author | Age | Files | Lines
* [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware | Simon Pilgrim | 2016-03-22 | 6 | -77/+210
| | | | | | | | | | | | | | Improve extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG, but we can improve this further by using the legalizer instead of prematurely splitting into legal vectors in the combine, as that only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask patterns (fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062
* [unittests] clang-format a line, NFC | Vedant Kumar | 2016-03-22 | 1 | -3/+1
| | | | llvm-svn: 264059
* [mips] Make simm6 consistent with the rest. NFC. | Daniel Sanders | 2016-03-22 | 1 | -5/+1
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18147 llvm-svn: 264057
* [mips] Range check simm7. | Daniel Sanders | 2016-03-22 | 10 | -23/+63
| | | | | | | | | | | | | | Summary: Also renamed li_simm7 to li16_imm since it's not a simm7 and has an unusual encoding (it's a uimm7 except that 0x7f represents -1). Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18145 llvm-svn: 264056
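The unusual li16 encoding described above (a uimm7 except that 0x7f represents -1) can be illustrated with a minimal sketch. The function names `decode_li16_imm` and `encode_li16_imm` are hypothetical and are not taken from the MIPS backend; the sketch only models the value mapping stated in the commit message.

```python
def decode_li16_imm(field: int) -> int:
    """Map a 7-bit encoded field to its value: unsigned, except 0x7f means -1."""
    if not 0 <= field <= 0x7F:
        raise ValueError("field must fit in 7 bits")
    return -1 if field == 0x7F else field


def encode_li16_imm(value: int) -> int:
    """Inverse mapping; the representable range is therefore -1..126."""
    if value == -1:
        return 0x7F
    if 0 <= value <= 0x7E:
        return value
    raise ValueError("value out of range for li16 immediate")
```

Under this model a plain uimm7 range check would accept 127 and reject -1, which is presumably why the operand needed its own name.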
* [mips] Range check simm5. | Daniel Sanders | 2016-03-22 | 4 | -3/+10
| | | | | | | | | | | | | | | Summary: We can't check the error message for this one because there's another lw/sw available that covers a larger range. We therefore check the transition between the two sizes. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18144 llvm-svn: 264054
* [mips] Range check vsplat_uimm[1234568]. | Daniel Sanders | 2016-03-22 | 3 | -72/+211
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18143 llvm-svn: 264053
* [mips] Range check uimm4_ptr, remove uimm6_ptr, and use correctly sized immediates in MSA copy/insert. | Daniel Sanders | 2016-03-22 | 3 | -42/+74
| | | | | | | | | | | | Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18142 llvm-svn: 264052
* [PATCH] Force LoopReroll to reset the loop trip count value after reroll. | Zinovy Nis | 2016-03-22 | 2 | -6/+9
| | | | | | | | | | | This is a bug fix. For rerolled loops, the ScalarEvolution (SE) trip count remains unchanged, which leads to incorrect behavior in later passes. The patch resets the SE info for the rerolled loop, forcing SE to re-evaluate it the next time it is requested. I also added a verifier call to the existing test to be sure no invalid SE data remains. Without the fix, this test would fail with -verify-scev. Differential Revision: http://reviews.llvm.org/D18316 llvm-svn: 264051
* [ELF][gcc compatibility]: support section names with special characters (e.g. "/") | Marina Yatsina | 2016-03-22 | 2 | -8/+13
| | | | | | | | | | | | Adding support for section names with special characters in them (e.g. "/"). GCC successfully compiles such section names. This also fixes PR24520. Differential Revision: http://reviews.llvm.org/D15678 llvm-svn: 264038
* Fix unittests: resize() -> reserve() | Mehdi Amini | 2016-03-22 | 1 | -1/+1
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264029
* Rename DenseMap::resize() into DenseMap::reserve() (NFC) | Mehdi Amini | 2016-03-22 | 3 | -3/+3
| | | | | | | This is more coherent with usual containers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264026
* Minor code cleanup. NFC. | Junmo Park | 2016-03-22 | 1 | -1/+1
| | | | llvm-svn: 264024
* Appease the windows buildbots | Sanjoy Das | 2016-03-22 | 1 | -30/+31
| | | | | | | The guess is that the stdout/stderr ordering may differ between windows / unix. llvm-svn: 264019
* Add "first class" lowering for deopt operand bundles | Sanjoy Das | 2016-03-22 | 6 | -24/+204
| | | | | | | | | | | | | | | | | Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or a sequence of nops, with an associated __llvm_stackmaps entry). This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015
* [sancov] do not instrument nodes that are full pre-dominators | Mike Aizatsky | 2016-03-21 | 1 | -10/+17
| | | | | | | | | | | | | Summary: Without tree pruning, clang has 2,667,552 points. With only dominators pruning: 1,515,586. With both dominators & predominators pruning: 1,340,534. Resubmit of r262103. Differential Revision: http://reviews.llvm.org/D18341 llvm-svn: 264003
* [CUDA] Add documentation explaining how to detect clang vs nvcc. | Justin Lebar | 2016-03-21 | 1 | -0/+28
| | | | llvm-svn: 264002
* AMDGPU: Fix dangling references introduced by r263982 | Nicolai Haehnle | 2016-03-21 | 1 | -3/+5
| | | | | | | Fixes Valgrind errors on the test cases that were reported as failing by buildbots. llvm-svn: 264000
* [InstCombine] Ensure all undef operands are handled before binary instruction constant folding | Simon Pilgrim | 2016-03-21 | 1 | -2/+16
| | | | | | | | | | As noted in PR18355, this patch makes it clear that all cases with undef operands have been handled before further constant folding is attempted. Differential Revision: http://reviews.llvm.org/D18305 llvm-svn: 263994
* Fix -Wdocumentation warnings from r263853 | Duncan P. N. Exon Smith | 2016-03-21 | 1 | -4/+7
| | | | | | Thanks to chapuni for catching this. llvm-svn: 263993
* [MemorySSA] Consider def-only BBs for live-in calculations. | George Burgess IV | 2016-03-21 | 2 | -6/+25
    If we have a BB with only MemoryDefs, live-in calculations will ignore it. This means we get results like this:

        define void @foo(i8* %p) {
          ; 1 = MemoryDef(liveOnEntry)
          store i8 0, i8* %p
          br i1 undef, label %if.then, label %if.end

        if.then:
          ; 2 = MemoryDef(1)
          store i8 1, i8* %p
          br label %if.end

        if.end:
          ; 3 = MemoryDef(1)
          store i8 2, i8* %p
          ret void
        }

    ...When there should be a MemoryPhi in the `if.end` BB. This patch fixes that behavior.

    llvm-svn: 263991
* Remove leftover options from multiline.ll | Krzysztof Parzyszek | 2016-03-21 | 1 | -2/+2
| | | | | | | I added -march=hexagon to force using Hexagon target when testing locally, and I forgot to take it out. llvm-svn: 263990
* Add a testcase that would have found the bug in r263971. | Rafael Espindola | 2016-03-21 | 2 | -0/+12
| | | | llvm-svn: 263988
* Revert "[llvm-objdump] Printing relocations in executable and shared object files. This partially reverts r215844 by removing test objdump-reloc-shared.test which stated GNU objdump doesn't print relocations, it does." | Rafael Espindola | 2016-03-21 | 5 | -23/+20
| | | | | | | This reverts commit r263971. It produces the wrong results for .rela.dyn. I will add a test. llvm-svn: 263987
* Unxfail test/DebugInfo/Generic/multiline.ll on Hexagon | Krzysztof Parzyszek | 2016-03-21 | 1 | -3/+2
| | | | llvm-svn: 263986
* AMDGPU: Coding style fixes | Nicolai Haehnle | 2016-03-21 | 1 | -4/+2
| | | | | | | I meant to add these before committing r263982 as per the review, but I forgot to squash. llvm-svn: 263983
* AMDGPU: Add SIWholeQuadMode pass | Nicolai Haehnle | 2016-03-21 | 10 | -15/+863
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 llvm-svn: 263982
* [Hexagon] Add handling fixups and instruction relaxation | Krzysztof Parzyszek | 2016-03-21 | 2 | -112/+476
| | | | llvm-svn: 263981
* [Hexagon] Properly encode registers in duplex instructions | Krzysztof Parzyszek | 2016-03-21 | 4 | -6/+136
| | | | llvm-svn: 263980
* [Hexagon] Fix reserving emergency spill slots for register scavenger | Krzysztof Parzyszek | 2016-03-21 | 3 | -35/+11
    - R10 and R11 are not reserved registers.
    - Check for reserved registers when finding unused caller-saved registers.

    llvm-svn: 263977
* [WebAssembly] Implement the eqz instructions. | Dan Gohman | 2016-03-21 | 3 | -0/+29
| | | | llvm-svn: 263976
* [SLP] Remove unnecessary member variables by using container APIs. | Chad Rosier | 2016-03-21 | 1 | -13/+6
| | | | | | | This changes the debug output, but still retains its usefulness. Differential Revision: http://reviews.llvm.org/D18324 llvm-svn: 263975
* [llvm-objdump] Printing relocations in executable and shared object files. This partially reverts r215844 by removing test objdump-reloc-shared.test which stated GNU objdump doesn't print relocations, it does. | Colin LeMahieu | 2016-03-21 | 5 | -20/+23
| | | | | | | | | | In executable and shared object ELF files, relocations in the file contain the final virtual address rather than section offset so this is adjusted to display section offset. Differential revision: http://reviews.llvm.org/D15965 llvm-svn: 263971
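The adjustment described in this commit (turning a relocation's virtual address into a section offset) amounts to one subtraction. The sketch below is an assumption-laden illustration, not the llvm-objdump code: `reloc_section_offset` and its parameters are hypothetical names, and the bounds check is added only for the example.

```python
def reloc_section_offset(r_offset: int, section_addr: int, section_size: int) -> int:
    """Convert a relocation's virtual address into an offset within its section.

    In executables and shared objects r_offset holds a virtual address, so the
    section's load address is subtracted to get a section-relative offset.
    """
    offset = r_offset - section_addr
    if not 0 <= offset < section_size:
        raise ValueError("relocation address is outside the given section")
    return offset


# A relocation at virtual address 0x400010 in a section loaded at 0x400000.
print(hex(reloc_section_offset(0x400010, 0x400000, 0x100)))
```

The sketch assumes the relocated section is already known. It does not model the `.rela.dyn` case that a later commit in this log cites as the reason for reverting r263971.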
* AMDGPU/SI: Fix threshold calculation for branching when exec is zero | Tom Stellard | 2016-03-21 | 2 | -3/+39
| | | | | | | | | | | | | | | | | | | Summary: When control flow is implemented using the exec mask, the compiler will insert branch instructions to skip over the masked section when exec is zero if the section contains more than a certain number of instructions. The previous code would only count instructions in successor blocks, and this patch modifies the code to start counting instructions in all blocks between the start and end of the branch. Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18282 llvm-svn: 263969
* [AArch64] Add a helpful assert. NFC. | Chad Rosier | 2016-03-21 | 1 | -0/+1
| | | | llvm-svn: 263965
* AMDGPU: Remove SignBitIsZero for mubuf scratch offsets | Matt Arsenault | 2016-03-21 | 3 | -12/+9
| | | | | | | These instructions do not have the same negative base address problem that DS instructions do on SI. llvm-svn: 263964
* ARM: Better codegen for 64-bit compares. | Peter Collingbourne | 2016-03-21 | 5 | -144/+238
    This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares.

    Before:

                push    {r7, lr}
                cmp     r0, r2
                mov.w   r0, #0
                mov.w   r12, #0
                it      hs
                movhs   r0, #1
                cmp     r1, r3
                it      ge
                movge.w r12, #1
                it      eq
                moveq   r12, r0
                cmp.w   r12, #0
                bne     .LBB1_2
        @ BB#1: @ %bb1
                bl      f
                pop     {r7, pc}
        .LBB1_2: @ %bb2
                bl      g
                pop     {r7, pc}

    After:

                push    {r7, lr}
                subs    r0, r0, r2
                sbcs.w  r0, r1, r3
                bge     .LBB1_2
        @ BB#1: @ %bb1
                bl      f
                pop     {r7, pc}
        .LBB1_2: @ %bb2
                bl      g
                pop     {r7, pc}

    Saves around 80KB in Chromium's libchrome.so.

    Some notes on this patch:

    - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)).
    - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1.

    Differential Revision: http://reviews.llvm.org/D15256

    llvm-svn: 263962
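To see why the shorter subs/sbcs sequence in this commit is enough for a signed 64-bit "greater or equal" test, the flag arithmetic can be modelled directly. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the operands are assumed to be split into 32-bit halves in the way the register usage (r0/r1 versus r2/r3) suggests, and the model covers only the N, C and V flags needed for the `ge` condition.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sub_with_carry(a: int, b: int, carry_in: int):
    """Model ARM SUBS/SBCS on 32-bit values as a + ~b + carry_in.

    Returns (result, N, C, V). For a plain SUBS pass carry_in=1.
    """
    not_b = ~b & MASK32
    wide = a + not_b + carry_in
    result = wide & MASK32
    n = result >> 31
    c = 1 if wide > MASK32 else 0
    # Signed overflow: both addends share a sign that the result does not have.
    v = (((a ^ result) & (not_b ^ result)) >> 31) & 1
    return result, n, c, v


def signed_ge_64(a: int, b: int) -> bool:
    """Evaluate a >= b for signed 64-bit values using subs + sbcs + 'ge'."""
    ua, ub = a & MASK64, b & MASK64
    a_lo, a_hi = ua & MASK32, ua >> 32
    b_lo, b_hi = ub & MASK32, ub >> 32
    _, _, carry, _ = sub_with_carry(a_lo, b_lo, 1)    # subs: low halves
    _, n, _, v = sub_with_carry(a_hi, b_hi, carry)    # sbcs: high halves
    return n == v                                     # 'ge' condition: N == V
```

The low-half subtraction only contributes its carry (borrow) to the high-half subtraction, so N and V after the second instruction describe the full 64-bit difference. The Z flag does not, which is why this model only covers the `ge`/`lt` style of condition.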
* [ARM] Add Cortex-A32 support | Renato Golin | 2016-03-21 | 4 | -2/+46
| | | | | | | | Adding Cortex-A32 as an available target in the ARM backend. Patch by Sam Parker. llvm-svn: 263956
* [llvm-readobj] Impl GNU style symbols printing | Hemant Kulkarni | 2016-03-21 | 3 | -40/+220
| | | | | | | | Implements "readelf -sW and readelf -DsW" Differential Revision: http://reviews.llvm.org/D18224 llvm-svn: 263952
* [Orc] Switch RPC Procedure to take a function type, rather than an arg list. | Lang Hames | 2016-03-21 | 3 | -64/+58
| | | | | | No functional change, just a little more readable. llvm-svn: 263951
* APFloat: Add frexp | Matt Arsenault | 2016-03-21 | 3 | -2/+161
| | | | llvm-svn: 263950
* AMDGPU: Add frexp_mant intrinsic | Matt Arsenault | 2016-03-21 | 3 | -2/+70
| | | | llvm-svn: 263948
* Implement constant folding for bitreverse | Matt Arsenault | 2016-03-21 | 5 | -2/+166
| | | | llvm-svn: 263945
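Constant folding a bit reversal means computing the reversed value at compile time when the operand is a known constant. The sketch below shows that computation for a fixed bit width. It is an illustration only: `bitreverse` here is a hypothetical Python helper, not the LLVM implementation.

```python
def bitreverse(value: int, width: int) -> int:
    """Reverse the low `width` bits of `value` (bit 0 swaps with bit width-1)."""
    if width <= 0:
        raise ValueError("width must be positive")
    value &= (1 << width) - 1
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)  # move the lowest bit to the top end
        value >>= 1
    return result


print(hex(bitreverse(0x01, 8)))         # 0x80
print(hex(bitreverse(0x12345678, 32)))  # 0x1e6a2c48
```

Reversing twice gives back the original value, which is a convenient sanity check for any folding implementation.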
* [AArch64] Fix a -Wdocumentation warning. NFC. | Chad Rosier | 2016-03-21 | 1 | -2/+2
| | | | llvm-svn: 263942
* [IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA | Silviu Baranga | 2016-03-21 | 2 | -0/+61
| | | | | | | | | | | | | | | | | | | Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941
* [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes | Silviu Baranga | 2016-03-21 | 2 | -1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to keep some of the bits in the result at 0, resulting in a miscompile. This change fixes the issue by limiting the transformation to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935
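The miscompile described here comes down to which upper bits are guaranteed to be zero after the extract. The toy model below is not DAGCombine code: it represents an i8 element widened to 32 bits, uses a hypothetical `garbage` argument to stand in for the unspecified upper bits that an any_ext may leave behind, and checks whether dropping a mask of 0xFF would change the result.

```python
def extract_elt(byte: int, extend: str, garbage: int = 0) -> int:
    """Widen an 8-bit element to 32 bits under the given extension kind."""
    low = byte & 0xFF
    if extend == "zero_ext":
        return low                                   # upper bits are known zero
    if extend == "any_ext":
        return ((garbage << 8) | low) & 0xFFFFFFFF   # upper bits are unspecified
    raise ValueError("unknown extension kind")


def and_is_removable(byte: int, extend: str, garbage: int = 0) -> bool:
    """True if (x & 0xFF) == x, i.e. the AND has no effect on this value."""
    x = extract_elt(byte, extend, garbage)
    return (x & 0xFF) == x


print(and_is_removable(0xAB, "zero_ext"))                 # True
print(and_is_removable(0xAB, "any_ext", garbage=0x1234))  # False
```

In this model the AND can only be dropped when the zero upper bits are guaranteed, which mirrors the restriction the commit adds for the implicit-extend case.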
* Fixed -mcpu flag | Elena Demikhovsky | 2016-03-21 | 1 | -1/+1
| | | | | | "core-avx" does not exist; I changed to "nehalem" llvm-svn: 263932
* [X86][SSE] Add vector integer division by constant tests | Simon Pilgrim | 2016-03-20 | 5 | -1241/+4954
| | | | | | Expanded tests and split into sdiv/srem and udiv/urem cases for 128 and 256 bit vectors. llvm-svn: 263917
* [NVPTX] Adds a new address space inference pass. | Jingyue Wu | 2016-03-20 | 6 | -19/+678
| | | | | | | | | | | | | | | | | | | Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916
* [gold] Emit a diagnostic in case we fail to remove a file. | Davide Italiano | 2016-03-20 | 1 | -2/+6
| | | | llvm-svn: 263914
* [X86][SSE] Tidyup setTargetShuffleZeroElements to match computeZeroableShuffleElements | Simon Pilgrim | 2016-03-20 | 1 | -4/+4
| | | | | | Based on feedback for D14261 llvm-svn: 263911