summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* ARM: maintain BB ordering when expanding WIN__DBZCHKSaleem Abdulrasool2016-03-251-1/+1
| | | | | | | | | | | | | It is possible to have a fallthrough MBB prior to MBB placement. The original addition of the BB would result in reordering the BB as not preceding the successor. Because of the fallthrough nature of the BB, we could end up executing incorrect code or even a constant pool island! Insert the spliced BB into the same location to avoid that. Thanks to Tim Northover for invaluable hints and Fiora for the discussion on what may have been occurring! llvm-svn: 264454
* AMDGPU: Fix a use-after free and a missing breakJustin Bogner2016-03-251-1/+2
| | | | | | | | | | | | | | | | We're erasing MI here, but then immediately using it again inside the `if`. This moves the erase after we're done using it. Doing that reveals a second problem though - this case is missing a break, so we fall through to the default and dereference MI again. This is obviously a bug, though I don't know how to write a test that triggers it - all we do in the error case is print some extra debug output. Both of these issue crash on lots of tests under ASAN with the recycling allocator changes from PR26808 applied. llvm-svn: 264442
* [X86] Use "and $0" and "orl $-1" to store 0 and -1 when optimizing for minsizeHans Wennborg2016-03-251-0/+12
| | | | | | | | | | | | 64-bit, 32-bit and 16-bit move-immediate instructions are 7, 6, and 5 bytes, respectively, whereas and/or with 8-bit immediate is only three bytes. Since these instructions imply an additional memory read (which the CPU could elide, but we don't think it does), restrict these patterns to minsize functions. Differential Revision: http://reviews.llvm.org/D18374 llvm-svn: 264440
* [SystemZ] Remove isBranch and isTerminator flags on BRCT and BRCTG.Jonas Paulsson2016-03-251-1/+1
| | | | | | | | The BranchUnaryRI instruction class already sets these flags. Reviewed by Ulrich Weigand. llvm-svn: 264411
* [AArch64] Fix typo. NFC.Chad Rosier2016-03-251-2/+1
| | | | llvm-svn: 264408
* [X86][SSE] Don't duplicate Lower256IntArith functionality in LowerShift. NFC.Simon Pilgrim2016-03-251-20/+2
| | | | | | LowerShift was using the same code as Lower256IntArith to split 256-bit vectors into 2 x 128-bit vectors, so now we just call Lower256IntArith. llvm-svn: 264403
* fixed typoElena Demikhovsky2016-03-251-1/+1
| | | | llvm-svn: 264395
* AMDGPU: Cost model for basic integer operationsMatt Arsenault2016-03-251-0/+31
| | | | | | | This resolves bug 21148 by preventing promotion to i64 induction variables. llvm-svn: 264376
* X86: Use push-pop for materializing 8-bit immediates for minsize (take 2)Hans Wennborg2016-03-257-4/+93
| | | | | | | | | This is the same as r255936, with added logic for avoiding clobbering of the red zone (PR26023). Differential Revision: http://reviews.llvm.org/D18246 llvm-svn: 264375
* AMDGPU: Partially implement getArithmeticInstrCost for FP opsMatt Arsenault2016-03-252-1/+94
| | | | llvm-svn: 264374
* IR: Reserve an MDKind for !llvm.loop; NFCDuncan P. N. Exon Smith2016-03-251-1/+2
| | | | | | | | | | This reserves an MDKind for !llvm.loop, which allows callers to avoid a string-based lookup. I'm not sure why it was missing. There should be no functionality change here, just a small compile-time speedup. llvm-svn: 264371
* ARM: fix optimised division on WoASaleem Abdulrasool2016-03-251-0/+1
| | | | | | | | | We did not have an explicit branch to the continuation BB. When the check was hoisted, this could permit control follow to fall through into the division trap. Add the explicit branch to the continuation basic block to ensure that code execution is correct. llvm-svn: 264370
* AMDGPU: TTI: Make insertelement free.Matt Arsenault2016-03-251-0/+5
| | | | | | We don't want to have a cost to scalarizing operations. llvm-svn: 264364
* Finish the incomplete 'd' inline asm constraint support for PPC byEric Christopher2016-03-241-0/+5
| | | | | | making sure we give it a register and mark it as a register constraint. llvm-svn: 264340
* [Hexagon] Be sure to treat subregisters of a CSR as CSRs as wellKrzysztof Parzyszek2016-03-241-5/+8
| | | | llvm-svn: 264331
* [Hexagon] Add support for run-time stack overflow checkingKrzysztof Parzyszek2016-03-244-13/+64
| | | | | | Patch by Sundeep Kushwaha. llvm-svn: 264328
* [Hexagon] Generate PIC-specific versions of save/restore routinesKrzysztof Parzyszek2016-03-243-9/+43
| | | | | | | | | | | | In PIC mode, the registers R14, R15 and R28 are reserved for use by the PLT handling code. This causes all functions to clobber these registers. While this is not new for regular function calls, it does also apply to save/restore functions, which do not follow the standard ABI conventions with respect to the volatile/non-volatile registers. Patch by Jyotsna Verma. llvm-svn: 264324
* [MC][mips] Add MipsMCInstrAnalysis class and register it as MC instruction ↵Simon Atanasyan2016-03-241-0/+36
| | | | | | | | | | | | | analyzer The `MipsMCInstrAnalysis` class overrides the `evaluateBranch` method and calculates target addresses for branch and calls instructions. That allows llvm-objdump to print functions' names in branch instructions in the disassemble mode. Differential Revision: http://reviews.llvm.org/D18209 llvm-svn: 264309
* [X86][XOP] Fixed instruction postfixes to more closely match operandsSimon Pilgrim2016-03-242-91/+91
| | | | | | Suggested by Sanjay in D18189 as the multiple folding options in XOP instructions can be tricky llvm-svn: 264305
* AVX-512: Generate KTEST instead of TEST fir i1 vectorsElena Demikhovsky2016-03-241-5/+27
| | | | | | | | | | | | KTEST instruction may be used instead of TEST in this case: %int_sel3 = bitcast <8 x i1> %sel3 to i8 %res = icmp eq i8 %int_sel3, zeroinitializer br i1 %res, label %L2, label %L1 Differential Revision: http://reviews.llvm.org/D18444 llvm-svn: 264298
* CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS.Tim Northover2016-03-241-0/+4
| | | | | | | | | | | | | If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296
* AMDGPU/SI: Add Polaris supportTom Stellard2016-03-241-0/+8
| | | | | | Patch By: Sonny Jiang llvm-svn: 264295
* [X86][XOP] Merged 128/256 bit 4op instruction definitions. NFCI.Simon Pilgrim2016-03-241-15/+14
| | | | llvm-svn: 264294
* [mips] Range check vsplat_simm5 and vsplat_simm10Daniel Sanders2016-03-242-5/+6
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18177 llvm-svn: 264287
* [PowerPC] Disable direct moves for extractelement and bitcast in 32-bit modeNemanja Ivanovic2016-03-241-2/+2
| | | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D17711 It disables direct moves on these operations in 32-bit mode since the patterns assume 64-bit registers. The final patch is slightly different from the Phabricator review as the bitcast operations needed to be disabled in 32-bit mode as well. This fixes PR26617. llvm-svn: 264282
* [mips] Range check simm10Daniel Sanders2016-03-243-4/+13
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18148 llvm-svn: 264279
* [X86][XOP] Support for VPPERM byte shuffle instructionSimon Pilgrim2016-03-245-3/+49
| | | | | | | | This patch begins adding support for lowering to the XOP VPPERM instruction - adding the X86ISD::VPPERM opcode. Differential Revision: http://reviews.llvm.org/D18189 llvm-svn: 264260
* [mips] Tidy up cnMIPS tablegen definitions. NFC.Daniel Sanders2016-03-242-51/+58
| | | | | | | | | | | | | | | | | | | Summary: In particular, make the cnMIPS predicates much more obvious and prefer def ... : ... { let Foo = bar; } over: let Foo = bar in def ... : ...; Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18354 llvm-svn: 264258
* Fix sequence point warning. NFC.Vasileios Kalintiris2016-03-241-1/+1
| | | | llvm-svn: 264255
* [mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, ↵Zlatko Buljan2016-03-249-42/+89
| | | | | | | | DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D17137 llvm-svn: 264248
* [mips][microMIPS] Implement MTC*, MTHC* and DMTC* instructionsHrvoje Varga2016-03-249-11/+192
| | | | | | Differential Revision: http://reviews.llvm.org/D17328 llvm-svn: 264246
* [mips][microMIPS] Fix for "Cannot copy registers" assertionHrvoje Varga2016-03-243-9/+17
| | | | | | Differential Revision: http://reviews.llvm.org/D17068 llvm-svn: 264245
* [PS4] Guarantee an instruction after a 'noreturn' call.Paul Robinson2016-03-241-1/+3
| | | | | | | | | | We need the "return address" of a noreturn call to be within the bounds of the calling function; TrapUnreachable turns 'unreachable' into a 'ud2' instruction, which has that desired effect. Differential Revision: http://reviews.llvm.org/D18414 llvm-svn: 264224
* AMDGPU: Remove atomic inc/dec patternsMatt Arsenault2016-03-231-23/+0
| | | | | | | There is no benefit to these since materializing the constant 1 requires the same number of instructions as materializing uint_max llvm-svn: 264215
* AMDGPU: Promote alloca should skip volatilesMatt Arsenault2016-03-231-0/+13
| | | | llvm-svn: 264214
* AMDGPU: Insert moves of frame index to value operandsMatt Arsenault2016-03-231-0/+56
| | | | | | | | | | | | | | | | | | | | | | | Strengthen tests of storing frame indices. Right now this just creates irrelevant scheduling changes. We don't want to have multiple frame index operands on an instruction. There seem to be various assumptions that at least the same frame index will not appear twice in the LocalStackSlotAllocation pass. There's no reason to have this happen, and it just makes it easy to introduce bugs where the immediate offset is appplied to the storing instruction when it should really be applied to the value being stored as a separate add. This might not be sufficient. It might still be problematic to have an add fi, fi situation, but that's even less unlikely to happen in real code. llvm-svn: 264200
* Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.Cong Hou2016-03-232-74/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, AnalyzeBranch() fails non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne.BB1 jp.BB1 jmp.BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne.BB1 jnp.BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 264199
* [x86] make peekThroughBitcasts() a helper functionSanjay Patel2016-03-231-60/+31
| | | | | | | | | | This should be hoisted further up so it can be used in DAGCombiner and other backends, but I'm limiting the scope in the interest of patch minimalism. It's not quite NFC because some of the replaced code was using an 'if' check rather than a 'while' loop, so those cases would only look through a single bitcast. llvm-svn: 264186
* [AArch64] Replace return 0 with return false. NFC.Chad Rosier2016-03-231-3/+3
| | | | llvm-svn: 264185
* Codegen: [PPC] Word Rotates are Zero Extending.Kyle Butt2016-03-231-1/+8
| | | | | | | Add Word rotates to the list of instructions that are zero extending. This allows them to be used in dot form to compare with zero. llvm-svn: 264183
* Replace a string comparison in ARMSubtarget.h with a tablegen entry in ↵Artyom Skrobov2016-03-232-5/+8
| | | | | | | | | | | | ARM.td (NFC) Reviewers: rengolin, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D18393 llvm-svn: 264165
* [AArch64] Replace some uses of report_fatal_error with reportError in ↵Oliver Stannard2016-03-231-17/+35
| | | | | | | | | | | | AArch64 ELF object writer If we can't handle a relocation type, report it as an error in the source, rather than asserting. I've added a more descriptive message and a test for the only cases of this that I've been able to trigger. Differential Revision: http://reviews.llvm.org/D18388 llvm-svn: 264156
* [X86] Introduction of FeatureX87.Andrey Turetskiy2016-03-234-69/+100
| | | | | | | | Add FeatureX87 in X86 backend to be able to define CPUs which doesn't have x87. Differential Revision: http://reviews.llvm.org/D13979 llvm-svn: 264148
* [mips][microMIPS] Delay slot filler modificationsHrvoje Varga2016-03-231-0/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D18181 llvm-svn: 264147
* [AMDGPU] Fix missing assembler predicates.Valery Pykhtin2016-03-231-0/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D18351 llvm-svn: 264137
* AMDGPU: Cache information about register pressure setsTom Stellard2016-03-232-24/+37
| | | | | | | | | | We can statically decide whether or not a register pressure set is for SGPRs or VGPRs, so we don't need to re-compute this information in SIRegisterInfo::getRegPressureSetLimit(). Differential Revision: http://reviews.llvm.org/D14805 llvm-svn: 264126
* TypoJoerg Sonnenberger2016-03-221-1/+1
| | | | llvm-svn: 264110
* [WebAssembly] Implement the rotate instructions.Dan Gohman2016-03-222-1/+9
| | | | llvm-svn: 264076
* [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardwareSimon Pilgrim2016-03-221-1/+3
| | | | | | | | | | | | | | Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062
* [mips] Make simm6 consistent with the rest. NFC.Daniel Sanders2016-03-221-5/+1
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18147 llvm-svn: 264057
OpenPOWER on IntegriCloud