summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix incorrect redundant expression in target AMDGPU.Etienne Bergeron2016-04-251-1/+1
| | | | | | | | | | | | | | | | | | | Summary: The expression is detected as a redundant expression. Turn out, this is probably a bug. ``` /home/etienneb/llvm/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:306:26: warning: both side of operator are equivalent [misc-redundant-expression] if (isSMRD(*FirstLdSt) && isSMRD(*FirstLdSt)) { ``` Reviewers: rnk, tstellarAMD Subscribers: arsenm, cfe-commits Differential Revision: http://reviews.llvm.org/D19460 llvm-svn: 267415
* [ARM] Add support for the X asm constraintSilviu Baranga2016-04-252-0/+22
| | | | | | | | | | | | | | | | | | Summary: This patch adds support for the X asm constraint. To do this, we lower the constraint to either a "w" or "r" constraint depending on the operand type (both constraints are supported on ARM). Fixes PR26493 Reviewers: t.p.northover, echristo, rengolin Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D19061 llvm-svn: 267411
* [AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax.Artem Tamazov2016-04-255-3/+127
| | | | | | | | | | | | | Added hwreg(reg[,offset,width]) syntax. Default offset = 0, default width = 32. Possibility to specify 16-bit immediate kept. Added out-of-range checks. Disassembling is always to hwreg(...) format. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19329 llvm-svn: 267410
* [Hexagon] Correctly set "Flags" in ELF headerKrzysztof Parzyszek2016-04-251-3/+7
| | | | llvm-svn: 267397
* [PowerPC] [PR27387] Disallow r0 for ADD8TLS.Marcin Koscielnicki2016-04-251-2/+4
| | | | | | | | | | | ADD8TLS, a variant of add instruction used for initial-exec TLS, currently accepts r0 as a source register. While add itself supports r0 just fine, linker can relax it to a local-exec sequence, converting it to addi - which doesn't support r0. Differential Revision: http://reviews.llvm.org/D19193 llvm-svn: 267388
* [X86] Replace a SmallVector used to pass 2 values to an ArrayRef parameter ↵Craig Topper2016-04-251-3/+1
| | | | | | with a fixed size array. NFC llvm-svn: 267377
* Minor code cleanups. NFC.Junmo Park2016-04-254-23/+23
| | | | llvm-svn: 267375
* ARM: fix __chkstk Frame Setup on WoASaleem Abdulrasool2016-04-242-2/+4
| | | | | | | | | | | | This corrects the MI annotations for the stack adjustment following the __chkstk invocation. We were marking the original SP usage as a Def rather than Kill. The (new) assigned value is the definition, the original reference is killed. Adjust the ISelLowering to mark Kills and FrameSetup as well. This partially resolves PR27480. llvm-svn: 267361
* Fix typo in comment. NFCNick Lewycky2016-04-241-1/+1
| | | | llvm-svn: 267354
* [X86][SSE] getTargetShuffleMaskIndices - dropped (unused) UNDEF handlingSimon Pilgrim2016-04-241-5/+0
| | | | | | We aren't currently making use of this in any successful mask decode and its actually incorrect as it inserts the wrong number of SM_SentinelUndef mask elements. llvm-svn: 267350
* [X86][SSE] Use range loop. NFCI.Simon Pilgrim2016-04-241-3/+2
| | | | llvm-svn: 267349
* [Lanai] Use EVT::getEVTString() to print a type as a string instead of an ↵Craig Topper2016-04-241-1/+1
| | | | | | enum encoding value. llvm-svn: 267348
* [X86][XOP] Fixed VPPERM permute op decoding (PR27472).Simon Pilgrim2016-04-241-1/+1
| | | | | | Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask. llvm-svn: 267346
* [X86][SSE] Improved support for decoding target shuffle masks through bitcastsSimon Pilgrim2016-04-241-20/+26
| | | | | | | | Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transfered to xmm. This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472. llvm-svn: 267343
* [SystemZ] [SSP] Add support for LOAD_STACK_GUARD.Marcin Koscielnicki2016-04-244-0/+52
| | | | | | | | | | | | | | This fixes PR22248 on s390x. The previous attempt at this was D19101, which was before LOAD_STACK_GUARD existed. Compared to the previous version, this always emits a rather ugly block of 4 instructions, involving a thread pointer load that can't be shared with other potential users. However, this is necessary for SSP - spilling the guard value (or thread pointer used to load it) is counter to the goal, since it could be overwritten along with the frame it protects. Differential Revision: http://reviews.llvm.org/D19363 llvm-svn: 267340
* [X86] Merge LowerCTLZ and LowerCTLZ_ZERO_UNDEF into a single function that ↵Craig Topper2016-04-241-38/+16
| | | | | | branches internally for the one difference, allowing the rest of the code to be common. NFC llvm-svn: 267331
* [X86] Node need to check if AVX512 is supported when lowering vector CTLZ. ↵Craig Topper2016-04-241-7/+5
| | | | | | The CTLZ operation is only Custom for vectors if AVX512 is enabled so if a vector gets here AVX512 is implied. NFC llvm-svn: 267330
* [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)Gerolf Hoflehner2016-04-244-35/+557
| | | | | | | | | | | | | | | | | | | The original patch caused crashes because it could derefence a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328
* [X86] Remove isel patterns for selecting tzcnt/lzcnt from ↵Craig Topper2016-04-241-80/+0
| | | | | | cmove/ne+cttz/ctlz. These are folded by DAG combine now. llvm-svn: 267326
* Fix an assertion that can never fire because the condition ANDed with the ↵Craig Topper2016-04-241-1/+1
| | | | | | string is just true or 1. llvm-svn: 267324
* Fix a couple assertions that can never fire because they just contained the ↵Craig Topper2016-04-242-2/+2
| | | | | | text string which always evaluates to true. Add a ! so they'll evaluate to false. llvm-svn: 267312
* [X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt ↵Craig Topper2016-04-241-30/+24
| | | | | | instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly. llvm-svn: 267311
* [MC/ELF] Implement support for GOTPCRELX/REX_GOTPCRELX.Davide Italiano2016-04-242-5/+25
| | | | | | | | | The option to control the emission of the new relocations is -relax-relocations (blatantly copied from GNU as). It can't be enabled by default because it breaks relatively recent versions of ld.bfd/ld.gold (late 2015). llvm-svn: 267307
* [MC/ELF] Pass Fixup to getRelocType64.Davide Italiano2016-04-231-2/+3
| | | | | | In preparation for other changes. llvm-svn: 267300
* Revert "[AArch64] Fix optimizeCondBranch logic."Renato Golin2016-04-231-5/+5
| | | | | | This reverts commit r267206, as it broke self-hosting on AArch64. llvm-svn: 267294
* [Hexagon] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will ↵Craig Topper2016-04-233-16/+8
| | | | | | convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC llvm-svn: 267266
* [NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ↵Craig Topper2016-04-232-10/+3
| | | | | | ctlz. Remove the now unneccessary isel patterns. NFC llvm-svn: 267265
* [WebAssembly] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG ↵Craig Topper2016-04-232-7/+1
| | | | | | will convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC llvm-svn: 267264
* AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.SizeMatt Arsenault2016-04-221-0/+16
| | | | llvm-svn: 267244
* AMDGPU: Re-visit nodes in performAndCombineMatt Arsenault2016-04-221-0/+5
| | | | | | This fixes test regressions when i64 loads/stores are made promote. llvm-svn: 267240
* Differential Revision: http://reviews.llvm.org/D19040Sriraman Tallam2016-04-222-4/+19
| | | | llvm-svn: 267229
* CodeGen: Use PLT relocations for relative references to unnamed_addr functions.Peter Collingbourne2016-04-223-40/+23
| | | | | | | | | | | | | The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual functions defined in other DSOs. The unnamed_addr attribute means that the function's address is not significant, so we're allowed to substitute it with the address of a PLT entry. Also includes a bonus feature: addends for COFF image-relative references. Differential Revision: http://reviews.llvm.org/D17938 llvm-svn: 267211
* [AArch64] Fix optimizeCondBranch logic.Quentin Colombet2016-04-221-5/+5
| | | | | | | | | | | | | The opcode for the optimized branch does not depend on the size of the activate bits in the AND masks, but the AND opcode itself. Indeed, we need to use a X or W variant based on the AND variant not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267206
* [AArch64] When creating MRS instruction, make sure the destination register isQuentin Colombet2016-04-221-2/+1
| | | | | | | | declared as a definition. This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll. llvm-svn: 267185
* [AArch64][AdvSIMDScalar] Update the kill flags correctly.Quentin Colombet2016-04-221-30/+45
| | | | | | | | | | | | | | | | | | | We used to simply set the kill flags to true when transforming a scalar instruction to a vector one. SrcScalar1 = copy SrcVector1 ... = opScalar SrcScalar1 => SrcScalar1 = copy SrcVector1 ... = opVector SrcVector1<kill> This is obviously wrong. The proper update consists in: 1. Propagate the kill status from the copy to the new opVector 2. Reset the kill status on the copy, since the live-range of SrcVector1 got extended. This fixes some of the machine verifier errors for AArch64 with make check. llvm-svn: 267180
* [Hexagon] Use common Pat classes for selecting code for intrinsicsKrzysztof Parzyszek2016-04-224-332/+361
| | | | llvm-svn: 267178
* [Hexagon] Properly close live range in HexagonBlockRangesKrzysztof Parzyszek2016-04-221-1/+1
| | | | llvm-svn: 267173
* [AMDGPU] Insert nop pass: take care of outstanding feedbackKonstantin Zhuravlyov2016-04-222-21/+18
| | | | | | | | | | | - Switch few loops to range-based for loops - Fix nop insertion at the end of BB - Fix formatting - Check for endpgm Differential Revision: http://reviews.llvm.org/D19380 llvm-svn: 267167
* [mips][microMIPS] Revert commit r266861.Zoran Jovanovic2016-04-2210-170/+8
| | | | | | Commit r266861 was the reason for failing tests in LLVM test suite. llvm-svn: 267166
* [Hexagon] Teach mux expansion how to deal with undef predicatesKrzysztof Parzyszek2016-04-221-5/+13
| | | | llvm-svn: 267165
* [Hexagon] Add definitions for trap/pause instructionsKrzysztof Parzyszek2016-04-221-0/+21
| | | | | | Also add tests for other instructions from HexagonSystemInst.td. llvm-svn: 267162
* Emit code16 in assembly in 16-bit modeNirav Dave2016-04-221-0/+6
| | | | | | | | | | | | | | | Summary: When generating assembly using -m16 we must explicitly mark it as 16-bit. Emit .code16 at beginning of file. Fixes wrong results when using -fno-integrated-as. Reviewers: dwmw2 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19392 llvm-svn: 267152
* [mips] Fix select patterns for MIPS64Simon Dardis2016-04-221-1/+1
| | | | | | | | | | | | | | When targetting MIPS64R6 some of the patterns for select were guarded by a broken predicate. The predicate was supposed to test if a constant value could fit in a 16 bit zero-extended field. Instead the value was tested to fit in a 16 bit sign-extended field. For negative constants of native word width this resulted in wrong code generation. Reviewers: vkalintiris, dsanders Differential Review: http://reviews.llvm.org/D19378 llvm-svn: 267151
* [mips] Fix a small typo that would leave BLTZC out of getAnalyzableBrOpc().'Vasileios Kalintiris2016-04-221-1/+1
| | | | llvm-svn: 267149
* [mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructionsHrvoje Varga2016-04-224-3/+14
| | | | | | Differential Revision: http://reviews.llvm.org/D19354 llvm-svn: 267137
* [mips][microMIPS] Add R_MICROMIPS_PC18_S3 relocationZoran Jovanovic2016-04-224-2/+23
| | | | | | Differential Revision: http://reviews.llvm.org/D15026 llvm-svn: 267130
* Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64Daniel Sanders2016-04-224-557/+35
| | | | | | It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127
* [X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode.Ashutosh Nema2016-04-221-2/+0
| | | | | | | | | | | | | | Summary: rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS operations during DAG combine. This generates better code for truncate, so cost of truncate needs to be changed but looks like it got changed only in SSE2 table Whereas this change is also applicable for SSE4.1, so the cost of truncate needs to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from SSE4.1, so it will fall back to SSE2. Reviewers: Simon Pilgrim llvm-svn: 267123
* [Sparc] This provides support for itineraries on Sparc.Chris Dewhurst2016-04-225-159/+417
| | | | | | | | | | | | Specifically, itineraries for LEON processors has been added, along with several LEON processor Subtargets. Although currently all these targets are pretty much identical, support for features that will differ among these processors will be added in the very near future. The different Instruction Itinerary Classes (IICs) added are sufficient to differentiate between the instruction timings used by LEON and, quite probably, by generic Sparc processors too, but the focus of the exercise has been for LEON processors, as the requirement of my project. If the IICs are not sufficient for other Sparc processor types and you want to add a new itinerary for one of those, it should be relatively trivial to adapt this. As none of the LEON processors has Quad Floats, or is a Version 9 processor, none of those instructions have itinerary classes defined and revert to the default "NoItinerary" instruction itinerary. Phabricator Review: http://reviews.llvm.org/D19359 llvm-svn: 267121
* The following code would not work before this patch, due to the inability to ↵Chris Dewhurst2016-04-221-1/+1
| | | | | | | | | | | | | | | | | | | | | take the address of a global object: void func1() { ... } int main(int argc, char** argv) { void (*pFunc)(); pFunc = &func1 pFunc(); ... } Phabricator review: http://reviews.llvm.org/D19368 llvm-svn: 267120
OpenPOWER on IntegriCloud