path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* ARM: support TLS for WoA | Saleem Abdulrasool | 2016-02-03 | 5 | -0/+62
    Add support for TLS access for Windows on ARM. This generates a similar access to MSVC for ARM. The changes to the tablegen data are needed to support loading an external symbol global that is not for a call. The adjustments to the DAG to DAG transforms are needed to preserve the 32-bit move.

    llvm-svn: 259676
* [ARM] Move GNUEABI divmod to __aeabi_divmod* | Renato Golin | 2016-02-03 | 1 | -2/+4
    The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores which happens to be a lot faster than __divsi3/__modsi3 when the core has hardware divide instructions. Do the same here.

    Fixes PR26450.

    llvm-svn: 259657
* [mips] Remove redundant inclusions of MipsAnalyzeImmediate.h | Daniel Sanders | 2016-02-03 | 9 | -8/+1
    llvm-svn: 259655
* Fix for PR 26381 | Nemanja Ivanovic | 2016-02-03 | 1 | -1/+1
    Simple fix: constant values were not being sign extended in FastISel.

    llvm-svn: 259645
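A minimal sketch of why a missing sign extension changes a constant's value. The commit does not state the bit widths involved, so the widths below and the helper names `zero_extend` and `sign_extend` are illustrative assumptions, not FastISel code.

```python
def zero_extend(value: int, from_bits: int) -> int:
    """Interpret the low `from_bits` bits of `value` as an unsigned number."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int) -> int:
    """Interpret the low `from_bits` bits of `value` as a two's-complement number."""
    masked = value & ((1 << from_bits) - 1)
    sign_bit = 1 << (from_bits - 1)
    # Flipping the sign bit and subtracting it back maps 0x80..0xFF to -128..-1.
    return (masked ^ sign_bit) - sign_bit


# The same 8-bit pattern yields different wide values depending on the extension.
print(zero_extend(0xFF, 8), sign_extend(0xFF, 8))
```

A negative constant that is widened without sign extension turns into a large positive value, which is the kind of wrong result this class of bug produces.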
* [mips] Add SHF_MIPS_GPREL flag to the MIPS .sbss and .sdata sections | Simon Atanasyan | 2016-02-03 | 1 | -2/+4
    MIPS ABI states that .sbss and .sdata sections must have SHF_MIPS_GPREL flag. See Figure 4–7 on page 69 in the following document: ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf.

    Differential Revision: http://reviews.llvm.org/D15740

    llvm-svn: 259641
* [X86][AVX] Add support for 64-bit VZEXT_LOAD of 256/512-bit vectors to EltsFromConsecutiveLoads | Simon Pilgrim | 2016-02-03 | 4 | -124/+121
    Follow up to D16217 and D16729.

    This change uncovered an odd pattern where VZEXT_LOAD v4i64 was being lowered to a load of the lower v2i64 (so the 2nd i64 destination element wasn't being zeroed). I can't find any use/reason for this, so I have removed the pattern and replaced it so that only the 1st i64 element is loaded and the upper bits are all zeroed. This matches the description for X86ISD::VZEXT_LOAD.

    Differential Revision: http://reviews.llvm.org/D16768

    llvm-svn: 259635
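The difference between the removed pattern and the VZEXT_LOAD semantics described above can be modelled on plain lists. This is only a sketch: `vzext_load` and `old_v4i64_pattern` are hypothetical names, and each list element stands for one i64 lane.

```python
def vzext_load(memory, num_elts):
    """Model of the described semantics: load element 0, zero every upper element."""
    return [memory[0]] + [0] * (num_elts - 1)


def old_v4i64_pattern(memory):
    """Model of the removed pattern: load the lower two lanes, zero the upper two."""
    return [memory[0], memory[1], 0, 0]


mem = [7, 9, 11, 13]
# The old pattern leaves lane 1 populated, the corrected one zeroes it.
print(old_v4i64_pattern(mem), vzext_load(mem, 4))
```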
* Codegen: [PPC] Fix PPCVSXFMAMutate to handle duplicates. | Kyle Butt | 2016-02-03 | 1 | -19/+32
    The purpose of PPCVSXFMAMutate is to elide copies by changing FMA forms on PPC.

        %vreg6<def> = COPY %vreg96
        %vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg7
        ;v6 = v6 + v5 * v7

    is replaced by

        %vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg7, %vreg96
        ;v5 = v5 * v7 + v96

    This was broken in the case where the target register was also used as a multiplicand. Fix this case by checking for it and replacing both uses with the copied register.

        %vreg6<def> = COPY %vreg96
        %vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg6
        ;v6 = v6 + v5 * v6

    is replaced by

        %vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg96, %vreg96
        ;v5 = v5 * v96 + v96

    llvm-svn: 259617
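The rewrite above can be checked with a small register-file model. This is a sketch, not the pass itself: the function names are hypothetical, registers are dictionary keys, and integer values are used so the arithmetic is exact.

```python
def run_a_form(regs, copy_dst, copy_src, mul_killed, mul_other):
    """Evaluate `copy_dst = COPY copy_src; copy_dst = copy_dst + mul_killed * mul_other`."""
    regs = dict(regs)  # work on a copy so the caller's register file is untouched
    regs[copy_dst] = regs[copy_src]
    regs[copy_dst] = regs[copy_dst] + regs[mul_killed] * regs[mul_other]
    return regs[copy_dst]


def mutate_to_m_form(copy_dst, copy_src, mul_killed, mul_other):
    """Return (tied, multiplicand, addend) for the M-form replacement.

    If the other multiplicand is the copy destination, it must also be
    replaced by the copy source: this is the duplicate case from the commit.
    """
    if mul_other == copy_dst:
        mul_other = copy_src
    return mul_killed, mul_other, copy_src


def run_m_form(regs, tied, multiplicand, addend):
    """Evaluate `tied = tied * multiplicand + addend`."""
    return regs[tied] * regs[multiplicand] + regs[addend]


regs = {"v96": 3, "v5": 2, "v7": 5}
ops = mutate_to_m_form("v6", "v96", "v5", "v6")  # duplicate case
print(ops, run_m_form(regs, *ops))
```

Both forms compute the same value in the plain case and in the duplicate case, which is the property the fix restores.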
* Revert r259576: Disable the vzeroupper insertion pass on PS4. | Yunzhong Gao | 2016-02-03 | 1 | -3/+0
    Will re-implement based on review feedback.

    llvm-svn: 259615
* Disable the vzeroupper insertion pass on PS4. | Yunzhong Gao | 2016-02-02 | 1 | -0/+3
    See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.

    Original patch by: Sean Silva

    llvm-svn: 259576
* AMDGPU: Do not promote allocas with non-inbounds GEPs | Matt Arsenault | 2016-02-02 | 1 | -0/+7
    If we can't assume the pointer value is within the bounds of the object, it seems risky to try to replace the pointer calculations.

    llvm-svn: 259573
* AMDGPU: Handle promoting memmove | Matt Arsenault | 2016-02-02 | 1 | -0/+24
    Also add missing tests for the others.

    llvm-svn: 259558
* [X86] Fix the merging of SP updates in prologue/epilogue insertions. | Quentin Colombet | 2016-02-02 | 1 | -2/+7
    When the merging was involving LEAs, we were taking the wrong immediate from the list of operands.

    rdar://problem/24446069

    llvm-svn: 259553
* AMDGPU: Skip promote alloca with no optimizations | Matt Arsenault | 2016-02-02 | 2 | -2/+2
    llvm-svn: 259551
* AMDGPU: Minor cleanups for AMDGPUPromoteAlloca | Matt Arsenault | 2016-02-02 | 1 | -27/+21
    Mostly convert to use range loops.

    llvm-svn: 259550
* AMDGPU: Report AMDGPUPromoteAlloca changed the function | Matt Arsenault | 2016-02-02 | 1 | -22/+21
    llvm-svn: 259547
* AMDGPU: Whitelist handled intrinsics | Matt Arsenault | 2016-02-02 | 1 | -8/+36
    We shouldn't crash on unhandled intrinsics. Also simplify failure handling in loop.

    llvm-svn: 259546
* AMDGPU: Use inbounds when calculating workitem offset | Matt Arsenault | 2016-02-02 | 1 | -6/+7
    When promoting allocas to LDS, we know we are indexing into a specific area just created, and the calculation will also never overflow.

    Also emit some of the muls as nsw nuw, because instcombine infers this already from the range metadata. I think putting this on the other adds and muls might be OK too, but I'm not 100% sure.

    llvm-svn: 259545
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. | Eugene Zelenko | 2016-02-02 | 2 | -5/+1
    Differential revision: http://reviews.llvm.org/D16793

    llvm-svn: 259539
* [MC] Enable eip-relative addressing on x86-64 for X32 ABI | Derek Schuff | 2016-02-02 | 1 | -1/+6
    Summary:
    Enables eip-based addressing, e.g.,

        lea constant(%eip), %rax
        lea constant(%eip), %eax

    in MC (used for the x32 ABI).

    EIP-based addressing is also valid in x86_64, so it is left enabled for that architecture as well.

    Patch by João Porto

    Differential Revision: http://reviews.llvm.org/D16581

    llvm-svn: 259528
* [AArch64] Add a FIXME comment. | Chad Rosier | 2016-02-02 | 1 | -0/+2
    llvm-svn: 259515
* [AArch64] Allocate the modified and used regs only once per function. | Chad Rosier | 2016-02-02 | 1 | -12/+17
    llvm-svn: 259510
* WebAssembly: update expected GCC torture test failures | JF Bastien | 2016-02-02 | 1 | -3/+0
    The 3 programs used __attribute__((mode(?))) on enum, which clang r259497 fixed.

    llvm-svn: 259508
* Refactor backend diagnostics for unsupported features | Oliver Stannard | 2016-02-02 | 8 | -213/+37
    Re-commit of r258951 after fixing layering violation.

    The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend.

    There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available.

    llvm-svn: 259498
* [X86][AVX512] Add support for AVX512 VMOVQ (load) shuffle decoding | Simon Pilgrim | 2016-02-02 | 1 | -0/+1
    llvm-svn: 259496
* WebAssembly: add option to disable register coloring | JF Bastien | 2016-02-02 | 1 | -0/+7
    Having this hidden option makes it easier to debug other issues.

    llvm-svn: 259482
* Removed FeatureVFPOnlySP from the Cortex-R7 processor model | Sjoerd Meijer | 2016-02-02 | 1 | -1/+0
    description and changed the regression test accordingly.

    The default configuration of a Cortex-R7 is to implement the VFPv3-D16 architecture and the feature line as it was is too restrictive.

    llvm-svn: 259480
* [X86] Fix a bug in getMemOpBaseRegImmOfs | Sanjoy Das | 2016-02-02 | 1 | -1/+5
    Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of `MemOp` is a frame index memory operand. The fix is to have `getMemOpBaseRegImmOfs` bail out in such cases. We can possibly be more clever here, if needed.

    llvm-svn: 259456
* [X86][FastISel] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. | Ahmed Bougacha | 2016-02-02 | 1 | -2/+4
    FastISel counterpart to r259448.

    llvm-svn: 259449
* [X86] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. | Ahmed Bougacha | 2016-02-02 | 1 | -2/+7
    Officially, we don't acknowledge non-default configurations of MXCSR, as getting there would require usage of the FENV_ACCESS pragma (at least insofar as rounding mode is concerned).

    We don't support the pragma, so we can assume that the default rounding mode - round to nearest, ties to even - is always used.

    However, it's inconsistent with the rest of the instruction set, where MXCSR is always effective (unless otherwise specified). Also, it's an unnecessary obstacle to the few brave souls that use fenv.h with LLVM.

    Avoid the hard-coded rounding mode for fp_to_f16; use MXCSR instead.

    llvm-svn: 259448
* fix typos; NFC | Sanjay Patel | 2016-02-01 | 1 | -2/+2
    llvm-svn: 259438
* [X86][AVX512] Add support for AVX512 VMOVD (load) shuffle decoding | Simon Pilgrim | 2016-02-01 | 1 | -0/+1
    llvm-svn: 259430
* [X86][AVX512] Add support for AVX512 VMOVSD/VMOVSS shuffle decoding | Simon Pilgrim | 2016-02-01 | 1 | -0/+4
    llvm-svn: 259427
* [X86][AVX512] Add support for AVX512 VINSERTPS shuffle decoding | Simon Pilgrim | 2016-02-01 | 1 | -0/+2
    llvm-svn: 259420
* SmallSet/SmallPtrSet: Refuse huge Small numbers | Matthias Braun | 2016-02-01 | 1 | -2/+2
    These sets do linear searching in small mode, so it is not a good idea to use huge numbers as the small value here. Save people from themselves by adding a static_assert.

    Differential Revision: http://reviews.llvm.org/D16706

    llvm-svn: 259419
* Move comments a bit closer to associated code. NFC. | Chad Rosier | 2016-02-01 | 1 | -29/+25
    llvm-svn: 259411
* Remove extra semicolon. NFC. | Chad Rosier | 2016-02-01 | 1 | -1/+1
    llvm-svn: 259402
* AArch64: Implement missed conditional compare sequences. | Balaram Makam | 2016-02-01 | 2 | -2/+47
    Summary:
    This is an extension to the existing implementation of r242436, which is restricted to select inputs only. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leaves. This will additionally handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees.

    Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB

    Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry

    Differential Revision: http://reviews.llvm.org/D16291

    llvm-svn: 259387
* [AArch64] Simplify prolog/epilog callee save/restore. NFC. | Geoff Berry | 2016-02-01 | 1 | -61/+87
    Summary:
    Factor out common code for callee-save register pair calculation.

    This is intended to simplify follow-on changes that reduce the number of registers saved/restored.

    Depends on D16732

    Reviewers: mcrosier, jmolloy, t.p.northover

    Subscribers: aemerson, rengolin, mcrosier, llvm-commits

    Differential Revision: http://reviews.llvm.org/D16734

    llvm-svn: 259384
* [SystemZ] Fix wrong-code generation for certain always-false conditions | Ulrich Weigand | 2016-02-01 | 1 | -1/+1
    We've found another bug in the code generation logic for a certain class of always-false conditions, those of the form

        if ((a & 1) < 0)

    These only reach the back end when compiling without optimization.

    The bug was introduced by the choice of using TEST UNDER MASK to implement a check for

        if ((a & MASK) < VAL)

    as

        if ((a & MASK) == 0)

    where VAL is less than the lowest bit of MASK. This is correct in all cases except for VAL == 0, in which case the original condition is always false, but the replacement isn't.

    Fixed by excluding that particular case.

    llvm-svn: 259381
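The equivalence described above, and its failure at VAL == 0, can be confirmed by brute force over small values. This sketch assumes non-negative 8-bit inputs, and the helper names are hypothetical.

```python
def original_condition(a: int, mask: int, val: int) -> bool:
    """The source-level check: (a & MASK) < VAL."""
    return (a & mask) < val


def replacement_condition(a: int, mask: int) -> bool:
    """The TEST UNDER MASK style check: (a & MASK) == 0."""
    return (a & mask) == 0


def rewrite_is_safe(mask: int, val: int, width: int = 8) -> bool:
    """True if both checks agree for every `width`-bit value of a."""
    return all(
        original_condition(a, mask, val) == replacement_condition(a, mask)
        for a in range(1 << width)
    )


# MASK = 0b1100 has lowest set bit 4, so a & MASK is one of 0, 4, 8, 12.
print(rewrite_is_safe(0b1100, 1), rewrite_is_safe(0b1100, 0))
```

With VAL == 0 the original check is never true, while the replacement is true whenever the masked bits are clear, so the two disagree and the rewrite has to be skipped.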
* [NFC] Referencing manual for reason why subregbit is checked | Colin LeMahieu | 2016-02-01 | 1 | -1/+2
    llvm-svn: 259380
* [AArch64] Simplify callee-save register save/restore. NFC. | Geoff Berry | 2016-02-01 | 2 | -68/+23
    Summary:
    Simplify callee-save register save/restore code generation by remembering the size of the callee-save area when it is computed so we don't have to scan the prologue/epilogue instructions again later to reconstruct it.

    This is intended to simplify follow-on changes that reduce the number of registers saved/restored.

    Reviewers: mcrosier, jmolloy, t.p.northover

    Subscribers: aemerson, rengolin, mcrosier, llvm-commits

    Differential Revision: http://reviews.llvm.org/D16732

    llvm-svn: 259365
* [X86][AVX512VBMI] add encoding and intrinsics for Multishift | Asaf Badouh | 2016-02-01 | 5 | -17/+39
    Differential Revision: http://reviews.llvm.org/D16399

    llvm-svn: 259363
* [mips] Range check uimm16 and fix several bugs this revealed. | Daniel Sanders | 2016-02-01 | 7 | -56/+113
    Summary:
    The bugs were:
    * teq and similar take 4-bit unsigned immediates on microMIPS.
    * teqi and similar have side-effects like teq does.
    * shll_s.w and shra_r.w take 5-bit unsigned immediates.
    * The various DSP ext* instructions take a 5-bit immediate.
    * repl.qh takes an 8-bit unsigned immediate.
    * repl.ph takes a 10-bit unsigned immediate.
    * rddsp/wrdsp take a 10-bit unsigned immediate.
    * teqi and similar take signed 16-bit immediates (10-bit for microMIPS).
    * Out-of-range immediate macros for or/xor take a simm32/simm64 depending on architecture. I'll fix the simm64 case properly when I reach simm32.

    lui is a bit more lenient than GAS and accepts signed immediates in addition to unsigned. This is because MipsMCExpr can produce signed values when constant folding and it currently lacks a way of knowing it should fold to an unsigned value.

    Reviewers: vkalintiris

    Subscribers: dsanders, llvm-commits

    Differential Revision: http://reviews.llvm.org/D15446

    llvm-svn: 259360
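The kind of range check this commit adds can be sketched as two predicates over bit widths. The helper names below are hypothetical and do not correspond to the MIPS assembler's own operand classes. The last function models the lenient lui behaviour described above as "fits as unsigned or as signed 16-bit", which is an interpretation of the commit text.

```python
def fits_uimm(value: int, bits: int) -> bool:
    """True if `value` is representable as an unsigned `bits`-bit immediate."""
    return 0 <= value < (1 << bits)


def fits_simm(value: int, bits: int) -> bool:
    """True if `value` is representable as a signed two's-complement `bits`-bit immediate."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def lenient_lui_accepts(value: int) -> bool:
    """Assumed model of the lenient lui check: unsigned or signed 16-bit."""
    return fits_uimm(value, 16) or fits_simm(value, 16)


# 16 does not fit a 4-bit unsigned field, and -1 only passes the lenient check.
print(fits_uimm(16, 4), fits_uimm(-1, 16), lenient_lui_accepts(-1))
```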
* WebAssembly NFC: simplify control flow | JF Bastien | 2016-02-01 | 1 | -43/+63
    This should now be easier to read.

    llvm-svn: 259349
* AVX512: fix mask handling for gather/scatter/prefetch intrinsics. | Igor Breger | 2016-02-01 | 1 | -39/+18
    Differential Revision: http://reviews.llvm.org/D16755

    llvm-svn: 259346
* [X86][SSE] Find source of the inserted element of INSERTPS | Simon Pilgrim | 2016-02-01 | 1 | -4/+29
    Minor patch to trace back through target shuffles to the source of the inserted element in a (V)INSERTPS shuffle.

    Differential Revision: http://reviews.llvm.org/D16652

    llvm-svn: 259343
* AVX512: Fix SETCCE lowering for KNL 32 bit. | Igor Breger | 2016-02-01 | 1 | -2/+6
    Differential Revision: http://reviews.llvm.org/D16752

    llvm-svn: 259342
* [X86] Cleanup the WinEHState pass | David Majnemer | 2016-02-01 | 1 | -25/+14
    Remove unnecessary includes and class state.

    No functional change intended.

    llvm-svn: 259340
* Replace usages of llvm::utostr_32 with just llvm::utostr. While this is less efficient, it's unclear whether the few places that were using the _32 version were doing so for efficiency. | Craig Topper | 2016-01-31 | 1 | -20/+20
    llvm-svn: 259330
* WebAssembly: more failures are gone | JF Bastien | 2016-01-31 | 1 | -24/+0
    llvm-svn: 259321