summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [Hexagon] Generate store-immediate instructions for stack objectsKrzysztof Parzyszek2017-06-132-4/+25
| | | | | | | | | Store-immediate instructions have a non-extendable offset. Since the actual offset for a stack object is not known until much later, only generate these stores when the stack size (at the time of instruction selection) is small. llvm-svn: 305305
* [Hexagon] Generate multiply-high instruction in iselKrzysztof Parzyszek2017-06-131-0/+5
| | | | llvm-svn: 305302
* bpf: clang-format on BPFAsmPrinter.cppYonghong Song2017-06-131-2/+3
| | | | | Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305301
* [Hexagon] Don't kill live registers when creating mux out of tfrKrzysztof Parzyszek2017-06-132-8/+22
| | | | | | | | | | | When a mux instruction is created from a pair of complementary conditional transfers, it can be placed at the location of either the earlier or the later of the transfers. Since it will use the operands of the original transfers, putting it in the earlier location may hoist a kill of a source register that was originally further down. Make sure the kill flag is removed if the register is still used afterwards. llvm-svn: 305300
* [MIPS] BuildCondBr should preserve MO flagsSimon Dardis2017-06-131-6/+3
| | | | | | | | | | | | | | | | | While simplifying branches in the MachineInstr representation, the routine BuildCondBr must preserve flags on register MachineOperands. In particular, it must preserve the <undef> flag. This fixes a bug that is unlikely to occur in any real scenario, but which bugpoint is likely to introduce. Patch By Nick Johnson! Reviewers: ahatanak, sdardis Differential Revision: https://reviews.llvm.org/D34041 llvm-svn: 305290
* [Hexagon] Stop pmpy recognition when shift conversion failsKrzysztof Parzyszek2017-06-131-1/+2
| | | | | | | The conversion of shifts from right shifts to left shifts may fail. In such case, the pmpy recognition cannot proceed. llvm-svn: 305289
* [ARM] Add scheduling classes for VFNM[AS]Oliver Stannard2017-06-131-6/+12
| | | | | | | | | | The VFNM[AS] instructions did not have scheduling information attached, which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast, because the Cortex-A57 sched model claims to be complete. Differential Revision: https://reviews.llvm.org/D34139 llvm-svn: 305288
* Strip UTF8 BOM that got added in rL305091Simon Pilgrim2017-06-131-1/+0
| | | | | | Seems my recent move to VS2017 has resulted in a few text editor issues..... llvm-svn: 305285
* [X86][SSE] Refactor getTargetConstantBitsFromNode to avoid large APInts ↵Simon Pilgrim2017-06-131-36/+66
| | | | | | | | | | (PR32037) Much of PR32037's compile time regression is due to getTargetConstantBitsFromNode always creating large (>64bit) APInts during the bitcasting from the source data to the destination bitwidth. This commit avoids this bitcast stage if the data is already the correct bitwidth. llvm-svn: 305284
* PPCISelLowering.cpp: Fix warnings in r305214. [-Wdocumentation]NAKAMURA Takumi2017-06-131-3/+3
| | | | llvm-svn: 305277
* [AVX-512] Mark masked VPCMP instructions as commutable.Craig Topper2017-06-132-14/+27
| | | | llvm-svn: 305276
* [AVX-512] Mark masked version of vpcmpeq as being commutable.Craig Topper2017-06-131-0/+1
| | | | llvm-svn: 305275
* [X86] Add masked integer compare instructions to load folding tables.Craig Topper2017-06-131-0/+58
| | | | llvm-svn: 305274
* [WebAssembly] Fix symbol type for addresses of external functionsSam Clegg2017-06-131-2/+8
| | | | | | | | | | | | | | These symbols were previously not being marked as functions so were appearing as globals instead, and with the incorrect relocation type. Without this fix, objects that take address of external functions include them as global imports rather than function imports which then fails at link time. Differential Revision: https://reviews.llvm.org/D34068 llvm-svn: 305263
* AMDGPU/GlobalISel: Mark 32-bit G_ADD as legalTom Stellard2017-06-121-0/+2
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33992 llvm-svn: 305232
* AArch64: don't try to emit an add (shifted reg) for SP.Tim Northover2017-06-121-0/+8
| | | | | | | | | | The "Add/sub (shifted reg)" instructions use the 31 encoding for xzr and wzr rather than the SP, so we need to use different variants. Situations where this actually comes up are rare enough (see test-case) that I think falling back to DAG is fine. llvm-svn: 305230
* [PowerPC] Match vec_revb builtins to P9 instructions.Tony Jiang2017-06-124-7/+105
| | | | | | | | | | | | Power9 has instructions that will reverse the bytes within an element for all sizes (half-word, word, double-word and quad-word). These can be used for the vec_revb builtins in altivec.h. However, we implement these to match vector shuffle nodes as that will cover both the builtins and vector shuffles that occur in the SDAG through other means. Differential Revision: https://reviews.llvm.org/D33690 llvm-svn: 305214
* [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions.Tony Jiang2017-06-124-5/+51
| | | | | | | | | | | Note that if we need the result of both the divide and the modulo then we compute the modulo based on the result of the divide and not using the new hardware instruction. Commit on behalf of STEFAN PINTILIE. Differential Revision: https://reviews.llvm.org/D33940 llvm-svn: 305210
* AMDGPU: Don't add same implicit use multiple timesMatt Arsenault2017-06-121-4/+2
| | | | | | | For the last component, the same register use was added as an implicit use and another implicit kill use. llvm-svn: 305205
* AMDGPU: Teach isLegalAddressingMode about flat offsetsMatt Arsenault2017-06-121-3/+11
| | | | | | | Also fix reporting r+r as a valid addressing mode without offsets. llvm-svn: 305203
* AMDGPU: Start selecting flat instruction offsetsMatt Arsenault2017-06-122-18/+42
| | | | llvm-svn: 305201
* AMDGPU: Verify that flat offsets aren't used pre-GFX9Matt Arsenault2017-06-121-2/+11
| | | | | | | For convenience the operand is always present in the instruction, but it isn't valid to use except on GFX9. llvm-svn: 305200
* [Falkor] Enable SW Prefetch.Haicheng Wu2017-06-121-0/+4
| | | | | | | | SW prefetch is good for Falkor. Differential Revision: http://reviews.llvm.org/D34084 llvm-svn: 305199
* AMDGPU: Start adding offset fields to flat instructionsMatt Arsenault2017-06-125-25/+94
| | | | llvm-svn: 305194
* [DAG] add helper to bind memop chains; NFCISanjay Patel2017-06-122-34/+4
| | | | | | | | | | This step is just intended to reduce code duplication rather than change any functionality. A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper. Differential Revision: https://reviews.llvm.org/D33649 llvm-svn: 305192
* Const correctness for TTI::getRegisterBitWidthDaniel Neilson2017-06-1210-10/+10
| | | | | | | | | | | | | | Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189
* [X86][SSE] Change memop fragment to inherit from vec128load with local ↵Simon Pilgrim2017-06-121-8/+4
| | | | | | | | | | | | alignment controls First possible step towards merging SSE/AVX memory folding pattern fragments. Also allows us to remove the duplicate non-temporal load logic. Differential Revision: https://reviews.llvm.org/D33902 llvm-svn: 305184
* [AVX-512] Add VPCONFLICT and VPLZCNT to load folding tables.Craig Topper2017-06-121-0/+36
| | | | llvm-svn: 305180
* [x86] use vperm2f128 rather than vinsertf128 when there's a chance to fold a ↵Sanjay Patel2017-06-111-9/+13
| | | | | | | | | | | | | | | | | | | | | 32-byte load I was looking closer at the x86 test diffs in D33866, and the first change seems like it shouldn't happen in the first place. So this patch will resolve that. Using Agner's tables and AMD docs, vperm2f128 and vinsertf128 have identical timing for any given CPU model, so we should be able to interchange those without affecting perf. But as we can see in some of the diffs here, using vperm2f128 allows load folding, so we should take that opportunity to reduce code size and register pressure. A secondary advantage is making AVX1 and AVX2 codegen more similar. Given that vperm2f128 was introduced with AVX1, we should be selecting it in all of the same situations that we would with AVX2. If there's some reason that an AVX1 CPU would not want to use this instruction, that should be fixed up in a later pass. Differential Revision: https://reviews.llvm.org/D33938 llvm-svn: 305171
* AMDGPU : Fix ISA Version Definitions.Wei Ding2017-06-104-27/+99
| | | | | | Differential Revision: http://reviews.llvm.org/D28531 llvm-svn: 305137
* [AArch64] Add fallback in FastISel fp16 conversionsI-Jui (Ray) Sung2017-06-091-1/+5
| | | | | | | | | | | | | | | | | Summary: - Fix assertion failures on F16 to/from int types in FastISel by falling back to regular ISel - Add a testcase of various conversion cases with FastISel (-O0) Reviewers: kristof.beyls, jmolloy, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D33734 llvm-svn: 305127
* [AMDGPU] Add intrinsics for alignbit and alignbyte instructionsStanislav Mekhanoshin2017-06-091-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D34046 llvm-svn: 305098
* [X86][SSE] Add support for PACKSS nodes to faux shuffle extractionSimon Pilgrim2017-06-091-6/+32
| | | | | | If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle llvm-svn: 305091
* [Hexagon] Fixes and updates to the selection patternsKrzysztof Parzyszek2017-06-091-28/+52
| | | | | | | | - Add some missing patterns. - Use C4_cmplte in branch patterns. - Fix signedness of immediate operand in M2_accii. llvm-svn: 305085
* Reland "[SelectionDAG] Enable target specific vector scalarization of calls ↵Simon Dardis2017-06-096-15/+194
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and returns" By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM to scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. The previous version of this patch had a "conditional move or jump depends on uninitialized value". Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 305083
* [AMDGPU] Fix for issue in alloca to vector promotion passDavid Stuttard2017-06-091-6/+12
| | | | | | | | | | | | | | | Summary: Alloca promotion pass not dealing with non-canonical input Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical) Also added some test cases in non-canonical form to check that it no longer crashes Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31710 llvm-svn: 305079
* [ARM] Custom machine-scheduler. NFCI.Javed Absar2017-06-091-0/+15
| | | | | | | | | This patch creates a customised machine-scheduler for ARM targets, so that subsequently DAG mutations etc can be added. Reviewed by: hahn, rengolin, rovka. Differential Revision: https://reviews.llvm.org/D34039 llvm-svn: 305078
* [Hexagon] Add LLVM header to HexagonPatterns.tdKrzysztof Parzyszek2017-06-091-0/+9
| | | | llvm-svn: 305074
* [ARM] Add scheduling info for VFMSOliver Stannard2017-06-091-3/+6
| | | | | | | | | | The scalar VFMS instructions did not have scheduling information attached (but VFMA did), which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast. Differential Revision: https://reviews.llvm.org/D34040 llvm-svn: 305064
* Test commit: remove whitespaceStefan Maksimovic2017-06-091-1/+1
| | | | llvm-svn: 305059
* Fix -Wunused-variable.Rui Ueyama2017-06-091-2/+0
| | | | llvm-svn: 305051
* [Hexagon] Re-enable machine verifier after codegen passesKrzysztof Parzyszek2017-06-081-17/+17
| | | | | | | Remove "false" from the arguments to "addPass" in Hexagon's target pass config. llvm-svn: 305015
* [Hexagon] Skip mux generation when predicate register is undefinedKrzysztof Parzyszek2017-06-081-1/+4
| | | | llvm-svn: 305014
* AMDGPU: Work around build special casing .inc filesMatt Arsenault2017-06-083-1/+7
| | | | | | | It complains because it assumes these were autogenerated files in the source directory. llvm-svn: 305005
* AMDGPU: Use correct register names in inline assemblyMatt Arsenault2017-06-083-0/+410
| | | | | | Fixes using physical registers in inline asm from clang. llvm-svn: 305004
* [Hexagon] Speedup NumNodesBlocking calculation. NFCI.Nirav Dave2017-06-081-32/+25
| | | | llvm-svn: 305003
* [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64Guozhi Wei2017-06-081-12/+26
| | | | | | | | | | In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixed PR32442. Differential Revision: https://reviews.llvm.org/D31407 llvm-svn: 305001
* [AMDGPU] Force qsads instrs to use different dest register than source registersMark Searles2017-06-081-0/+5
| | | | | | | | The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes. Differential Revision: https://reviews.llvm.org/D33783 llvm-svn: 304998
* [Power9] Exploit vector integer extend instructionsZaara Syeda2017-06-081-0/+51
| | | | | | | | | | | | | | This patch adds build vector patterns to exploit the vector integer extend instructions: vextsb2w - Vector Extend Sign Byte To Word vextsb2d - Vector Extend Sign Byte To Doubleword vextsh2w - Vector Extend Sign Halfword To Word vextsh2d - Vector Extend Sign Halfword To Doubleword vextsw2d - Vector Extend Sign Word To Doubleword Differential Revision: https://reviews.llvm.org/D33510 llvm-svn: 304992
* Add scheduler classes to integer/float horizontal operations.Andrew V. Tischenko2017-06-086-5/+126
| | | | | | | This patch will close PR32801. Differential Revision: https://reviews.llvm.org/D33203 llvm-svn: 304986
OpenPOWER on IntegriCloud