summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] Enable printing instructions using aliasesHal Finkel2015-04-232-2/+8
| | | | | | | | | | | TableGen had been nicely generating code to print a number of instructions using shorter aliases (and PowerPC has plenty of short mnemonics), but we were not calling it. For some of the aliases we support in the parser, TableGen can't infer the "inverse" alias relationship, so there is still more to do. Thus, after some hours of updating test cases... llvm-svn: 235616
* [AArch64] Add nvcast patterns for v4f16 and v8f16Pirama Arumuga Nainar2015-04-231-0/+8
| | | | | | | | | | | | | | | | | | Summary: Constant stores of f16 vectors can create NvCast nodes from various operand types to v4f16 or v8f16 depending on patterns in the stored constants. This patch adds nvcast rules with v4f16 and v8f16 values. AArchISelLowering::LowerBUILD_VECTOR has the details on which constant patterns generate the nvcast nodes. Reviewers: jmolloy, srhines, ab Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9201 llvm-svn: 235610
* [AArch64] Handle vec4, vec8, vec16 *itofp for halfPirama Arumuga Nainar2015-04-231-0/+10
| | | | | | | | | | | | | | | | | | | | | | Summary: Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32, v8i8, v8i16 inputs to allow promotion of v4f16 results. Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32, and i64 vectors. Only missing tests are for v16i8 and v16i16 as the shift operations are too complicated to write a proper check sequence. The conversions from v4i64 to v4f16 do not depend on this patch - v4i64 is split and the conversion gets handled while lowering v2i64. I am adding a test here for completeness. Reviewers: aemerson, rengolin, ab, jmolloy, srhines Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9166 llvm-svn: 235609
* Re-commit r235560: Switch lowering: extract jump tables and bit tests before ↵Hans Wennborg2015-04-231-5/+6
| | | | | | | | | | | building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608
* [Hexagon] Shrink-wrap stack frame (Hexagon-specific)Krzysztof Parzyszek2015-04-233-386/+562
| | | | llvm-svn: 235603
* [mips] [IAS] Move NOP emission after pseudo-instruction expansion. NFC.Toma Tabacu2015-04-231-11/+9
| | | | | | As suggested in the review for http://reviews.llvm.org/D8537. llvm-svn: 235601
* Revert r235560; this commit was causing several failed assertions in Debug ↵Aaron Ballman2015-04-231-6/+5
| | | | | | builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597
* Switch lowering: extract jump tables and bit tests before building binary ↵Hans Wennborg2015-04-221-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tre, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0. > > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560
* [Hexagon] Some cleanup of instruction selection codeKrzysztof Parzyszek2015-04-225-800/+709
| | | | llvm-svn: 235552
* [Hexagon] Use A2_tfrsi for constant pool and jump table addressesKrzysztof Parzyszek2015-04-225-78/+151
| | | | llvm-svn: 235535
* [AArch64] Use MachineRegisterInfo instead of LiveIntervals to calculate ↵Pete Cooper2015-04-221-4/+4
| | | | | | | | | | | | liveness. NFC. The CondOpt pass currently uses LiveIntervals to set the dead flag on a def. This patch uses MachineRegisterInfo::use_empty instead as that is equivalent to the def being dead. This removes an instance of LiveIntervals in the pass manager pipeline and saves 3.8% of compile time on llc conpiled for AArch64. Reviewed by Chad Rosier and Zhaoshi. llvm-svn: 235532
* [Hexagon] Consider constant-extended offsets to be validKrzysztof Parzyszek2015-04-222-10/+15
| | | | llvm-svn: 235529
* Fix Windows build break: use LLVM_FUNCTION_NAME instead of __func__.Krzysztof Parzyszek2015-04-222-2/+2
| | | | llvm-svn: 235525
* R600: Fix always inline pass breaking noinline functionsMatt Arsenault2015-04-221-2/+3
| | | | | | No test since calls are not actually supported yet. llvm-svn: 235524
* [Hexagon] Overhaul of stack object allocationKrzysztof Parzyszek2015-04-2210-447/+1328
| | | | | | | | - Use static allocation for aligned stack objects. - Simplify dynamic stack object allocation. - Simplify elimination of frame-indices. llvm-svn: 235521
* [x86] Add store-folded memop patterns for vcvtps2phSanjay Patel2015-04-221-0/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D7296 llvm-svn: 235517
* [Hexagon] Treat CFI as solo instructionsKrzysztof Parzyszek2015-04-221-3/+5
| | | | llvm-svn: 235516
* [Hexagon] Implement HexagonInstPrinter::printRegNameKrzysztof Parzyszek2015-04-222-6/+7
| | | | llvm-svn: 235514
* [X86][AVX] Fix failure due to a missing ISel pattern to select VBROADCAST ↵Andrea Di Biagio2015-04-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | nodes (PR23259). This fixes a regression introduced at revision 218263. On AVX, if we optimize for size, a splat build_vector of a load is lowered into a VBROADCAST node. This is done even if the value type of the splat build_vector node is v2i64. Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast where the scalar element comes from a loadi64/loadf64. However, revision 218263 forgot to add an extra fallback pattern for the case where we have a X86VBroadcast of a loadi64 with multiple uses. This patch adds the missing tablegen pattern in X86InstrSSE.td. This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern to select v2i64 X86ISD::BROADCAST nodes. llvm-svn: 235509
* [mips][microMIPSr6] Implement mips32 to microMIPSr6 mapping supportZoran Jovanovic2015-04-222-3/+24
| | | | | | Differential Revision: http://reviews.llvm.org/D8661 llvm-svn: 235505
* Revert "[mips][FastISel] Implement shift ops for Mips fast-isel."Vasileios Kalintiris2015-04-221-80/+0
| | | | | | | This reverts commit r235194. It was causing a failure in FastISel buildbots due to sign-extension issues. llvm-svn: 235495
* [AArch64] Disable complex GEP optimization by default.James Molloy2015-04-221-1/+1
| | | | | | | | Enough concerns were raised that this optimization is pessimising some code patterns. The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher. llvm-svn: 235491
* [patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and theLang Hames2015-04-222-7/+31
| | | | | | | | | | | | X86 backend. The code generated for symbolic targets is identical to the code generated for constant targets, except that a relocation is emitted to fix up the actual target address at link-time. This allows IR and object files containing patchpoints to be cached across JIT-invocations where the target address may change. llvm-svn: 235483
* [x86] allow 64-bit extracted vector element integer stores on a 32-bit systemSanjay Patel2015-04-221-0/+21
| | | | | | | | | | | | | | | | | | | | | With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system even though 64-bit integers are not legal types. So instead of producing this: pshufd $229, %xmm0, %xmm1 ## xmm1 = xmm0[1,1,2,3] movd %xmm0, (%eax) movd %xmm1, 4(%eax) We can do: movq %xmm0, (%eax) This is a fix for the problem noted in D7296. Differential Revision: http://reviews.llvm.org/D9134 llvm-svn: 235460
* [Hexagon] Patterns for frame index with offset for iselKrzysztof Parzyszek2015-04-211-0/+1
| | | | llvm-svn: 235418
* [NVPTX] do not run DCE after SLSR and SeparateConstOffsetFromGEPJingyue Wu2015-04-211-10/+4
| | | | | | | | | | | | | | | | Summary: With D9096 and D9101, there's no need to run DCE after SLSR and SeparateConstOffsetFromGEP. Test Plan: no regression Reviewers: jholewinski, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9172 llvm-svn: 235415
* X86: Match for X86ISD nodes in LowerBUILD_VECTOR instead of BUILD_VECTORCombineMatthias Braun2015-04-211-24/+23
| | | | | | | | | There doesn't seem to be a reason to perform this target ISD node matching in an DAGCombine, moving it to lowering fixes PR23296. Differential Revision: http://reviews.llvm.org/D9137 llvm-svn: 235394
* AVX-512: Added VPMOVx2M instructions for SKX,Elena Demikhovsky2015-04-211-1/+30
| | | | | | fixed encoding of VPMOVM2x. llvm-svn: 235385
* AVX-512: Added VPTESTM and VPTESTNM instructions for SKXElena Demikhovsky2015-04-212-30/+127
| | | | llvm-svn: 235383
* [mips] [IAS] Implement the .asciiz directive.Toma Tabacu2015-04-211-0/+2
| | | | | | | | | | | | | | | | Summary: This directive is exactly the same as .asciz, except it's only used by MIPS. It is used to store null terminated strings in object files. Reviewers: rafael, dsanders, echristo Reviewed By: dsanders, echristo Subscribers: echristo, llvm-commits Differential Revision: http://reviews.llvm.org/D7530 llvm-svn: 235382
* [mips][microMIPSr6] Implement CACHE and PREF instructionsJozef Kolek2015-04-213-3/+32
| | | | | | | | Implement CACHE and PREF instructions using mapping. Differential Revision: http://reviews.llvm.org/D8893 llvm-svn: 235379
* [mips] Cleanup old floating-point flag conditions definitions. NFC.Vasileios Kalintiris2015-04-211-22/+0
| | | | | | | | Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D7947 llvm-svn: 235377
* [mips] Optimize code generation for 64-bit variable shift instructions.Vasileios Kalintiris2015-04-211-0/+10
| | | | | | | | | | | | | | | | Summary: The 64-bit version of the variable shift instructions uses the shift_rotate_reg class which uses a GPR32Opnd to specify the variable shift amount. With this patch we avoid the generation of a redundant SLL instruction for the variable shift instructions in 64-bit targets. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7413 llvm-svn: 235376
* AVX-512: Added logical and arithmetic instructions for SKXElena Demikhovsky2015-04-213-82/+178
| | | | | | by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 235375
* [X86][SSE] Provide execution domains for scalar floating point operationsSimon Pilgrim2015-04-211-38/+41
| | | | | | | | | | This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since. I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests. Differential Revision: http://reviews.llvm.org/D9095 llvm-svn: 235372
* X86: Do not select X86 custom vector nodes if operand types don't matchMatthias Braun2015-04-211-4/+16
| | | | | | | | | | | | | X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected if the operand types do not match the result type because vector type legalization cannot deal with this for custom nodes. Testcase X86ISD::ADDSUB is attached. I could not create a testcase for the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296 Differential Revision: http://reviews.llvm.org/D9120 llvm-svn: 235367
* [MIPS] OperationAction for FP_TO_FP16, FP16_TO_FPPirama Arumuga Nainar2015-04-201-2/+22
| | | | | | | | | | | | | | | | | | | | | | Summary: Set operation action for FP16 conversion opcodes, so the Op legalizer can choose the gnu_* libcalls for Mips. Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to prevent (fpext (load )) and (store (fptrunc)) from getting combined into unsupported operations. Added test cases to test that these operations are handled correctly for f16 scalars and vectors. This patch depends on http://reviews.llvm.org/D8755. Reviewers: srhines Subscribers: llvm-commits, ab Differential Revision: http://reviews.llvm.org/D8804 llvm-svn: 235341
* [mips][microMIPSr6] Implement BITSWAP instructionJozef Kolek2015-04-203-2/+30
| | | | | | | | Implement BITSWAP instruction using mapping. Differential Revision: http://reviews.llvm.org/D8857 llvm-svn: 235321
* [AArch64] LORID_EL1 register must be treated as read-onlyVladimir Sukharev2015-04-201-3/+5
| | | | | | | | | | | | Patch by: John Brawn Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9105 llvm-svn: 235314
* [PowerPC] Flow oversized lines for r235309Bill Schmidt2015-04-201-11/+25
| | | | llvm-svn: 235310
* [PowerPC] Add future work for vector insert/extract to README_ALTIVEC.txtBill Schmidt2015-04-201-0/+14
| | | | llvm-svn: 235309
* [mips][microMIPSr6] Implement disassembler supportJozef Kolek2015-04-202-4/+13
| | | | | | | | Implement disassembler support for microMIPS32r6. Differential Revision: http://reviews.llvm.org/D8490 llvm-svn: 235307
* [mips][microMIPSr6] Implement BALC and BC instructionsJozef Kolek2015-04-203-3/+58
| | | | | | | | This patch implements BALC and BC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8388 llvm-svn: 235302
* [mips][microMIPSr6] Implement initial mapping supportJozef Kolek2015-04-203-2/+40
| | | | | | Differential Revision: http://reviews.llvm.org/D8387 llvm-svn: 235298
* [mips][microMIPSr6] Implement initial subtarget supportJozef Kolek2015-04-204-0/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D8386 llvm-svn: 235296
* [X86][FastIsel] Fix assertion failure when selecting int-to-double ↵Andrea Di Biagio2015-04-201-5/+6
| | | | | | | | | | | | | | | | | | | | | | conversion (PR23273). This fixes a regression introduced at revision 231243. The target-independent selection algorithm in FastISel knows how to select a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr. Method X86FastISel::X86SelectSIToFP was therefore working under the wrong assumption that the target was AVX. That assumption was incorrect since we can have a target that is neither AVX nor SSE. So, rather than asserting for the presence of AVX, we should have had an early exit from 'X86SelectSIToFP' if the target was not AVX. This patch fixes the issue replacing the invalid assertion with an early exit. Thanks to Dimitry Andric for reporting this problem and for providing a small reproducible testcase. Added test pr23273.ll. llvm-svn: 235295
* [X86][SSE] Fix for getScalarValueForVectorElement to detect scalar sources ↵Simon Pilgrim2015-04-191-2/+7
| | | | | | | | | | requiring truncation. The fix ensures that scalar sources inserted into a vector are the correct bit size. Integer scalar sources from BUILD_VECTOR and SCALAR_TO_VECTOR nodes may require truncation that this function doesn't currently support. llvm-svn: 235281
* Remove unnecessary include and probably a layering violation.Craig Topper2015-04-191-1/+0
| | | | llvm-svn: 235262
* [AArch64] Don't force MVT::Untyped when selecting LD1LANEpost.Ahmed Bougacha2015-04-171-1/+1
| | | | | | | | | | | | | | | | | | | The result is either an Untyped reg sequence, on ldN with N > 1, or just the type of the input vector, on ld1. Don't force Untyped. Instead, just use the type of the reg sequence. This mirrors the behavior of createTuple, which feeds the LD1*_POST. The narrow code path wasn't actually covered by tests, because V64 insert_vector_elt are widened to V128 before the LD1LANEpost combine has the chance to run, usually. The only case where it does run on V64 vectors is if the vector ops legalizer ran. So, tickle the code with a ctpop. Fixes PR23265. llvm-svn: 235243
* [AArch64] Avoid vector->load dependency cycles when creating LD1*post.Ahmed Bougacha2015-04-171-0/+7
| | | | | | | | They would break the SelectionDAG. Note that the opposite load->vector dependency is already obvious in: (LD1*post vec, ..) llvm-svn: 235224
OpenPOWER on IntegriCloud