summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU/SI: Do not insert EndCf in an unreachable blockChangpeng Fang2017-03-071-2/+3
| | | | | | | | | | Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D22025 llvm-svn: 297243
* Recommit: [globalisel] Change LLT constructor string into an LLT-based ↵Daniel Sanders2017-03-073-5/+6
| | | | | | | | | | | | | | | | | | | | object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297241
* [Hexagon] Check for presence before looking registers up in bit trackerKrzysztof Parzyszek2017-03-071-0/+4
| | | | llvm-svn: 297240
* [Hexagon] Generate bitsplit instructionKrzysztof Parzyszek2017-03-071-1/+118
| | | | llvm-svn: 297239
* [NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors.Artem Belevich2017-03-071-11/+31
| | | | | | Differential Revision: https://reviews.llvm.org/D30672 llvm-svn: 297198
* [AArch64] Vulcan is now ThunderXT99Joel Jones2017-03-075-193/+203
| | | | | | | | | | | | | | | | | Broadcom Vulcan is now Cavium ThunderX2T99. LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113 Minor fixes for the alignments of loops and functions for ThunderX T81/T83/T88 (better performance). Patch was tested with SpecCPU2006. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D30510 llvm-svn: 297190
* Revert r297177: Change LLT constructor string into an LLT-based object ...Daniel Sanders2017-03-073-6/+5
| | | | | | | | | | More module problems. This time it only showed up in the stage 2 compile of clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile. Somehow, this change causes the build to need Attributes.gen before it's been generated. llvm-svn: 297188
* [X86] Add option to specify preferable loop alignmentSanjoy Das2017-03-071-1/+9
| | | | | | | | | | | | | | | | | | | | | Summary: Loop alignment can cause a significant change of the perfromance for short loops. To be able to evaluate the impact of loop alignment this change introduces the new option x86-experimental-pref-loop-alignment. The alignment will be 2^Value bytes, the default value is 4. Patch by Serguei Katkov! Reviewers: craig.topper Reviewed By: craig.topper Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D30391 llvm-svn: 297178
* [globalisel] Change LLT constructor string into an LLT-based object that ↵Daniel Sanders2017-03-073-5/+6
| | | | | | | | | | | | | | | | | | knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297177
* [ARM] Correct handling of LSL #0 in an IT blockJohn Brawn2017-03-071-1/+1
| | | | | | | | | | | The check for LSL #0 in an IT block was checking if operand 4 was zero, but operand 4 is the condition code operand so it was actually checking for LSLEQ. Fix this by checking operand 3, which really is the immediate operand, and add some tests. Differential Revision: https://reviews.llvm.org/D30692 llvm-svn: 297142
* [Hexagon] Do not insert instructions before PHI nodesKrzysztof Parzyszek2017-03-071-1/+3
| | | | llvm-svn: 297141
* [ARM] Reapply r296865 "[ARM] fpscr read/write intrinsics not aware of each ↵Ranjeet Singh2017-03-071-3/+4
| | | | | | | | | | | | | | | | | | | | other"" The original patch r296865 was reverted as it broke the chromium builds for Android https://bugs.llvm.org/show_bug.cgi?id=32134, this patch reapplies r296865 with a fix to make sure it doesn't cause the build regression. The problem was that intrinsic selection on int_arm_get_fpscr was failing in ISel this was because the code to manually select this intrinsic still thought it was the version with no side-effects (INTRINSIC_WO_CHAIN) which is wrong as it doesn't semantically match the definition in the tablegen code which says it does have side-effects, I've fixed this by updating the intrinsic type to INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on Hans original reproducer. Differential Revision: https://reviews.llvm.org/D30645 llvm-svn: 297137
* [SystemZ] Add check VT.isSimple() in canTreateAsByteVector()Jonas Paulsson2017-03-071-1/+1
| | | | | | | | Since BB-vectorizer can produce vectors of for example 3 elements, this check is needed. Review: Ulrich Weigand llvm-svn: 297136
* In Thumb1, materialize a move between low registers as a `movs`, if CPSR ↵Artyom Skrobov2017-03-071-5/+11
| | | | | | | | | | | | | | | | isn't live. Summary: Previously, it had always been materialized as a push/pop sequence. Reviewers: labrinea, jroelofs Reviewed By: jroelofs Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30648 llvm-svn: 297134
* [X86][AVX512] Adding new LLVM TableGen backend which generates the EVEX2VEX ↵Ayman Musa2017-03-073-1193/+8
| | | | | | | | | | | | compressing tables. X86EvexToVex machine instruction pass compresses EVEX encoded instructions by replacing them with their identical VEX encoded instructions when possible. It uses manually supported 2 large tables that map the EVEX instructions to their VEX ideticals. This TableGen backend replaces the tables by automatically generating them. Differential Revision: https://reviews.llvm.org/D30451 llvm-svn: 297127
* [X86][AVX512] Add missing entries to EVEX2VEX tablesAyman Musa2017-03-071-28/+61
| | | | | | | | evex2vex pass defines 2 tables which maps EVEX instructions to their VEX identical when possible. Adding all missing entries. Differential Revision: https://reviews.llvm.org/D30501 llvm-svn: 297126
* Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce ↵Tim Shen2017-03-071-43/+5
| | | | | | | | | | | stack frame size" This reverts commit r296771. We found some wide spread test failures internally. I'm working on a testcase. Politely revert the patch in the mean time. :) llvm-svn: 297124
* Revert "AMDGPU: Set MCAsmInfo::PointerSize"Konstantin Zhuravlyov2017-03-071-1/+0
| | | | | | | | It breaks line tables because the patch is not complete, working on a complete one at the moment This reverts commit r294031. llvm-svn: 297118
* GlobalISel: restrict G_EXTRACT instruction to just one operand.Tim Northover2017-03-067-50/+56
| | | | | | | A bit more painful than G_INSERT because it was more widely used, but this should simplify the handling of extract operations in most locations. llvm-svn: 297100
* [Outliner] Fixed Asan bot failure in r296418Jessica Paquette2017-03-062-0/+99
| | | | | | | | Fixed the asan bot failure which led to the last commit of the outliner being reverted. The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector is no longer initialized using reserve but just a standard constructor. llvm-svn: 297081
* [AArch64][Redundant Copy Elim] Add support for CMN and shifted imm.Chad Rosier2017-03-061-6/+13
| | | | | | | | | | This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle CMN instructions as well as a shifted immediates. Differential Revision: https://reviews.llvm.org/D30576. llvm-svn: 297078
* AMDGPU/R600: Fix ALU clause markers use detectionJan Vesely2017-03-061-2/+5
| | | | | | | | also exit early on kill instead of redefinition. Differential Revision: https://reviews.llvm.org/D30230 llvm-svn: 297060
* [X86] Fix arg copy elision for illegal typesReid Kleckner2017-03-061-37/+33
| | | | | | | | | | | Use the store size of the argument type, which will be a byte-sized quantity, rather than dividing the size in bits by 8. Fixes PR32136 and re-enables copy elision from i64 arguments. Reverts the workaround in from r296950. llvm-svn: 297045
* [Hexagon] Early-if-convert branches that may exit the loopKrzysztof Parzyszek2017-03-061-63/+106
| | | | | | | | | | | | | | | | Merge the tail block into the loop in cases where the main loop body exits early, subject to profitability constraints. This will coalesce the loop body into fewer blocks. For example: loop: loop: // loop body // loop body if (...) jump exit --> // more body more: if (...) jump exit // more body jump loop jump loop llvm-svn: 297033
* [Hexagon] Mark dead defs as <dead> in expand-condsetsKrzysztof Parzyszek2017-03-061-12/+28
| | | | | | | | | The code in updateDeadFlags removed unnecessary <dead> flags, but there can be cases where such a flag is not set, and yet a register has become dead. For example, if a mux with identical inputs is replaced with a COPY, the predicate register may no longer be used after that. llvm-svn: 297032
* [Hexagon] Pick a dot-old instruction that matches the architectureKrzysztof Parzyszek2017-03-063-4/+25
| | | | llvm-svn: 297031
* [PowerPC] Fix failure with STBRX when store is narrower than the bswapNemanja Ivanovic2017-03-061-2/+5
| | | | | | | | | | | Fixes a crash caused by r296811 by truncating the input of the STBRX node when the bswap is wider than i32. Fixes https://bugs.llvm.org/show_bug.cgi?id=32140 Differential Revision: https://reviews.llvm.org/D30615 llvm-svn: 297001
* [X86] Silence GCC enum compare warning.Benjamin Kramer2017-03-051-2/+2
| | | | | | | | X86ISelLowering.cpp:26506:36: error: enumeral mismatch in conditional expression: 'llvm::X86ISD::NodeType' vs 'llvm::ISD::NodeType' [-Werror=enum-compare] llvm-svn: 296986
* [X86][SSE] Lower 128-bit vectors to SIGN/ZERO_EXTEND_VECTOR_IN_REG opsSimon Pilgrim2017-03-053-93/+120
| | | | | | | | | | | | As described on PR31712, we miss a variety of legalization combines because we lower these to X86ISD::VSEXT/VZEXT despite them having the same functionality. This patch makes 128-bit (SSE41) SIGN/ZERO_EXTEND_VECTOR_IN_REG ops legal, adds the necessary tablegen plumbing and uses a helper 'getExtendInVec' to decide when to use SIGN/ZERO_EXTEND_VECTOR_IN_REG or VSEXT/VZEXT. We're missing a couple of shuffle combines that will be added in a future patch for review. Later patches can then support the AVX2 cases as a mixture of SIGN/ZERO_EXTEND and SIGN/ZERO_EXTEND_VECTOR_IN_REG, and then finally deal with the AVX512 cases. Differential Revision: https://reviews.llvm.org/D30549 llvm-svn: 296985
* [x86] don't require a zext when forming ADC/SBBSanjay Patel2017-03-041-24/+29
| | | | | | | | | | | | | The larger goal is to move the ADC/SBB transforms currently in combineX86SetCC() to combineAddOrSubToADCOrSBB() because we're creating ADC/SBB in lots of places where we shouldn't. This was intended to be an NFC change, but avx-512 has something strange going on. It doesn't seem like any of the affected tests should really be using SET+TEST or ADC; a simple ADD could replace several instructions. But that's another bug... llvm-svn: 296978
* [DAGCombiner] allow transforming (select Cond, C +/- 1, C) to (add(ext Cond), C)Sanjay Patel2017-03-042-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook. This is part of the ongoing process to obsolete D24480. The motivation is to canonicalize to select IR in InstCombine whenever possible, so we need to have a way to undo that easily in codegen. PowerPC is an obvious winner for this kind of transform because it has fast and complete bit-twiddling abilities but generally lousy conditional execution perf (although this might have changed in recent implementations). x86 also sees some wins, but the effect is limited because these transforms already mostly exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86 changes just shows that that code is a mess of special-case holes. We may be able to remove some of that logic now. My guess is that other targets will want to enable this hook for most cases. The likely follow-ups would be to add value type and/or the constants themselves as parameters for the hook. As the tests in select_const.ll show, we can transform any select-of-constants to math/logic, but the general transform for any 2 constants needs one more instruction (multiply or 'and'). ARM is one target that I think may not want this for most cases. I see infinite loops there because it wants to use selects to enable conditionally executed instructions. Differential Revision: https://reviews.llvm.org/D30537 llvm-svn: 296977
* [X86][SSE] Enable post-legalize vXi64 shuffle combining on 32-bit targetsSimon Pilgrim2017-03-041-5/+0
| | | | | | | | Long ago (2010 according to svn blame), combineShuffle probably needed to prevent the accidental creation of illegal i64 types but there doesn't appear to be any combines that can cause this any more as they all have their own legality checks. Differential Revision: https://reviews.llvm.org/D30213 llvm-svn: 296966
* X86ISelLowering: Only perform copy elision on legal types.Matthias Braun2017-03-041-33/+37
| | | | | | | | | This fixes cases where i1 types were not properly legalized yet and lead to the creating of 0-sized stack slots. This fixes http://llvm.org/PR32136 llvm-svn: 296950
* [x86] check for commuted add pattern to find ADC/SBBSanjay Patel2017-03-041-4/+11
| | | | llvm-svn: 296933
* GlobalISel: constrain G_INSERT to inserting just one value per instruction.Tim Northover2017-03-031-3/+2
| | | | | | | It's much easier to reason about single-value inserts and no-one was actually using the variadic variants before. llvm-svn: 296923
* [x86] refactor combineAddOrSubToADCOrSBB(); NFCISanjay Patel2017-03-031-21/+25
| | | | | | | | | | | | The comments were wrong, and this is not an obvious transform. This hopefully makes it clearer that we're missing the commuted patterns for adds. It's less clear that this is actually a good transform for all micro-arch. This is prep work for trying to clean up the current adc/sbb codegen because it's definitely not happening optimally. llvm-svn: 296918
* Make TargetInstrInfo::isPredicable take a const reference, NFCKrzysztof Parzyszek2017-03-0312-16/+16
| | | | llvm-svn: 296901
* [x86] clean up materializeSBB(); NFCISanjay Patel2017-03-031-20/+14
| | | | | | This is producing SBB where it is obviously not necessary, so it needs to be limited. llvm-svn: 296894
* [x86] fix formatting; NFCSanjay Patel2017-03-031-3/+2
| | | | llvm-svn: 296875
* Use APInt::getHighBitsSet instead of APInt::getBitsSet for upper bit mask ↵Simon Pilgrim2017-03-031-1/+1
| | | | | | creation llvm-svn: 296874
* [AMDGPU][MC] Fix for Bug 30829 + LIT testsDmitry Preobrazhensky2017-03-037-0/+163
| | | | | | | | Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction). Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code). Added LIT tests. llvm-svn: 296873
* [SDAG] Revert r296476 (and r296486, r296668, r296690).Chandler Carruth2017-03-032-6/+1
| | | | | | | | | | This patch causes compile times for some patterns to explode. I have a (large, unreduced) test case that slows down by more than 20x and several test cases slow down by 2x. I'm sending some of the test cases directly to Nirav and following up with more details in the review log, but this should unblock anyone else hitting this. llvm-svn: 296862
* [X86] Generate VZEROUPPER for Skylake-avx512.Amjad Aboud2017-03-034-65/+77
| | | | | | | | VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be. Differential Revision: https://reviews.llvm.org/D29874 llvm-svn: 296859
* [AArch64AsmParser] rewrite of function parseSysAliasSjoerd Meijer2017-03-032-212/+62
| | | | | | | | | | | | This is a cleanup/rewrite of the parseSysAlias function. It was not using the tablegen instruction descriptions, but was “manually” matching the mnemonics and recreating the operands whereas all this information is already in tablegen; all this code has been replaced with calls to lookupXYZByName tablegen calls. Differential Revision: https://reviews.llvm.org/D30491 llvm-svn: 296857
* [GlobalISel][X86] Support float/double and vector types.Igor Breger2017-03-037-32/+306
| | | | | | | | | | | | | | Summary: [GlobalISel][X86] Add support for f32/f64 and vector types in RegisterBank and InstructionSelector. Reviewers: delena, zvi Reviewed By: zvi Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30533 llvm-svn: 296856
* AMDGPU: Fix missing dominator tree dependencyMatt Arsenault2017-03-021-0/+1
| | | | llvm-svn: 296842
* [Hexagon] Pick the right branch opcode depending on branch probabilitiesKrzysztof Parzyszek2017-03-021-15/+69
| | | | | | | Specifically, pick the opcode with the correct branch prediction, i.e. jump:t or jump:nt. llvm-svn: 296821
* [ARM] Fix insert point for store rescheduling.Eli Friedman2017-03-021-12/+19
| | | | | | | | | | | | | | | | In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would move stores to the wrong insert point. Re-commit with a fix to increment NumMove in the right place. Differential Revision: https://reviews.llvm.org/D30124 llvm-svn: 296815
* [PPC] Fix code generation for bswap(int32) followed by store16Guozhi Wei2017-03-021-2/+10
| | | | | | | | | | | | | | | | | | | This patch fixes pr32063. Current code in PPCTargetLowering::PerformDAGCombine can transform bswap store into a single PPCISD::STBRX instruction. but it doesn't consider the case that the operand size of bswap may be larger than store size. When it occurs, we need 2 modifications, 1 For the last operand of PPCISD::STBRX, we should not use DAG.getValueType(N->getOperand(1).getValueType()), instead we should use cast<StoreSDNode>(N)->getMemoryVT(). 2 Before PPCISD::STBRX, we need to shift the original operand of bswap to the right side. Differential Revision: https://reviews.llvm.org/D30362 llvm-svn: 296811
* [AArch64] Extend redundant copy elimination pass to handle non-zero stores.Chad Rosier2017-03-021-76/+212
| | | | | | | | | | | | | | | This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle non-zero cases such as: BB#0: cmp x0, #1 b.eq .LBB0_1 .LBB0_1: orr x0, xzr, #0x1 ; <-- redundant copy; x0 known to hold #1. Differential Revision: https://reviews.llvm.org/D29344 llvm-svn: 296809
OpenPOWER on IntegriCloud