summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/cgp-bitfield-extract.ll
Commit message (Collapse)AuthorAgeFilesLines
* [DAGCombiner] re-enable truncation of binopsSanjay Patel2018-12-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is effectively re-committing the changes from: rL347917 (D54640) rL348195 (D55126) ...which were effectively reverted here: rL348604 ...because the code had a bug that could induce infinite looping or eventual out-of-memory compilation. The bug was that this code did not guard against transforming opaque constants. More details are in the post-commit mailing list thread for r347917. A reduced test for that is included in the x86 bool-math.ll file. (I wasn't able to reduce a PPC backend test for this, but it was almost the same pattern.) Original commit message for r347917: The motivating case for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=32023 and the corresponding rot16.ll regression tests. Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR. As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch. llvm-svn: 348706
* [DAGCombiner] disable truncation of binops by defaultSanjay Patel2018-12-071-2/+2
| | | | | | | | | | As discussed in the post-commit thread of r347917, this transform is fighting with an existing transform causing an infinite loop or out-of-memory, so this is effectively reverting r347917 and its follow-up r348195 while we investigate the bug. llvm-svn: 348604
* [DAGCombiner] narrow truncated binopsSanjay Patel2018-11-291-2/+2
| | | | | | | | | | | | | | | | | | | The motivating case for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=32023 and the corresponding rot16.ll regression tests. Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR. As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch. Differential Revision: https://reviews.llvm.org/D54640 llvm-svn: 347917
* [AMDGPU] Add pattern for v_alignbit_b32 with immediateStanislav Mekhanoshin2017-06-281-5/+4
| | | | | | | | If immediate in shift is less than 32 we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 llvm-svn: 306500
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-211-6/+6
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* [AMDGPU] Remove getBidirectionalReasonRankStanislav Mekhanoshin2017-03-111-3/+4
| | | | | | | | | | | | | | This method inverts the Reason field of a scheduling candidate. It does right comparison between RegCritical and RegExcess, but everything else is broken. In fact it can prefer less strong reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 llvm-svn: 297536
* Enable FeatureFlatForGlobal on Volcanic IslandsMatt Arsenault2017-01-241-2/+2
| | | | | | | | | | | This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
* AMDGPU: Select branch on undef to uniform scc branchMatt Arsenault2016-12-151-6/+7
| | | | llvm-svn: 289877
* AMDGPU: Add VI i16 supportTom Stellard2016-11-101-2/+7
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464
* AMDGPU: Remove unnecessary and on conditional branchMatt Arsenault2016-11-071-2/+1
| | | | | | | The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134
* Revert "AMDGPU: Add VI i16 support"Tom Stellard2016-11-041-7/+2
| | | | | | This reverts commit r285939 and r285948. These broke some conformance tests. llvm-svn: 285995
* AMDGPU: Add VI i16 supportTom Stellard2016-11-031-2/+7
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939
* DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit componentMatt Arsenault2016-04-211-4/+2
| | | | | | | If the extracted bits are restricted to the upper half or lower half, this can be truncated. llvm-svn: 267024
* AMDGPU: Set HasExtractBitInsnMatt Arsenault2016-03-011-0/+303
This currently does not have the control over the bitwidth, and there are missing optimizations to reduce the integer to 32-bit if it can be. But in most situations we do want the sinking to occur. llvm-svn: 262296
OpenPOWER on IntegriCloud