summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/SystemZ/vector-constrained-fp-intrinsics.ll
Commit message (Collapse)AuthorAgeFilesLines
* [FPEnv] Fix chain handling for fpexcept.strict nodesUlrich Weigand2020-01-131-274/+226
| | | | | | | | | | | | | | | | | We need to ensure that fpexcept.strict nodes are not optimized away even if the result is unused. To do that, we need to chain them into the block's terminator nodes, like already done for PendingExcepts. This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without and with fpexcept.strict markers. This allows not only to solve the above problem, but also to relax chains a bit further by no longer flushing all FP nodes before a store or other memory access. (They are still flushed before nodes with other side effects.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72341
* [SystemZ][FPEnv] Enable strict vector FP extends/truncationsUlrich Weigand2019-12-201-12/+9
| | | | | | | | | | | | | | The back-end currently has special DAGCombine code to detect cases where two floating-point extend or truncate operations can be combined into a single vector operation. This patch extends that support to also handle strict FP operations. Note that currently only the case where both operations have the same input chain are supported. This already suffices to cover the common case where the operations result from scalarizing a non-legal vector type. More general cases can be supported in the future.
* [FPEnv] Remove unnecessary rounding mode argument for constrained intrinsicsUlrich Weigand2019-12-171-56/+30
| | | | | | | | | | | | | | | | | | | The following intrinsics currently carry a rounding mode metadata argument: llvm.experimental.constrained.minnum llvm.experimental.constrained.maxnum llvm.experimental.constrained.ceil llvm.experimental.constrained.floor llvm.experimental.constrained.round llvm.experimental.constrained.trunc This is not useful since the semantics of those intrinsics do not in any way depend on the rounding mode. In similar cases, other constrained intrinsics do not have the rounding mode argument. Remove it here as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71218
* [FPEnv][SelectionDAG] Relax chain requirementsUlrich Weigand2019-12-061-383/+423
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the following changes: 1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats each constrained intrinsic like a global barrier (e.g. a function call) and fully serializes all pending chains. This is actually not required; it is allowed for constrained intrinsics to be reordered w.r.t one another or (nonvolatile) memory accesses. The MI-level scheduler already allows for that flexibility, so it makes sense to allow it at the DAG level as well. This patch therefore changes the way chains for constrained intrisincs are created, and handles them basically like load operations are handled. This has the effect that constrained intrinsics are no longer serialized against one another or (nonvolatile) loads. They are still serialized against stores, but that seems hard to change with the current DAG chain setup, and it also doesn't seem to be a big problem preventing DAG 2) The OPC_CheckFoldableChainNode check requires that each of the intermediate nodes in a multi-node pattern match only has a single use. This check tends to fail if those intermediate nodes are strict operations as those have a chain output that typically indeed has another use. However, we don't really need to consider chains here at all, since they will all be rewritten anyway by UpdateChains later. Other parts of the matcher therefore already ignore chains, but this hasOneUse check doesn't. This patch replaces hasOneUse by a custom test that verifies there is no more than one use of any non-chain output value. In theory, this change could affect code unrelated to strict FP nodes, but at least on SystemZ I could not find any single instance of that happening 3) The SystemZ back-end currently does not allow matching multiply-and- extend operations (32x32 -> 64bit or 64x64 -> 128bit FP multiply) for strict FP operations. This was not possible in the past due to the problems described under 1) and 2) above. With those issues fixed, it is now possible to fully support those instructions in strict mode as well, and this patch does so. Differential Revision: https://reviews.llvm.org/D70913
* [LegalizeDAG] Convert strict fp nodes to libcalls without losing the chain.Craig Topper2019-11-181-165/+116
| | | | | | | | | | Previously we mutated the node and then converted it to a libcall. But this loses the chain information. This patch keeps the chain, but unfortunately breaks tail call optimization as the functions involved in deciding if a node is in tail call position can't handle the chain. But correct ordering seems more important to be right. Somehow the SystemZ tests improved. I looked at one of them and it seemed that we're handling the split vector elements in a different order and that made the copies work better. Differential Revision: https://reviews.llvm.org/D70334
* [FPEnv] Strict FP tests should use the requisite function attributes.Kevin P. Neal2019-10-041-238/+240
| | | | | | | | | | | | | | | A set of function attributes is required in any function that uses constrained floating point intrinsics. None of our tests use these attributes. This patch fixes this. These tests have been tested against the IR verifier changes in D68233. Reviewed by: andrew.w.kaylor, cameron.mcinally, uweigand Approved by: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D67925 llvm-svn: 373761
* [Strict FP] Allow more relaxed schedulingUlrich Weigand2019-07-161-15/+15
| | | | | | | | | | | | | | Reimplement scheduling constraints for strict FP instructions in ScheduleDAGInstrs::buildSchedGraph to allow for more relaxed scheduling. Specifially, allow one strict FP instruction to be scheduled across another, as long as it is not moved across any global barrier. Differential Revision: https://reviews.llvm.org/D64412 Reviewed By: cameron.mcinally llvm-svn: 366222
* Allow matching extend-from-memory with strict FP nodesUlrich Weigand2019-06-261-33/+19
| | | | | | | | | | | | | | | | | | | | | This implements a small enhancement to https://reviews.llvm.org/D55506 Specifically, while we were able to match strict FP nodes for floating-point extend operations with a register as source, this did not work for operations with memory as source. That is because from regular operations, this is represented as a combined "extload" node (which is a variant of a load SD node); but there is no equivalent using a strict FP operation. However, it turns out that even in the absence of an extload node, we can still just match the operations explicitly, e.g. (strict_fpextend (f32 (load node:$ptr)) This patch implements that method to match the LDEB/LXEB/LXDB SystemZ instructions even when the extend uses a strict-FP node. llvm-svn: 364450
* [SystemZ] Support vector load/store alignment hintsUlrich Weigand2019-06-191-296/+296
| | | | | | | | | | | | | Vector load/store instructions support an optional alignment field that the compiler can use to provide known alignment info to the hardware. If the field is used (and the information is correct), the hardware may be able (on some models) to perform faster memory accesses than otherwise. This patch adds support for alignment hints in the assembler and disassembler, and fills in known alignment during codegen. llvm-svn: 363806
* Allow target to handle STRICT floating-point nodesUlrich Weigand2019-06-051-162/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is *both* marked as mayRaiseFPException *and* FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663
* Revert revert of r362112 with minor SystemZ test file corrections.Kevin P. Neal2019-05-311-0/+6449
| | | | | | | | | | | | | [FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully. Submitted by: Drew Wock <drew.wock@sas.com> Reviewed by: Cameron McInally, Kevin P. Neal Approved by: Cameron McInally Differential Revision: https://reviews.llvm.org/D62546 llvm-svn: 362241
* Revert r362112, it broke the bots with the message "Unsupported vector ↵Kevin P. Neal2019-05-301-6445/+0
| | | | | | | | argument or return type" Differential Revision: http://reviews.llvm.org/D62546 llvm-svn: 362117
* [FPEnv] Added a special UnrollVectorOp method to deal with the chain on ↵Kevin P. Neal2019-05-301-0/+6445
StrictFP opcodes This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully. Submitted by: Drew Wock <drew.wock@sas.com> Reviewed by: Cameron McInally, Kevin P. Neal Approved by: Cameron McInally Differential Revision: http://reviews.llvm.org/D62546 llvm-svn: 362112
OpenPOWER on IntegriCloud