summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [SystemZ] Simplify handling of 128-bit multiply/divide instructionUlrich Weigand2017-07-051-55/+36
| | | | | | | | | | | Several integer multiply/divide instructions require use of a register pair as input and output. This patch moves setting up the input register pair from C++ code to TableGen, simplifying the whole process and making it more easily extensible. No functional change. llvm-svn: 307155
* [SystemZ] Add a check against zero before calling getTestUnderMaskCond()Jonas Paulsson2017-06-261-0/+2
| | | | | | | | | | | | | | Csmith discovered that this function can be called with a zero argument, in which case an assert for this triggered. This patch also adds a guard before the other call to this function since it was missing, although the test only covers the case where it was discovered. Reduced test case attached as CodeGen/SystemZ/int-cmp-54.ll. Review: Ulrich Weigand llvm-svn: 306287
* [SystemZ] Remove unnecessary serialization before volatile loadsUlrich Weigand2017-06-231-8/+5
| | | | | | | | | | | | | | | | | This reverts the use of TargetLowering::prepareVolatileOrAtomicLoad introduced by r196905. Nothing in the semantics of the "volatile" keyword or the definition of the z/Architecture actually requires that volatile loads are preceded by a serialization operation, and no other compiler on the platform actually implements this. Since we've now seen a use case where this additional serialization causes noticable performance degradation, this patch removes it. The patch still leaves in the serialization before atomic loads, which is now implemented directly in lowerATOMIC_LOAD. (This also seems overkill, but that can be addressed separately.) llvm-svn: 306117
* [SystemZ] Propagate MachineMemOperandsJonas Paulsson2017-06-071-6/+19
| | | | | | | In emitCondStore() and emitMemMemWrapper(). Review: Ulrich Weigand llvm-svn: 304913
* [SystemZ] Improve buildVector() in SystemZISelLowering.cpp.Jonas Paulsson2017-05-291-19/+41
| | | | | | | | Use VLREP when inserting one or more loads into a vector. This is more efficient than to first load and then use a VLVGP. Review: Ulrich Weigand llvm-svn: 304152
* [SystemZ] Implement getRepRegClassFor()Jonas Paulsson2017-05-101-0/+9
| | | | | | | | | | | | This method must return a valid register class, or the list-ilp isel scheduler will crash. For MVT::Untyped nullptr was previously returned, but now ADDR128BitRegClass is returned instead. This is needed just as long as list-ilp (and probably also list-hybrid) is still there. Review: Ulrich Weigand, A Trick https://reviews.llvm.org/D32802 llvm-svn: 302649
* Add extra operand to CALLSEQ_START to keep frame part set up previouslySerge Pavlov2017-05-091-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
* [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and ↵Craig Topper2017-04-281-9/+10
| | | | | | | | | | | | simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620
* DAG: Make mayBeEmittedAsTailCall parameter constMatt Arsenault2017-04-181-1/+1
| | | | llvm-svn: 300603
* [SystemZ] TargetTransformInfo cost functions implemented.Jonas Paulsson2017-04-121-0/+4
| | | | | | | | | | | | | | | | getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
* [SystemZ] Check for presence of vector support in SystemZISelLoweringJonas Paulsson2017-04-071-2/+5
| | | | | | | | | | | | | | A test case was found with llvm-stress that caused DAGCombiner to crash when compiling for an older subtarget without vector support. SystemZTargetLowering::combineTruncateExtract() should do nothing for older subtargets. This check was placed in canTreatAsByteVector(), which also helps in a few other places. Review: Ulrich Weigand llvm-svn: 299763
* [SystemZ] Remove confusing comment in combineEXTRACT_VECTOR_ELT()Jonas Paulsson2017-04-071-2/+0
| | | | | | It isn't just one-element vectors that can appear here. llvm-svn: 299762
* [SystemZ] Prevent Merging Bitcast with non-normal loadsNirav Dave2017-04-051-2/+3
| | | | | | | | | | | | Fixes PR32505. Reviewers: uweigand, jonpa Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31609 llvm-svn: 299552
* [SystemZ] Skip DAGCombining of vector node for older subtargets.Jonas Paulsson2017-03-311-0/+6
| | | | | | | | | | | | | Even on older subtargets that lack vector support, there may be vector values with just one element in the input program. These are converted during DAG legalization to scalar values. The pre-legalize SystemZ DAGCombiner methods should in this circumstance not touch these nodes. This patch adds a check for this in SystemZTargetLowering::combineEXTRACT_VECTOR_ELT(). Review: Ulrich Weigand llvm-svn: 299213
* [SystemZ] Add check VT.isSimple() in canTreateAsByteVector()Jonas Paulsson2017-03-071-1/+1
| | | | | | | | Since BB-vectorizer can produce vectors of for example 3 elements, this check is needed. Review: Ulrich Weigand llvm-svn: 297136
* [SystemZ] Add comment for ISD::FP_TO_UINT expansion.Jonas Paulsson2017-02-021-0/+3
| | | | | | | (Copied from the fp-conv-10.ll test to SystemZISelLowering.cpp) Review: Ulrich Weigand llvm-svn: 293900
* [SystemZ] Gracefully fail in GeneralShuffle::add() instead of assertion.Jonas Paulsson2017-01-241-12/+22
| | | | | | | | | | | | | | The GeneralShuffle::add() method used to have an assert that made sure that source elements were at least as big as the destination elements. This was wrong, since it is actually expected that an EXTRACT_VECTOR_ELT node with a smaller source element type than the return type gets extended. Therefore, instead of asserting this, it is just checked and if this is the case 'false' is returned from the GeneralShuffle::add() method. This case should be very rare and is not handled further by the backend. Review: Ulrich Weigand. llvm-svn: 292888
* [CodeGen] Rename MachineInstrBuilder::addOperand. NFCDiana Picus2017-01-131-19/+37
| | | | | | | | | | | Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
* [SystemZ] Improve isFoldableMemAccessOffset().Jonas Paulsson2017-01-111-2/+20
| | | | | | | | | | | A store of an extracted element or a load which gets inserted into a vector, will be combined into a vector load/store element instruction. Therefore, isFoldableMemAccessOffset(), which is called by LSR, should return false in these cases. Reviewer: Ulrich Weigand llvm-svn: 291673
* [SystemZ] Improve use of conditional instructionsUlrich Weigand2016-11-281-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves formation of LOC-type instructions from (late) IfConversion to the early if-conversion pass, and in some cases additionally creates them directly from select instructions during DAG instruction selection. To make early if-conversion work, the patch implements the canInsertSelect / insertSelect callbacks. It also implements the commuteInstructionImpl and FoldImmediate callbacks to enable generation of the full range of LOC instructions. Finally, the patch adds support for all instructions of the load-store-on-condition-2 facility, which allows using LOC instructions also for high registers. Due to the use of the GRX32 register class to enable high registers, we now also have to handle the cases where there are still no single hardware instructions (conditional move from a low register to a high register or vice versa). These are converted back to a branch sequence after register allocation. Since the expandRAPseudos callback is not allowed to create new basic blocks, this requires a simple new pass, modelled after the ARM/AArch64 ExpandPseudos pass. Overall, this patch causes significantly more LOC-type instructions to be used, and results in a measurable performance improvement. llvm-svn: 288028
* [SystemZ] Model access registers as LLVM registersUlrich Weigand2016-11-081-5/+3
| | | | | | | | | | | | | Add the 16 access registers as LLVM registers. This allows removing a lot of special cases in the assembler and disassembler where we were handling access registers; this can all just use the generic register code now. Also add a bunch of instructions to operate on access registers, for assembler/disassembler use only. No change in code generation intended. llvm-svn: 286283
* [MachineMemOperand] Move synchronization scope and atomic orderings from ↵Konstantin Zhuravlyov2016-10-151-2/+1
| | | | | | | | SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG. Differential Revision: https://reviews.llvm.org/D24577 llvm-svn: 284312
* getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCISanjay Patel2016-09-141-6/+6
| | | | llvm-svn: 281495
* getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCISanjay Patel2016-09-141-7/+5
| | | | llvm-svn: 281493
* Fix SystemZ hang caused by r279105Elliot Colp2016-08-231-29/+24
| | | | | | | | | The change in r279105 causes an infinite loop in some cases, as it sets the upper bits of an AND mask constant, which DAGCombiner::SimplifyDemandedBits then unsets. This patch reverts that part of the behaviour, instead relying on .td peepholes to perform the transformation to NILL. I reapplied my original fix for the problem addressed by r279105 (unsetting the upper bits, which prevents a compiler abort for a different reason). Differential Revision: https://reviews.llvm.org/D23781 llvm-svn: 279515
* [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> froundMichael Kuperstein2016-08-181-2/+2
| | | | | | | | | | The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129
* Fix SystemZ compilation abort caused by negative AND maskElliot Colp2016-08-181-3/+34
| | | | | | | | | | Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort. This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 279105
* [LoopStrenghtReduce] Refactoring and addition of a new target cost function.Jonas Paulsson2016-08-171-0/+23
| | | | | | | | | | | | | | | | | | | | | | | Refactored so that a LSRUse owns its fixups, as oppsed to letting the LSRInstance own them. This makes it easier to rate formulas for LSRUses, since the fixups are available directly. The Offsets vector has been removed since it was no longer necessary. New target hook isFoldableMemAccessOffset(), which is used during formula rating. For SystemZ, this is useful to express that loads and stores with float or vector types with a big/negative offset should be avoided in loops. Without this, LSR will generate a lot of negative offsets that would require extra instructions for loading the address. Updated tests: test/CodeGen/SystemZ/loop-01.ll Reviewed by: Quentin Colombet and Ulrich Weigand. https://reviews.llvm.org/D19152 llvm-svn: 278927
* MachineFunction: Return reference for getFrameInfo(); NFCMatthias Braun2016-07-281-11/+11
| | | | | | | getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
* [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, ↵Justin Lebar2016-07-151-47/+31
| | | | | | | | | | | | | | | | | | | | | | | getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592
* [SystemZ] Utilize Test Data Class instructions.Marcin Koscielnicki2016-07-101-0/+5
| | | | | | | | | | | This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016
* [SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotateElliot Colp2016-07-061-1/+54
| | | | | | | | | | On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount. Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we can remove the AND operation entirely. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 274650
* [SystemZ] Move misplaced SystemZ::TDC to non-memory opcode range.Marcin Koscielnicki2016-07-021-1/+1
| | | | llvm-svn: 274417
* CodeGen: Use MachineInstr& in TargetLowering, NFCDuncan P. N. Exon Smith2016-06-301-125/+117
| | | | | | | | | | | | | This is a mechanical change to make TargetLowering API take MachineInstr& (instead of MachineInstr*), since the argument is expected to be a valid MachineInstr. In one case, changed a parameter from MachineInstr* to MachineBasicBlock::iterator, since it was used as an insertion point. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. llvm-svn: 274287
* [SystemZ] Split up PerformDAGCombine. [NFC]Marcin Koscielnicki2016-06-301-142/+176
| | | | | | This function is already a bit too long, and I'm about to make it worse. llvm-svn: 274191
* [SystemZ] Add floating-point test data class instructions.Marcin Koscielnicki2016-06-291-0/+1
| | | | | | | These are not used by CodeGen yet - ISD combiners creating the new node will come in subsequent patches. llvm-svn: 274108
* Move shouldAssumeDSOLocal to Target.Rafael Espindola2016-06-271-2/+1
| | | | | | Should fix the shared library build. llvm-svn: 273958
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-45/+48
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* [SystemZ] Support Compare and TrapsZhan Jun Liau2016-06-101-0/+3
| | | | | | | | | | | | Support and generate Compare and Traps like CRT, CIT, etc. Support Trap as legal DAG opcodes and generate "j .+2" for them by default. Add support for Conditional Traps and use the If Converter to convert them into the corresponding compare and trap opcodes. Differential Revision: http://reviews.llvm.org/D21155 llvm-svn: 272419
* [SystemZ] Support LRVH and STRVH opcodesBryan Chan2016-05-161-0/+71
| | | | | | | | | | | | Summary: On Linux, /usr/include/bits/byteswap-16.h defines __byteswap_16(x) as an inlined LRVH (Load Reversed Half-word) instruction. The SystemZ back-end did not support this opcode and the inlined assembly would cause a fatal error. Reviewers: bryanpkc, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18732 llvm-svn: 269688
* [SystemZ] Implement backchain attribute (recommit with fix).Marcin Koscielnicki2016-05-051-4/+33
| | | | | | | | | | | | | This introduces a SystemZ-specific "backchain" attribute on function, which enables writing the frame backchain link as specified by the ABI. This will be used to implement -mbackchain option in clang. Differential Revision: http://reviews.llvm.org/D19889 Fixed in this version: added RegState::Define and RegState::Kill on R1D in prologue. llvm-svn: 268581
* Revert "[SystemZ] Implement backchain attribute."Marcin Koscielnicki2016-05-041-33/+4
| | | | | | | | This reverts commit rL268571. It caused failures in register scavenger. llvm-svn: 268576
* [SystemZ] Implement llvm.get.dynamic.area.offsetMarcin Koscielnicki2016-05-041-0/+10
| | | | | | | | To be used for AddressSanitizer. Differential Revision: http://reviews.llvm.org/D19817 llvm-svn: 268572
* [SystemZ] Implement backchain attribute.Marcin Koscielnicki2016-05-041-4/+33
| | | | | | | | | | This introduces a SystemZ-specific "backchain" attribute on function, which enables writing the frame backchain link as specified by the ABI. This will be used to implement -mbackchain option in clang. Differential Revision: http://reviews.llvm.org/D19889 llvm-svn: 268571
* [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in ↵Craig Topper2016-04-281-4/+0
| | | | | | TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853
* [SystemZ] Support Swift Calling ConventionBryan Chan2016-04-281-3/+7
| | | | | | | | | | | | | | | | Summary: Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see: RFC: Implementing the Swift calling convention in LLVM and Clang https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0 Reviewers: kbarton, manmanren, rjmccall, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19414 llvm-svn: 267823
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.Ahmed Bougacha2016-04-261-6/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606
* [SystemZ] Mark CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF as Expand instead of Custom ↵Craig Topper2016-04-221-8/+2
| | | | | | since the custom logic just did what Expand does when CTTZ/CTLZ are Legal. NFC llvm-svn: 267109
* [SystemZ] Add support for llvm.thread.pointer intrinsic.Marcin Koscielnicki2016-04-201-8/+18
| | | | | | Differential Revision: http://reviews.llvm.org/D19054 llvm-svn: 266844
* NFC: make AtomicOrdering an enum classJF Bastien2016-04-061-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602
OpenPOWER on IntegriCloud