summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/SystemZ
Commit message (Collapse)AuthorAgeFilesLines
...
* [SystemZ] Add remaining branch instructionsUlrich Weigand2016-11-281-0/+38
| | | | | | | | | | | | | | | | | This patch adds assembler support for the remaining branch instructions: the non-relative branch on count variants, and all variants of branch on index. The only one of those that can be readily exploited for code generation is BRCTH (branch on count using a high 32-bit register as count). Do use it, however, it is necessary to also introduce a hew CHIMux pseudo to allow comparisons of a 32-bit value agains a short immediate to go into a high register as well (implemented via CHI/CIH). This causes a bit of codegen changes overall, but those have proven to be neutral (or even beneficial) in performance measurements. llvm-svn: 288029
* [SystemZ] Improve use of conditional instructionsUlrich Weigand2016-11-288-24/+738
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves formation of LOC-type instructions from (late) IfConversion to the early if-conversion pass, and in some cases additionally creates them directly from select instructions during DAG instruction selection. To make early if-conversion work, the patch implements the canInsertSelect / insertSelect callbacks. It also implements the commuteInstructionImpl and FoldImmediate callbacks to enable generation of the full range of LOC instructions. Finally, the patch adds support for all instructions of the load-store-on-condition-2 facility, which allows using LOC instructions also for high registers. Due to the use of the GRX32 register class to enable high registers, we now also have to handle the cases where there are still no single hardware instructions (conditional move from a low register to a high register or vice versa). These are converted back to a branch sequence after register allocation. Since the expandRAPseudos callback is not allowed to create new basic blocks, this requires a simple new pass, modelled after the ARM/AArch64 ExpandPseudos pass. Overall, this patch causes significantly more LOC-type instructions to be used, and results in a measurable performance improvement. llvm-svn: 288028
* [SystemZ] Support CL(G)T instructionsUlrich Weigand2016-11-111-0/+90
| | | | | | | | This adds support for the compare logical and trap (memory) instructions that were added as part of the miscellaneous instruction extensions feature with zEC12. llvm-svn: 286587
* [SystemZ] Support load-and-zero-rightmost-byte facilityUlrich Weigand2016-11-111-0/+278
| | | | | | | | | | This adds support for the LZRF/LZRG/LLZRGF instructions that were added on z13, and uses them for code generation were appropriate. SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF over RISBG where both would be possible. llvm-svn: 286586
* [SystemZ] Use LLGT(R) instructionsUlrich Weigand2016-11-111-0/+133
| | | | | | | | | | | | | This adds support for the 31-to-64-bit zero extension instructions LLGT and LLGTR and uses them for code generation where appropriate. Since this operation can also be performed via RISBG, we have to update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT over RISBG in case both are possible. The patch includes some simplification to the tryRISBGZero code; this is not intended to cause any (further) functional change in codegen. llvm-svn: 286585
* ScheduleDAGInstrs: Add condjump deps to addSchedBarrierDeps()Matthias Braun2016-11-111-1/+1
| | | | | | | | | | | | | | | addSchedBarrierDeps() is supposed to add use operands to the ExitSU node. The current implementation adds uses for calls/barrier instruction and the MBB live-outs in all other cases. The use operands of conditional jump instructions were missed. Also added code to macrofusion to set the latencies between nodes to zero to avoid problems with the fusing nodes lingering around in the pending list now. Differential Revision: https://reviews.llvm.org/D25140 llvm-svn: 286544
* [SystemZ] Do not use LOC(G) for volatile loadsUlrich Weigand2016-10-252-0/+28
| | | | | | | | | | | | | | It is not safe to use LOAD ON CONDITION to implement access to a memory location marked "volatile", since the architecture leaves it unspecified whether or not an access happens if the condition is false. The current code already appears to care about that: def LOC : CondUnaryRSY<"loc", 0xEBF2, nonvolatile_load, GR32, 4>; Unfortunately, that "nonvolatile_load" operator is simply ignored by the CondUnaryRSY class, and there was no test to catch it. llvm-svn: 285077
* [SystemZ] Post-RA scheduler implementationJonas Paulsson2016-10-202-19/+19
| | | | | | | | | | | | | | | | Post-RA sched strategy and scheduling instruction annotations for z196, zEC12 and z13. This scheduler optimizes decoder grouping and balances processor resources (including side steering the FPd unit instructions). The SystemZHazardRecognizer keeps track of the scheduling state, which can be dumped with -debug-only=misched. Reviers: Ulrich Weigand, Andrew Trick. https://reviews.llvm.org/D17260 llvm-svn: 284704
* [DAG] optimize negation of boolSanjay Patel2016-10-193-11/+11
| | | | | | | | | | | | | | | | Use mask and negate for legalization of i1 source type with SIGN_EXTEND_INREG. With the mask, this should be no worse than 2 shifts. The mask can be eliminated in some cases, so that should be better than 2 shifts. This change exposed some missing folds related to negation: https://reviews.llvm.org/rL284239 https://reviews.llvm.org/rL284395 There may be others, so please let me know if you see any regressions. Differential Revision: https://reviews.llvm.org/D25485 llvm-svn: 284611
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-10-131-1/+4
| | | | | | | | | UseAA is enabled." This reverts commit r284151 which appears to be triggering a LTO failures on Hexagon llvm-svn: 284157
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-10-131-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151
* [DAGCombiner] Do not remove the load of stored values when optimizations are ↵Konstantin Zhuravlyov2016-10-121-2/+1
| | | | | | | | | | | | | | | | | | | | disabled This combiner breaks debug experience and should not be run when optimizations are disabled. For example: int main() { int j = 0; j += 2; if (j == 2) return 0; return 5; } When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function. Differential Revision: https://reviews.llvm.org/D19268 llvm-svn: 284014
* swifterror: Don't compute swifterror vregs during instruction selectionArnold Schwaighofer2016-10-071-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | The code used llvm basic block predecessors to decided where to insert phi nodes. Instruction selection can and will liberally insert new machine basic block predecessors. There is not a guaranteed one-to-one mapping from pred. llvm basic blocks and machine basic blocks. Therefore the current approach does not work as it assumes we can mark predecessor machine basic block as needing a copy, and needs to know the set of all predecessor machine basic blocks to decide when to insert phis. Instead of computing the swifterror vregs as we select instructions, propagate them at the end of instruction selection when the MBB CFG is complete. When an instruction needs a swifterror vreg and we don't know the value yet, generate a new vreg and remember this "upward exposed" use, and reconcile this at the end of instruction selection. This will only happen if the target supports promoting swifterror parameters to registers and the swifterror attribute is used. rdar://28300923 llvm-svn: 283617
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-09-281-1/+4
| | | | | | | | UseAA is enabled." This reverts commit r282600 due to test failues with MCJIT llvm-svn: 282604
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-09-281-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
* [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombineMichael Kuperstein2016-09-281-2/+2
| | | | | | | | | | | | This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
* [SystemZ] Use valid base/index regs for inline asmZhan Jun Liau2016-08-181-2/+2
| | | | | | | | | | | | | | | Summary: Inline asm memory constraints can have the base or index register be assigned to %r0 right now. Make sure that we assign only ADDR64 registers to the base and index. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23367 llvm-svn: 279157
* Fix SystemZ compilation abort caused by negative AND maskElliot Colp2016-08-181-0/+22
| | | | | | | | | | Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort. This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 279105
* [LoopStrenghtReduce] Refactoring and addition of a new target cost function.Jonas Paulsson2016-08-171-0/+117
| | | | | | | | | | | | | | | | | | | | | | | Refactored so that a LSRUse owns its fixups, as oppsed to letting the LSRInstance own them. This makes it easier to rate formulas for LSRUses, since the fixups are available directly. The Offsets vector has been removed since it was no longer necessary. New target hook isFoldableMemAccessOffset(), which is used during formula rating. For SystemZ, this is useful to express that loads and stores with float or vector types with a big/negative offset should be avoided in loops. Without this, LSR will generate a lot of negative offsets that would require extra instructions for loading the address. Updated tests: test/CodeGen/SystemZ/loop-01.ll Reviewed by: Quentin Colombet and Ulrich Weigand. https://reviews.llvm.org/D19152 llvm-svn: 278927
* Re-add SystemZ SNaN testElliot Colp2016-08-081-0/+15
| | | | | | | The floating-point bug affecting ninja-x64-msvc-RA-centos6 is fixed (r277813) so this test should now pass llvm-svn: 278034
* I can't reproduce this buildbot failure locally, so temporarily remove this ↵Elliot Colp2016-08-031-15/+0
| | | | | | | | test while I investigate. http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/27427 llvm-svn: 277636
* Disable shrinking of SNaN constantsElliot Colp2016-08-031-0/+15
| | | | | | | | | When expanding FP constants, we attempt to shrink doubles to floats and perform an extending load. However, on SystemZ, and possibly on other targets (I've only confirmed the problem on SystemZ), the FP extending load instruction may convert SNaN into QNaN, or may cause an exception. So in the general case, we would still like to shrink FP constants, but SNaNs should be left as doubles. Differential Revision: https://reviews.llvm.org/D22685 llvm-svn: 277602
* Tests: Add branch weights to non-layout tests.Kyle Butt2016-07-291-3/+5
| | | | | | | Add branch weights to a few tests that aren't testing layout to make them less sensitive to changes in the layout algorithm. llvm-svn: 277186
* Revert "RegScavenging: Add scavengeRegisterBackwards()"Matthias Braun2016-07-204-12/+12
| | | | | | | | | Reverting this commit for now as it seems to be causing failures on test-suite tests on the clang-ppc64le-linux-lnt bot. This reverts commit r276044. llvm-svn: 276068
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2016-07-194-12/+12
| | | | | | | | | | | | | | This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 276044
* [SystemZ] Recognize Load On Condition Immediate (LOCHI/LOGHI) opportunitiesZhan Jun Liau2016-07-111-0/+23
| | | | | | | | | | | | | | | | | | Summary: Add support for the z13 instructions LOCHI and LOCGHI which conditionally load immediate values. Add target instruction info hooks so that if conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Commiting on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 llvm-svn: 275086
* [SystemZ] Utilize Test Data Class instructions.Marcin Koscielnicki2016-07-106-0/+560
| | | | | | | | | | | This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016
* [SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotateElliot Colp2016-07-063-2/+194
| | | | | | | | | | On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount. Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we can remove the AND operation entirely. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 274650
* [SystemZ] Let z13 also support FeatureMiscellaneousExtensions.Jonas Paulsson2016-06-301-1/+1
| | | | | | | | | | | This processor feature had been left out by mistake from the z13 ProcessorModel. This time with updated test case. Thanks, Hans. Reviewed by Ulrich Weigand. llvm-svn: 274216
* [SystemZ] Use NILL instruction instead of NILF where possibleZhan Jun Liau2016-06-282-0/+98
| | | | | | | | | | | | | | | | | | | Summary: SystemZ shift instructions only use the last 6 bits of the shift amount. When the result of an AND operation is used as a shift amount, this means that we can use the NILL instruction (which operates on the last 16 bits) rather than NILF (which operates on the last 32 bits) for a 16-bit savings in instruction size. Reviewers: uweigand Subscribers: llvm-commits Author: colpell Committing on behalf of Elliot. Differential Revision: http://reviews.llvm.org/D21686 llvm-svn: 274066
* [SystemZ] Avoid generating 2 XOR instructions for (and (xor x, -1), y)Zhan Jun Liau2016-06-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Created a pattern to match 64-bit mode (and (xor x, -1), y) to a shorter sequence of instructions. Before the change, the canonical form is translated to: xihf %r3, 4294967295 xilf %r3, 4294967295 ngr %r2, %r3 After the change, the canonical form is translated to: ngr %r3, %r2 xgr %r2, %r3 Reviewers: zhanjunl, uweigand Subscribers: llvm-commits Author: assem Committing on behalf of Assem. Differential Revision: http://reviews.llvm.org/D21693 llvm-svn: 273887
* Uses shouldAssumeDSOLocal.Rafael Espindola2016-06-231-0/+13
| | | | | | With that SystemZ knows to avoid a GOT for PIE. llvm-svn: 273614
* [SystemZ] Recognize RISBG opportunities involving a truncateZhan Jun Liau2016-06-222-0/+46
| | | | | | | | | | | | | | | | | | | Summary: Recognize RISBG opportunities where the end result is narrower than the original input - where a truncate separates the shift/and operations. The motivating case is some code in postgres which looks like: srlg %r2, %r0, 11 nilh %r2, 255 Reviewers: uweigand Author: RolandF Differential Revision: http://reviews.llvm.org/D21452 llvm-svn: 273433
* [SelectionDAG] Don't treat library calls specially if marked with nobuiltin.Marcin Koscielnicki2016-06-175-0/+328
| | | | | | | | To be used by D19781. Differential Revision: http://reviews.llvm.org/D19801 llvm-svn: 273039
* [SystemZ] Enable index register memory constraints for inline ASMUlrich Weigand2016-06-133-8/+95
| | | | | | | | | | | | | | | | This enables use of the 'R' and 'T' memory constraints for inline ASM operands on SystemZ, which allow an index register as well as an immediate displacement. This patch includes corresponding documentation and test case updates. As with the last patch of this kind, I moved the 'm' constraint to the most general case, which is now 'T' (base + 20-bit signed displacement + index register). Author: colpell Differential Revision: http://reviews.llvm.org/D21239 llvm-svn: 272547
* [SystemZ] Support Compare and TrapsZhan Jun Liau2016-06-101-0/+179
| | | | | | | | | | | | Support and generate Compare and Traps like CRT, CIT, etc. Support Trap as legal DAG opcodes and generate "j .+2" for them by default. Add support for Conditional Traps and use the If Converter to convert them into the corresponding compare and trap opcodes. Differential Revision: http://reviews.llvm.org/D21155 llvm-svn: 272419
* [SystemZ] Enable long displacement constraints for inline ASM operandsUlrich Weigand2016-06-093-5/+39
| | | | | | | | | | | | | | | | | | This enables use of the 'S' constraint for inline ASM operands on SystemZ, which allows for a memory reference with a signed 20-bit immediate displacement. This patch includes corresponding documentation and test case updates. I've changed the 'T' constraint to match the new behavior for 'S', as 'T' also uses a long displacement (though index constraints are still not implemented). I also changed 'm' to match the behavior for 'S' as this will allow for a wider range of displacements for 'm', though correct me if that's not the right decision. Author: colpell Differential Revision: http://reviews.llvm.org/D21097 llvm-svn: 272266
* [SystemZ] Fix register ordering for BinaryRRF instructionsBryan Chan2016-05-181-9/+9
| | | | | | | | | | | | | | | | | Summary: The ordering of registers in BinaryRRF instructions are wrong, and affects the copysign instruction (CPSDR). This results in the wrong magnitude and sign being set. Author: zhanjunl Reviewers: kbarton, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20308 llvm-svn: 269922
* [SystemZ] Support LRVH and STRVH opcodesBryan Chan2016-05-162-0/+199
| | | | | | | | | | | | Summary: On Linux, /usr/include/bits/byteswap-16.h defines __byteswap_16(x) as an inlined LRVH (Load Reversed Half-word) instruction. The SystemZ back-end did not support this opcode and the inlined assembly would cause a fatal error. Reviewers: bryanpkc, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18732 llvm-svn: 269688
* [PR27599] [SystemZ] [SelectionDAG] Fix extension of atomic cmpxchg result.Marcin Koscielnicki2016-05-101-0/+81
| | | | | | | | | | | | Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign extended result, or a zero extended result. SystemZ takes a third option by returning junk in the high bits (rotated contents of the other bytes in the memory word). In that case, don't use Assert*ext, and zero-extend the result ourselves if a comparison is needed. Differential Revision: http://reviews.llvm.org/D19800 llvm-svn: 269075
* [foldMemoryOperand()] Pass LiveIntervals to enable liveness check.Jonas Paulsson2016-05-101-1/+1
| | | | | | | | | | | | | | | SystemZ (and probably other targets as well) can fold a memory operand by changing the opcode into a new instruction that as a side-effect also clobbers the CC-reg. In order to do this, liveness of that reg must first be checked. When LIS is passed, getRegUnit() can be called on it and the right LiveRange is computed on demand. Reviewed by Matthias Braun. http://reviews.llvm.org/D19861 llvm-svn: 269026
* [SystemZ] Implement backchain attribute (recommit with fix).Marcin Koscielnicki2016-05-051-0/+84
| | | | | | | | | | | | | This introduces a SystemZ-specific "backchain" attribute on function, which enables writing the frame backchain link as specified by the ABI. This will be used to implement -mbackchain option in clang. Differential Revision: http://reviews.llvm.org/D19889 Fixed in this version: added RegState::Define and RegState::Kill on R1D in prologue. llvm-svn: 268581
* Revert "[SystemZ] Implement backchain attribute."Marcin Koscielnicki2016-05-041-84/+0
| | | | | | | | This reverts commit rL268571. It caused failures in register scavenger. llvm-svn: 268576
* [SystemZ] Implement llvm.get.dynamic.area.offsetMarcin Koscielnicki2016-05-041-0/+42
| | | | | | | | To be used for AddressSanitizer. Differential Revision: http://reviews.llvm.org/D19817 llvm-svn: 268572
* [SystemZ] Implement backchain attribute.Marcin Koscielnicki2016-05-041-0/+84
| | | | | | | | | | This introduces a SystemZ-specific "backchain" attribute on function, which enables writing the frame backchain link as specified by the ABI. This will be used to implement -mbackchain option in clang. Differential Revision: http://reviews.llvm.org/D19889 llvm-svn: 268571
* [SystemZ] Temporarily disable codegen test int-add-12.ll.Jonas Paulsson2016-05-021-1/+1
| | | | | | This checks for AGSI transformation, which is temporarily disabled. llvm-svn: 268219
* DAGCombiner: Reduce truncated shl widthMatt Arsenault2016-04-2918-278/+277
| | | | llvm-svn: 268094
* [SystemZ] Support Swift Calling ConventionBryan Chan2016-04-283-0/+627
| | | | | | | | | | | | | | | | Summary: Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see: RFC: Implementing the Swift calling convention in LLVM and Clang https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0 Reviewers: kbarton, manmanren, rjmccall, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19414 llvm-svn: 267823
* [SystemZ] [SSP] Add support for LOAD_STACK_GUARD.Marcin Koscielnicki2016-04-241-0/+35
| | | | | | | | | | | | | | This fixes PR22248 on s390x. The previous attempt at this was D19101, which was before LOAD_STACK_GUARD existed. Compared to the previous version, this always emits a rather ugly block of 4 instructions, involving a thread pointer load that can't be shared with other potential users. However, this is necessary for SSP - spilling the guard value (or thread pointer used to load it) is counter to the goal, since it could be overwritten along with the frame it protects. Differential Revision: http://reviews.llvm.org/D19363 llvm-svn: 267340
* [SystemZ] Add support for llvm.thread.pointer intrinsic.Marcin Koscielnicki2016-04-201-0/+14
| | | | | | Differential Revision: http://reviews.llvm.org/D19054 llvm-svn: 266844
OpenPOWER on IntegriCloud