path: root/llvm/test/CodeGen/SystemZ
Commit history for this directory, newest first. Each entry lists the commit message, author, date, number of files changed and lines removed/added.
...
* IR: add "cmpxchg weak" variant to support permitted failure.  (Tim Northover, 2014-06-13, 4 files, -25/+50)
    This commit adds a weak variant of the cmpxchg operation, as described
    in C++11. A cmpxchg instruction with this modifier is permitted to fail
    to store, even if the comparison indicated it should.

    As a result, cmpxchg instructions must return a flag indicating success
    in addition to their original iN value loaded. Thus, for uniformity
    *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is
    1 when the store succeeded.

    At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
    added as the natural representation for the new cmpxchg instructions.
    It is a strong cmpxchg. By default this gets Expanded to the existing
    ATOMIC_CMP_SWAP during Legalization, so existing backends should see no
    change in behaviour. If they wish to deal with the enhanced node
    instead, they can call setOperationAction on it. Beware: as a node with
    2 results, it cannot be selected from TableGen.

    Currently, no use is made of the extra information provided in this
    patch. Test updates are almost entirely adapting the input IR to the
    new scheme.

    Summary for out of tree users:
    ------------------------------
    + Legacy Bitcode files are upgraded during read.
    + Legacy assembly IR files will be invalid.
    + Front-ends must adapt to different type for "cmpxchg".
    + Backends should be unaffected by default.

    llvm-svn: 210903
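    A rough sketch of the new form (an illustrative function, not taken
    from the patch or its tests, written in the pointer-typed IR syntax of
    this period):

        define i32 @try_swap(i32 *%ptr, i32 %old, i32 %new) {
          ; cmpxchg now returns the loaded value plus an i1 success flag.
          %pair = cmpxchg weak i32 *%ptr, i32 %old, i32 %new seq_cst seq_cst
          %val = extractvalue { i32, i1 } %pair, 0
          ; With "weak", %ok may be 0 even when %val equals %old, because
          ; the store is allowed to fail spuriously.
          %ok = extractvalue { i32, i1 } %pair, 1
          %res = zext i1 %ok to i32
          ret i32 %res
        }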
* Reduce verbiage of lit.local.cfg files  (Alp Toker, 2014-06-09, 2 files, -4/+2)
    We can just split targets_to_build in one place and make it immutable.

    llvm-svn: 210496
* Reenable use of TBAA during CodeGen  (Hal Finkel, 2014-04-12, 1 file, -3/+0)
    We had disabled use of TBAA during CodeGen (even when otherwise using
    AA) because the ptrtoint/inttoptr used by CGP for address sinking
    caused BasicAA to miss basic type punning that it should catch (and,
    thus, we'd fail to override TBAA when we should).

    However, when AA is in use during CodeGen, CGP now uses normal GEPs and
    bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As
    a result, BasicAA should be able to make us do the right thing in the
    face of type-punning, and it seems safe to enable use of TBAA again.
    Self-hosting seems fine on PPC64/Linux on the P7, with TBAA enabled and
    -misched=shuffle.

    Note: We still don't update TBAA when merging stack slots, although
    because BasicAA should now catch all such cases, this is no longer a
    blocking issue. Nevertheless, I plan to commit code to deal with this
    properly in the near future.

    llvm-svn: 206093
* [SystemZ] Add support for z196 float<->unsigned conversions  (Richard Sandiford, 2014-03-21, 6 files, -8/+135)
    These complement the older float<->signed instructions.

    llvm-svn: 204451
* IR: add a second ordering operand to cmpxchg for failure  (Tim Northover, 2014-03-11, 4 files, -25/+25)
    The syntax for "cmpxchg" should now look something like:

        cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

    where the second ordering argument gives the required semantics in the
    case that no exchange takes place. It should be no stronger than the
    first ordering constraint and cannot be either "release" or "acq_rel"
    (since no store will have taken place).

    rdar://problem/15996804

    llvm-svn: 203559
* Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call  (Daniel Sanders, 2014-02-13, 1 file, -0/+15)
    Summary:
    AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call
    for targets with mature MC support. Such targets will always parse the
    inline assembly (even when emitting assembly). Targets without mature
    MC support continue to use EmitRawText() for assembly output.

    The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been
    replaced with MCAsmInfo::UseIntegratedAs which when true, causes the
    integrated assembler to parse inline assembly (even when emitting
    assembly output). UseIntegratedAs is set to true for targets that
    consider any failure to parse valid assembly to be a bug. Target
    specific subclasses generally enable the integrated assembler in their
    constructor. The default value can be overridden with
    -no-integrated-as.

    All tests that rely on inline assembly supporting invalid assembly
    (for example, those that use mnemonics such as 'foo' or 'hello world')
    have been updated to disable the integrated assembler.

    Changes since review (and last commit attempt):
    - Fixed test failures that were missed due to configuration of local
      build. (fixes crash.ll and a couple others).
    - Fixed tests that happened to pass because the local build was on X86
      (should fix 2007-12-17-InvokeAsm.ll)
    - mature-mc-support.ll's should no longer require all targets to be
      compiled. (should fix ARM and PPC buildbots)
    - Object output (-filetype=obj and similar) now forces the integrated
      assembler to be enabled regardless of default setting or
      -no-integrated-as. (should fix SystemZ buildbots)

    Reviewers: rafael

    Reviewed By: rafael

    CC: llvm-commits

    Differential Revision: http://llvm-reviews.chandlerc.com/D2686

    llvm-svn: 201333
* XFAIL test/CodeGen/SystemZ/alias-01.ll which requires CodeGen TBAA  (Hal Finkel, 2014-01-25, 1 file, -0/+3)
    llvm-svn: 200094
* Fix known typos  (Alp Toker, 2014-01-24, 4 files, -5/+5)
    Sweep the codebase for common typos. Includes some changes to visible
    function names that were misspelt.

    llvm-svn: 200018
* [SystemZ] Flesh out stackrestore test (frame-11.ll)  (Richard Sandiford, 2014-01-13, 1 file, -3/+10)
    ...so that it does something vaguely sensible.

    llvm-svn: 199117
* [SystemZ] Improve risbg-01.ll test  (Richard Sandiford, 2014-01-13, 1 file, -6/+5)
    The old mask in f24 wasn't well chosen because the lshr would always be
    zero. CodeGen didn't detect this but InstCombine would. The new mask
    ensures that both shifts are needed.

    f26 is specifically testing for a wrap-around mask. The AND can be
    applied to just the shift left, either before or after the shift.
    Again, CodeGen kept it in the original form but InstCombine would mask
    after the shift instead. The exact choice of NILF isn't important for
    the test so I just dropped it and kept the rotate.

    llvm-svn: 199115
* [SystemZ] Optimize (sext (ashr (shl ...), ...))  (Richard Sandiford, 2014-01-13, 1 file, -3/+16)
    ...into (ashr (shl (anyext X), ...), ...), which requires one fewer
    instruction. The (anyext X) can sometimes be simplified too.

    I didn't do this in DAGCombiner because widening shifts isn't a win on
    all targets.

    llvm-svn: 199114
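    A hypothetical input of the kind this targets (not one of the
    committed tests) is the shift-pair form of a narrow sign extension:

        define i64 @f(i32 %a) {
          ; The shl/ashr pair sign-extends the low 24 bits of %a.
          %shl = shl i32 %a, 8
          %ashr = ashr i32 %shl, 8
          ; The outer sext can now be folded into 64-bit shifts of
          ; (anyext %a), saving an instruction.
          %ext = sext i32 %ashr to i64
          ret i64 %ext
        }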
* [SystemZ] Fix RNSBG bug introduced by r197802  (Richard Sandiford, 2014-01-09, 3 files, -0/+33)
    The zext handling added in r197802 wasn't right for RNSBG. This patch
    restricts it to ROSBG, RXSBG and RISBG. (The tests for RISBG were added
    in r197802 since RISBG was the motivating example.)

    llvm-svn: 198862
* Handle masked rotate amounts  (Richard Sandiford, 2014-01-09, 1 file, -0/+72)
    At the moment we expect rotates to have the form:

        (or (shl X, Y), (shr X, Z))

    where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that
    the (or ...) is undefined for Y == 0 or Z == 0.

    This undefinedness can be avoided by using
    Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or
    Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C
    (including 0, the most natural choice).

    llvm-svn: 198861
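    For example, the masked form of a 32-bit rotate left by a variable
    amount would look something like this (an illustrative function, not a
    committed test):

        define i32 @rotl(i32 %x, i32 %amt) {
          %sub = sub i32 32, %amt
          ; Masking the opposing shift amount keeps it in [0, 31], so the
          ; (or (shl ...), (lshr ...)) pattern is defined even for %amt == 0.
          %masked = and i32 %sub, 31
          %hi = shl i32 %x, %amt
          %lo = lshr i32 %x, %masked
          %or = or i32 %hi, %lo
          ret i32 %or
        }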
* Match the InstCombine form of rotates by X+C  (Richard Sandiford, 2014-01-09, 1 file, -0/+29)
    InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X), so a
    rotate left of a 32-bit Y by X+C could appear as either:

        (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C))))

    without InstCombine, or:

        (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X)))

    with it. We already matched the first form. This patch handles the
    second too.

    llvm-svn: 198860
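    As a sketch, a rotate left of %y by %x+1 after InstCombine might look
    like this (hypothetical input, not a committed test):

        define i32 @rotl_plus_one(i32 %y, i32 %x) {
          %add = add i32 %x, 1
          %hi = shl i32 %y, %add
          ; InstCombine has folded (sub 32, (add %x, 1)) into (sub 31, %x).
          %sub = sub i32 31, %x
          %lo = lshr i32 %y, %sub
          %or = or i32 %hi, %lo
          ret i32 %or
        }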
* [SystemZ] Use interlocked-access 1 instructions for CodeGen  (Richard Sandiford, 2013-12-24, 10 files, -0/+650)
    ...namely LOAD AND ADD, LOAD AND AND, LOAD AND OR and LOAD AND
    EXCLUSIVE OR. LOAD AND ADD LOGICAL isn't really separately useful for
    LLVM.

    I'll look at reusing the CC results in the new year.

    llvm-svn: 197985
* [SystemZ] Optimize comparisons with truncated extended loads  (Richard Sandiford, 2013-12-20, 1 file, -0/+22)
    If the extension of a loaded value is compared against zero and used in
    other arithmetic, InstCombine will change the comparison to use the
    unextended load. It's also possible that the comparison could be
    against the unextended load from the outset.

    In DAG form this becomes a truncation of an extending load. We want to
    strip the truncation if possible so that we can use load-and-test
    instructions.

    llvm-svn: 197804
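    A sketch of the kind of IR involved (illustrative, in the pointer-typed
    syntax of this period):

        define i64 @f(i32 *%ptr, i64 %a, i64 %b) {
          %val = load i32 *%ptr
          %ext = sext i32 %val to i64
          ; The comparison uses the unextended load; in the DAG this is a
          ; truncation of an extending load, which we strip so that a
          ; load-and-test instruction can set CC directly.
          %cmp = icmp slt i32 %val, 0
          %sum = add i64 %ext, %a
          %res = select i1 %cmp, i64 %sum, i64 %b
          ret i64 %res
        }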
* [SystemZ] Extend RISBG optimization  (Richard Sandiford, 2013-12-20, 1 file, -2/+13)
    The handling of ANY_EXTEND and ZERO_EXTEND was too strict. In this
    context we can treat ZERO_EXTEND in much the same way as an AND and
    then also handle outermost ZERO_EXTENDs.

    I couldn't find a test that benefited from the ANY_EXTEND change, but
    it's more obvious to write it this way once SIGN_EXTEND and ZERO_EXTEND
    are handled differently.

    llvm-svn: 197802
* Revert "Add -mcpu=z10 to SystemZ tests."  (Andrew Trick, 2013-12-18, 4 files, -4/+4)
    This reverts commit r197466.

    The MachineCSE fix that required the -mcpu flag has been disabled until
    more work can be done to fix downstream issues. Adding -mcpu wasn't the
    right workaround anyway.

    llvm-svn: 197624
* Add -mcpu=z10 to SystemZ tests.  (Andrew Trick, 2013-12-17, 4 files, -4/+4)
    llvm-svn: 197466
* [SystemZ] Optimize X [!=]= Y in cases where X - Y or Y - X is also computed  (Richard Sandiford, 2013-12-13, 1 file, -0/+20)
    In those cases it's better to compare the result of the subtraction
    against zero.

    llvm-svn: 197239
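    A sketch of the pattern (illustrative function, not a committed test):

        define i64 @f(i64 %x, i64 %y, i64 %a) {
          %diff = sub i64 %x, %y
          ; Instead of a separate compare of %x and %y, the backend can
          ; compare the already-computed %diff against zero and reuse the
          ; CC set by the subtraction.
          %cmp = icmp eq i64 %x, %y
          %res = select i1 %cmp, i64 %a, i64 %diff
          ret i64 %res
        }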
* [SystemZ] Make more use of TMHH  (Richard Sandiford, 2013-12-13, 1 file, -0/+109)
    This originally came about after noticing that InstCombine turns some
    of the TMHH (icmp (and...), ...) tests into plain comparisons. Since
    there is no instruction to compare with a 64-bit immediate, TMHH is
    generally better than an ordered comparison for the cases that it can
    handle.

    llvm-svn: 197238
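    An illustrative case (not a committed test): InstCombine rewrites a
    test of the high 16 bits as an unsigned comparison against a 64-bit
    constant, which TMHH can still handle with a 16-bit mask:

        define i32 @f(i64 %a) {
          ; InstCombined form of ((and %a, 0xffff000000000000) == 0);
          ; 281474976710656 is 2^48.
          %cmp = icmp ult i64 %a, 281474976710656
          %res = zext i1 %cmp to i32
          ret i32 %res
        }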
* [SystemZ] Extend integer absolute selection  (Richard Sandiford, 2013-12-13, 2 files, -0/+197)
    This patch makes more use of LPGFR and LNGFR. It builds on top of the
    LTGFR selection from r197234. Most of the tests are motivated by what
    InstCombine would produce.

    llvm-svn: 197236
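    One InstCombine-style form of the 64-bit absolute value of a
    sign-extended i32 (illustrative, not a committed test):

        define i64 @f(i32 %a) {
          %ext = sext i32 %a to i64
          %neg = sub i64 0, %ext
          ; Selecting between the extended value and its negation based on
          ; the sign is the pattern that LPGFR/LNGFR can cover.
          %cmp = icmp slt i64 %ext, 0
          %abs = select i1 %cmp, i64 %neg, i64 %ext
          ret i64 %abs
        }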
* [SystemZ] Make more use of LTGFR  (Richard Sandiford, 2013-12-13, 1 file, -0/+48)
    InstCombine turns (sext (trunc)) into (ashr (shl)), then converts any
    comparison of the ashr against zero into a comparison of the shl
    against zero. This makes sense in itself, but we want to undo it for z,
    since the sign-extension instruction has a CC-setting form.

    I've included tests for both the original and InstCombined variants,
    but the former already worked. The patch fixes the latter.

    llvm-svn: 197234
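    A sketch of the InstCombined form being matched (illustrative):

        define i64 @f(i64 %x, i64 %a) {
          ; InstCombined form of comparing (sext (trunc %x to i32) to i64)
          ; against zero: the compare uses the shl rather than the ashr.
          %shl = shl i64 %x, 32
          %cmp = icmp slt i64 %shl, 0
          %ashr = ashr i64 %shl, 32
          %res = select i1 %cmp, i64 %a, i64 %ashr
          ret i64 %res
        }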
* [SystemZ] Optimize fcmp X, 0 in cases where X is also negated  (Richard Sandiford, 2013-12-11, 1 file, -0/+40)
    In such cases it's often better to test the result of the negation
    instead, since the negation also sets CC.

    llvm-svn: 197032
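    A sketch of the pattern (illustrative function, not a committed test):

        define double @f(double %x, double %a) {
          ; The negation sets CC, so the backend can test %neg instead of
          ; emitting a separate compare of %x against zero.
          %neg = fsub double -0.000000e+00, %x
          %cmp = fcmp olt double %x, 0.000000e+00
          %res = select i1 %cmp, double %a, double %neg
          ret double %res
        }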
* Extend (truncate (load)) folding  (Richard Sandiford, 2013-12-11, 1 file, -0/+14)
    DAGCombiner could fold (truncate (load)) -> smaller load if the
    original load was the width of the truncation result or wider. This
    patch extends it to handle cases where the original load was narrower
    (and so the extension type stays the same).

    llvm-svn: 197030
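    A sketch of the narrower case (illustrative, in the pointer-typed
    syntax of this period):

        define i32 @f(i16 *%ptr) {
          ; The loaded value (i16) is narrower than the truncation result
          ; (i32), so the i64 extending load plus truncate can now fold to
          ; a 32-bit sign-extending load of the i16.
          %half = load i16 *%ptr
          %ext = sext i16 %half to i64
          %trunc = trunc i64 %ext to i32
          ret i32 %trunc
        }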
* Add TargetLowering::prepareVolatileOrAtomicLoad  (Richard Sandiford, 2013-12-10, 12 files, -46/+28)
    One unusual feature of the z architecture is that the result of a
    previous load can be reused indefinitely for subsequent loads, even if
    a cache-coherent store to that location is performed by another CPU. A
    special serializing instruction must be used if you want to force a
    load to be reattempted.

    Since volatile loads are not supposed to be omitted in this way, we
    should insert a serializing instruction before each such load. The same
    goes for atomic loads.

    The patch implements this at the IR->DAG boundary, in a similar way to
    atomic fences. It is a no-op for targets other than SystemZ.

    llvm-svn: 196906
* Add TargetLowering::prepareVolatileOrAtomicLoad  (Richard Sandiford, 2013-12-10, 14 files, -47/+49)
    One unusual feature of the z architecture is that the result of a
    previous load can be reused indefinitely for subsequent loads, even if
    a cache-coherent store to that location is performed by another CPU. A
    special serializing instruction must be used if you want to force a
    load to be reattempted.

    Since volatile loads are not supposed to be omitted in this way, we
    should insert a serializing instruction before each such load. The same
    goes for atomic loads.

    The patch implements this at the IR->DAG boundary, in a similar way to
    atomic fences. It is a no-op for targets other than SystemZ.

    llvm-svn: 196905
* [SystemZ] Use LOAD AND TEST for comparisons with -0  (Richard Sandiford, 2013-12-06, 1 file, -0/+19)
    ...since it is equivalent to comparison with +0.

    llvm-svn: 196580
* [SystemZ] Extend the use of C(L)GFR  (Richard Sandiford, 2013-12-06, 2 files, -4/+43)
    InstCombine prefers to put extended operands first, so this patch
    handles that case for C(L)GFR.

    llvm-svn: 196579
* [SystemZ] Optimize selects between 0 and -1  (Richard Sandiford, 2013-12-06, 3 files, -0/+543)
    Since z has no setcc instruction as such, the choice of
    setBooleanContents is a bit arbitrary. Currently it's set to
    ZeroOrOneBooleanContent, so we produced a branch-free form when
    selecting between 0 and 1, but not when selecting between 0 and -1.
    This patch handles the latter case too.

    At some point I'd like to measure whether it's better to use
    conditional moves for constant selects on z196, but that's future work.

    llvm-svn: 196578
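    A sketch of the case that now gets a branch-free sequence
    (illustrative):

        define i64 @f(i64 %a, i64 %b) {
          ; Selecting between -1 and 0 can now be lowered without a branch,
          ; just as selecting between 1 and 0 already was.
          %cmp = icmp eq i64 %a, %b
          %res = select i1 %cmp, i64 -1, i64 0
          ret i64 %res
        }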
* [SystemZ] Fix choice of known-zero mask in insertion optimization  (Richard Sandiford, 2013-12-03, 1 file, -0/+13)
    The backend converts 64-bit ORs into subreg moves if the upper 32 bits
    of one operand and the low 32 bits of the other are known to be zero.
    It then tries to peel away redundant ANDs from the upper 32 bits.

    Since AND masks are canonicalized to exclude known-zero bits, the test
    ORs the mask and the known-zero bits together before checking for
    redundancy. The problem was that it was using the wrong node when
    checking for known-zero bits, so could drop ANDs that were still
    needed.

    llvm-svn: 196267
* [SystemZ] Fix incorrect use of RISBG for a zero-extended right shift  (Richard Sandiford, 2013-11-26, 1 file, -0/+14)
    We would wrongly transform the testcase into the equivalent of an AND
    with 1. The problem was that, when testing whether the shifted-in bits
    of the right shift were significant, we used the width of the final
    zero-extended result rather than the width of the shifted value.

    llvm-svn: 195731
* [SystemZ] Fix TMHH and TMHL usage for z10 with -O0  (Richard Sandiford, 2013-11-22, 2 files, -0/+50)
    I've no idea why I decided to handle TMxx differently from all the
    other high/low logic operations, but it was a stupid thing to do. The
    high registers aren't available as separate 32-bit registers on z10,
    so subreg_h32 can't be used on a GR64 there.

    I've normally been testing with z196 and with -O3 and so hadn't noticed
    this until now.

    llvm-svn: 195473
* [SystemZ] Automatically detect zEC12 and z196 hosts  (Richard Sandiford, 2013-10-31, 20 files, -42/+72)
    As on other hosts, the CPU identification instruction is privileged, so
    we need to look through /proc/cpuinfo. I copied the PowerPC way of
    handling "generic".

    Several tests were implicitly assuming z10 and so failed on z196.

    llvm-svn: 193742
* [SystemZ] Set useAA to true  (Richard Sandiford, 2013-10-28, 5 files, -18/+19)
    useAA significantly improves the handling of vector code that has TBAA
    information attached. It also helps other cases, as shown by the
    testsuite changes here.

    The only real downside I've seen is that it interferes with
    MergeConsecutiveStores. The problem is that that optimization works top
    down, starting at the first store in the chain, and looks for cases
    where the chain result is only used by a single related store. These
    related stores don't alias, so useAA will have rewritten all the later
    stores to use a different chain input (typically the same one as the
    first store).

    I think the advantages outweigh the disadvantages though, so for now
    I've just disabled alias analysis for the unaligned-01.ll test.

    llvm-svn: 193521
* [DAGCombiner] Respect volatility when checking for aliases  (Richard Sandiford, 2013-10-28, 1 file, -1/+2)
    Making useAA() default to true for SystemZ showed that the combiner
    alias analysis wasn't handling volatile accesses. This hit many of the
    SystemZ tests, but I arbitrarily picked one for the purpose of this
    patch.

    llvm-svn: 193518
* Keep TBAA info when rewriting SelectionDAG loads and stores  (Richard Sandiford, 2013-10-28, 1 file, -0/+20)
    Most SelectionDAG code drops the TBAA info when creating a new form of
    a load and store (e.g. during legalization, or when converting a plain
    load to an extending one). This patch tries to catch all cases where
    the TBAA information can legitimately be carried over.

    The patch adds alternative forms of getLoad() and getExtLoad() that
    take a MachineMemOperand instead of individual fields. (The
    corresponding getTruncStore() already exists.) The idea is to use the
    MachineMemOperand forms when all fields are carried over (size, pointer
    info, isVolatile, isNonTemporal, alignment and TBAA info). If some
    adjustment is being made, e.g. to narrow the load, then we still pass
    the individual fields but also pass the TBAA info.

    llvm-svn: 193517
* Replace sra with srl if a single sign bit is required  (Richard Sandiford, 2013-10-17, 1 file, -0/+12)
    E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2).

    llvm-svn: 192884
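    In IR terms the example above corresponds to something like
    (illustrative):

        define i32 @f(i32 %x) {
          ; Only bit 1 of the shifted value is demanded, so the arithmetic
          ; shift by 31 can become a logical shift by 30.
          %sra = ashr i32 %x, 31
          %and = and i32 %sra, 2
          ret i32 %and
        }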
* [SystemZ] Handle extensions in RxSBG optimizations  (Richard Sandiford, 2013-10-16, 1 file, -3/+2)
    The input to an RxSBG operation can be narrower as long as the upper
    bits are don't-care. This fixes a FIXME added in r192783.

    llvm-svn: 192790
* [SystemZ] Improve handling of SETCC  (Richard Sandiford, 2013-10-16, 3 files, -16/+260)
    We previously used the default expansion to SELECT_CC, which in turn
    would expand to "LHI; BRC; LHI". In most cases it's better to use an
    IPM-based sequence instead.

    llvm-svn: 192784
* Handle (shl (anyext (shr ...))) in SimplifyDemandedBits  (Richard Sandiford, 2013-10-16, 1 file, -0/+67)
    This is really an extension of the current (shl (shr ...)) -> shl
    optimization. The main difference is that certain upper bits must also
    not be demanded.

    The motivating examples are the first two in the testcase, which occur
    in llvmpipe output.

    llvm-svn: 192783
* [SystemZ] Use A(G)SI when spilling the target of a constant addition  (Richard Sandiford, 2013-10-15, 2 files, -0/+332)
    llvm-svn: 192681
* [SystemZ] Add comparisons of high words and memory  (Richard Sandiford, 2013-10-01, 1 file, -0/+40)
    llvm-svn: 191777
* [SystemZ] Add comparisons of large immediates using high words  (Richard Sandiford, 2013-10-01, 1 file, -0/+34)
    There are no corresponding patterns for small immediates because they
    would prevent the use of fused compare-and-branch instructions.

    llvm-svn: 191775
* [SystemZ] Add immediate addition involving high words  (Richard Sandiford, 2013-10-01, 1 file, -0/+115)
    llvm-svn: 191774
* [SystemZ] Extend test-under-mask support to high GR32s  (Richard Sandiford, 2013-10-01, 1 file, -0/+29)
    llvm-svn: 191773
* [SystemZ] Extend 32-bit RISBG optimizations to high words  (Richard Sandiford, 2013-10-01, 1 file, -0/+25)
    This involves using RISB[LH]G, whereas the equivalent z10 optimization
    uses RISBG.

    llvm-svn: 191770
* [SystemZ] Extend pseudo conditional 8- and 16-bit stores to high words  (Richard Sandiford, 2013-10-01, 1 file, -0/+32)
    As the comment says, we always want to use STOC for 32-bit stores.

    llvm-svn: 191767
* [SystemZ] Add test missing from r191764.  (Richard Sandiford, 2013-10-01, 1 file, -0/+30)
    llvm-svn: 191765
* [SystemZ] Allow integer AND involving high words  (Richard Sandiford, 2013-10-01, 1 file, -0/+63)
    llvm-svn: 191762