summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/SystemZ
Commit message (Collapse)AuthorAgeFilesLines
* [SystemZ] Make more use of LTGFRRichard Sandiford2013-12-131-0/+48
| | | | | | | | | | | | InstCombine turns (sext (trunc)) into (ashr (shl)), then converts any comparison of the ashr against zero into a comparison of the shl against zero. This makes sense in itself, but we want to undo it for z, since the sign- extension instruction has a CC-setting form. I've included tests for both the original and InstCombined variants, but the former already worked. The patch fixes the latter. llvm-svn: 197234
* [SystemZ] Optimize fcmp X, 0 in cases where X is also negatedRichard Sandiford2013-12-111-0/+40
| | | | | | | In such cases it's often better to test the result of the negation instead, since the negation also sets CC. llvm-svn: 197032
* Extend (truncate (load)) foldingRichard Sandiford2013-12-111-0/+14
| | | | | | | | | DAGCombiner could fold (truncate (load)) -> smaller load if the original load was the width of the truncation result or wider. This patch extends it to handle cases where the original load was narrower (and so the extension type stays the same). llvm-svn: 197030
* Add TargetLowering::prepareVolatileOrAtomicLoadRichard Sandiford2013-12-1012-46/+28
| | | | | | | | | | | | | | | | | One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196906
* Add TargetLowering::prepareVolatileOrAtomicLoadRichard Sandiford2013-12-1014-47/+49
| | | | | | | | | | | | | | | | | One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196905
* [SystemZ] Use LOAD AND TEST for comparisons with -0Richard Sandiford2013-12-061-0/+19
| | | | | | ...since it os equivalent to comparison with +0. llvm-svn: 196580
* [SystemZ] Extend the use of C(L)GFRRichard Sandiford2013-12-062-4/+43
| | | | | | | instcombine prefers to put extended operands first, so this patch handles that case for C(L)GFR. llvm-svn: 196579
* [SystemZ] Optimize selects between 0 and -1Richard Sandiford2013-12-063-0/+543
| | | | | | | | | | | | | Since z has no setcc instruction as such, the choice of setBooleanContents is a bit arbitrary. Currently it's set to ZeroOrOneBooleanContent, so we produced a branch-free form when selecting between 0 and 1, but not when selecting between 0 and -1. This patch handles the latter case too. At some point I'd like to measure whether it's better to use conditional moves for constant selects on z196, but that's future work. llvm-svn: 196578
* [SystemZ] Fix choice of known-zero mask in insertion optimizationRichard Sandiford2013-12-031-0/+13
| | | | | | | | | | | | | | The backend converts 64-bit ORs into subreg moves if the upper 32 bits of one operand and the low 32 bits of the other are known to be zero. It then tries to peel away redundant ANDs from the upper 32 bits. Since AND masks are canonicalized to exclude known-zero bits, the test ORs the mask and the known-zero bits together before checking for redundancy. The problem was that it was using the wrong node when checking for known-zero bits, so could drop ANDs that were still needed. llvm-svn: 196267
* [SystemZ] Fix incorrect use of RISBG for a zero-extended right shiftRichard Sandiford2013-11-261-0/+14
| | | | | | | | | We would wrongly transform the testcase into the equivalent of an AND with 1. The problem was that, when testing whether the shifted-in bits of the right shift were significant, we used the width of the final zero-extended result rather than the width of the shifted value. llvm-svn: 195731
* [SystemZ] Fix TMHH and TMHL usage for z10 with -O0Richard Sandiford2013-11-222-0/+50
| | | | | | | | | | | | I've no idea why I decided to handle TMxx differently from all the other high/low logic operations, but it was a stupid thing to do. The high registers aren't available as separate 32-bit registers on z10, so subreg_h32 can't be used on a GR64 there. I've normally been testing with z196 and with -O3 and so hadn't noticed this until now. llvm-svn: 195473
* [SystemZ] Automatically detect zEC12 and z196 hostsRichard Sandiford2013-10-3120-42/+72
| | | | | | | | | | As on other hosts, the CPU identification instruction is priveleged, so we need to look through /proc/cpuinfo. I copied the PowerPC way of handling "generic". Several tests were implicitly assuming z10 and so failed on z196. llvm-svn: 193742
* [SystemZ] Set usaAA to trueRichard Sandiford2013-10-285-18/+19
| | | | | | | | | | | | | | | | useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here. The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that that optimization works top down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store). I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test. llvm-svn: 193521
* [DAGCombiner] Respect volatility when checking for aliasesRichard Sandiford2013-10-281-1/+2
| | | | | | | | Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518
* Keep TBAA info when rewriting SelectionDAG loads and storesRichard Sandiford2013-10-281-0/+20
| | | | | | | | | | | | | | | | | Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517
* Replace sra with srl if a single sign bit is requiredRichard Sandiford2013-10-171-0/+12
| | | | | | E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2). llvm-svn: 192884
* [SystemZ] Handle extensions in RxSBG optimizationsRichard Sandiford2013-10-161-3/+2
| | | | | | | The input to an RxSBG operation can be narrower as long as the upper bits are don't care. This fixes a FIXME added in r192783. llvm-svn: 192790
* [SystemZ] Improve handling of SETCCRichard Sandiford2013-10-163-16/+260
| | | | | | | | We previously used the default expansion to SELECT_CC, which in turn would expand to "LHI; BRC; LHI". In most cases it's better to use an IPM-based sequence instead. llvm-svn: 192784
* Handle (shl (anyext (shr ...))) in SimpilfyDemandedBitsRichard Sandiford2013-10-161-0/+67
| | | | | | | | | | This is really an extension of the current (shl (shr ...)) -> shl optimization. The main difference is that certain upper bits must also not be demanded. The motivating examples are the first two in the testcase, which occur in llvmpipe output. llvm-svn: 192783
* [SystemZ] Use A(G)SI when spilling the target of a constant additionRichard Sandiford2013-10-152-0/+332
| | | | llvm-svn: 192681
* [SystemZ] Add comparisons of high words and memoryRichard Sandiford2013-10-011-0/+40
| | | | llvm-svn: 191777
* [SystemZ] Add comparisons of large immediates using high wordsRichard Sandiford2013-10-011-0/+34
| | | | | | | There are no corresponding patterns for small immediates because they would prevent the use of fused compare-and-branch instructions. llvm-svn: 191775
* [SystemZ] Add immediate addition involving high wordsRichard Sandiford2013-10-011-0/+115
| | | | llvm-svn: 191774
* [SystemZ] Extend test-under-mask support to high GR32sRichard Sandiford2013-10-011-0/+29
| | | | llvm-svn: 191773
* [SystemZ] Extend 32-bit RISBG optimizations to high wordsRichard Sandiford2013-10-011-0/+25
| | | | | | | This involves using RISB[LH]G, whereas the equivalent z10 optimization uses RISBG. llvm-svn: 191770
* [SystemZ] Extend pseudo conditional 8- and 16-bit stores to high wordsRichard Sandiford2013-10-011-0/+32
| | | | | | As the comment says, we always want to use STOC for 32-bit stores. llvm-svn: 191767
* [SystemZ] Add test missing from r191764.Richard Sandiford2013-10-011-0/+30
| | | | llvm-svn: 191765
* [SystemZ] Allow integer AND involving high wordsRichard Sandiford2013-10-011-0/+63
| | | | llvm-svn: 191762
* [SystemZ] Allow integer XOR involving high wordsRichard Sandiford2013-10-011-0/+42
| | | | llvm-svn: 191759
* [SystemZ] Allow integer OR involving high wordsRichard Sandiford2013-10-011-0/+42
| | | | llvm-svn: 191755
* [SystemZ] Allow integer insertions with a high-word destinationRichard Sandiford2013-10-011-0/+38
| | | | llvm-svn: 191753
* [SystemZ] Allow selects with a high-word destinationRichard Sandiford2013-10-011-0/+28
| | | | llvm-svn: 191751
* [SystemZ] Add patterns to load a constant into a high word (IIHF)Richard Sandiford2013-10-011-0/+57
| | | | | | | Similar to low words, we can use the shorter LLIHL and LLIHH if it turns out that the other half of the GR64 isn't live. llvm-svn: 191750
* [SystemZ] Add register zero extensions involving at least one high wordRichard Sandiford2013-10-012-0/+388
| | | | llvm-svn: 191746
* [SystemZ] Add truncating high-word stores (STCH and STHH)Richard Sandiford2013-10-011-0/+46
| | | | llvm-svn: 191743
* [SystemZ] Add zero-extending high-word loads (LLCH and LLHH)Richard Sandiford2013-10-011-0/+48
| | | | llvm-svn: 191742
* [SystemZ] Add sign-extending high-word loads (LBH and LHH)Richard Sandiford2013-10-011-0/+48
| | | | llvm-svn: 191740
* [SystemZ] Use upper words of GR64s for codegenRichard Sandiford2013-10-011-0/+52
| | | | | | | | | | | | | | This just adds the basics necessary for allocating the upper words to virtual registers (move, load and store). The move support is parameterised in a way that makes it easy to handle zero extensions, but the associated zero-extend patterns are added by a later patch. The easiest way of testing this seemed to be add a new "h" register constraint for high words. I don't expect the constraint to be useful in real inline asms, but it should work, so I didn't try to hide it behind an option. llvm-svn: 191739
* TBAA: update tbaa format from scalar format to struct-path aware format.Manman Ren2013-09-301-6/+8
| | | | llvm-svn: 191690
* TBAA: handle scalar TBAA format and struct-path aware TBAA format.Manman Ren2013-09-271-2/+4
| | | | | | | | | | | | | | | | Remove the command line argument "struct-path-tbaa" since we should not depend on command line argument to decide which format the IR file is using. Instead, we check the first operand of the tbaa tag node, if it is a MDNode, we treat it as struct-path aware TBAA format, otherwise, we treat it as scalar TBAA format. When clang starts to use struct-path aware TBAA format no matter whether struct-path-tbaa is no, and we can auto-upgrade existing bc files, the support for scalar TBAA format can be dropped. Existing testing cases are updated to use the struct-path aware TBAA format. llvm-svn: 191538
* [SystemZ] Rein back the use of block operationsRichard Sandiford2013-09-273-121/+73
| | | | | | | | | | | | | | The backend tries to use block operations like MVC, NC, OC and XC for simple scalar operations. For correctness reasons, it rejects any case in which the regions might partially overlap. However, for performance reasons, it should also reject cases where the regions might be equal, since the instruction might then not use the fast path. This fixes a performance regression seen in bzip2. We may want to limit the optimisation even more in future, or even remove it entirely, but I'll try with this for now. llvm-svn: 191525
* [SystemZ] Improve handling of PC-relative addressesRichard Sandiford2013-09-271-0/+35
| | | | | | | | | | | | | | The backend previously folded offsets into PC-relative addresses whereever possible. That's the right thing to do when the address can be used directly in a PC-relative memory reference (using things like LRL). But if we have a register-based memory reference and need to load the PC-relative address separately, it's better to use an anchor point that could be shared with other accesses to the same area of the variable. Fixes a FIXME. llvm-svn: 191524
* [SystemZ] Add unsigned compare-and-branch instructionsRichard Sandiford2013-09-1825-154/+781
| | | | | | | | | | | | | | | For some reason I never got around to adding these at the same time as the signed versions. No idea why. I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether it should just be replaced with an "is normal" flag. I'll leave that for later though. There are some boundary conditions that can be tweaked, such as preferring unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256", but again I'll leave those for a separate patch. llvm-svn: 190930
* [SystemZ] Improve extload handlingRichard Sandiford2013-09-161-2/+2
| | | | | | | | | | | | The port originally had special patterns for extload, mapping them to the same instructions as sextload. It seemed neater to have patterns that match "an extension that is allowed to be signed" and "an extension that is allowed to be unsigned". This was originally meant to be a clean-up, but it does improve the handling of promoted integers a little, as shown by args-06.ll. llvm-svn: 190777
* [SystemZ] Try to fold shifts into TMxxRichard Sandiford2013-09-132-0/+80
| | | | | | E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4". llvm-svn: 190672
* [SystemZ] Add TM and TMYRichard Sandiford2013-09-101-0/+245
| | | | | | | | | | | | | | | | | | | | | | | The main complication here is that TM and TMY (the memory forms) set CC differently from the register forms. When the tested bits contain some 0s and some 1s, the register forms set CC to 1 or 2 based on the value the uppermost bit. The memory forms instead set CC to 1 regardless of the uppermost bit. Until now, I've tried to make it so that a branch never tests for an impossible CC value. E.g. NR only sets CC to 0 or 1, so branches on the result will only test for 0 or 1. Originally I'd tried to do the same thing for TM and TMY by using custom matching code in ISelDAGToDAG. That ended up being very ugly though, and would have meant duplicating some of the chain checks that the common isel code does. I've therefore gone for the simpler alternative of adding an extra operand to the TM DAG opcode to say whether a memory form would be OK. This means that the inverse of a "TM;JE" is "TM;JNE" rather than the more precise "TM;JNLE", just like the inverse of "TMLL;JE" is "TMLL;JNE". I suppose that's arguably less confusing though... llvm-svn: 190400
* [SystemZ] Tweak integer comparison codeRichard Sandiford2013-09-061-0/+101
| | | | | | | | | | | | | | | | | | | The architecture has many comparison instructions, including some that extend one of the operands. The signed comparison instructions use sign extensions and the unsigned comparison instructions use zero extensions. In cases where we had a free choice between signed or unsigned comparisons, we were trying to decide at lowering time which would best fit the available instructions, taking things like extension type into account. The code to do that was getting increasingly hairy and was also making some bad decisions. E.g. when comparing the result of two LLCs, it is better to use CR rather than CLR, since CR can be fused with a branch while CLR can't. This patch removes the lowering code and instead adds an operand to integer comparisons to say whether signed comparison is required, whether unsigned comparison is required, or whether either is OK. We can then leave the choice of instruction up to the normal isel code. llvm-svn: 190138
* [SystemZ] Use XC for a memset of 0Richard Sandiford2013-09-061-42/+26
| | | | llvm-svn: 190130
* [SystemZ] Add NC, OC and XCRichard Sandiford2013-09-053-0/+511
| | | | | | | For now these are just used to handle scalar ANDs, ORs and XORs in which all operands are memory. llvm-svn: 190041
* [SystemZ] Add support for TMHH, TMHL, TMLH and TMLLRichard Sandiford2013-09-031-0/+352
| | | | | | | | | For now this just handles simple comparisons of an ANDed value with zero. The CC value provides enough information to do any comparison for a 2-bit mask, and some nonzero comparisons with more populated masks, but that's all future work. llvm-svn: 189819
OpenPOWER on IntegriCloud