path: root/llvm/test/CodeGen/SystemZ
Commit message | Author | Age | Files | Lines
...
* [SystemZ] Allow integer XOR involving high words | Richard Sandiford | 2013-10-01 | 1 | -0/+42
  llvm-svn: 191759
* [SystemZ] Allow integer OR involving high words | Richard Sandiford | 2013-10-01 | 1 | -0/+42
  llvm-svn: 191755
* [SystemZ] Allow integer insertions with a high-word destination | Richard Sandiford | 2013-10-01 | 1 | -0/+38
  llvm-svn: 191753
* [SystemZ] Allow selects with a high-word destination | Richard Sandiford | 2013-10-01 | 1 | -0/+28
  llvm-svn: 191751
* [SystemZ] Add patterns to load a constant into a high word (IIHF) | Richard Sandiford | 2013-10-01 | 1 | -0/+57
  Similar to low words, we can use the shorter LLIHL and LLIHH if it turns out that the other half of the GR64 isn't live.
  llvm-svn: 191750
* [SystemZ] Add register zero extensions involving at least one high word | Richard Sandiford | 2013-10-01 | 2 | -0/+388
  llvm-svn: 191746
* [SystemZ] Add truncating high-word stores (STCH and STHH) | Richard Sandiford | 2013-10-01 | 1 | -0/+46
  llvm-svn: 191743
* [SystemZ] Add zero-extending high-word loads (LLCH and LLHH) | Richard Sandiford | 2013-10-01 | 1 | -0/+48
  llvm-svn: 191742
* [SystemZ] Add sign-extending high-word loads (LBH and LHH) | Richard Sandiford | 2013-10-01 | 1 | -0/+48
  llvm-svn: 191740
* [SystemZ] Use upper words of GR64s for codegen | Richard Sandiford | 2013-10-01 | 1 | -0/+52
  This just adds the basics necessary for allocating the upper words to virtual registers (move, load and store). The move support is parameterised in a way that makes it easy to handle zero extensions, but the associated zero-extend patterns are added by a later patch.

  The easiest way of testing this seemed to be to add a new "h" register constraint for high words. I don't expect the constraint to be useful in real inline asms, but it should work, so I didn't try to hide it behind an option.

  llvm-svn: 191739
* TBAA: update tbaa format from scalar format to struct-path aware format. | Manman Ren | 2013-09-30 | 1 | -6/+8
  llvm-svn: 191690
* TBAA: handle scalar TBAA format and struct-path aware TBAA format. | Manman Ren | 2013-09-27 | 1 | -2/+4
  Remove the command line argument "struct-path-tbaa" since we should not depend on a command line argument to decide which format the IR file is using. Instead, we check the first operand of the tbaa tag node: if it is an MDNode, we treat it as struct-path aware TBAA format; otherwise, we treat it as scalar TBAA format.

  When clang starts to use struct-path aware TBAA format no matter whether struct-path-tbaa is on, and we can auto-upgrade existing bc files, the support for scalar TBAA format can be dropped.

  Existing test cases are updated to use the struct-path aware TBAA format.

  llvm-svn: 191538
* [SystemZ] Rein back the use of block operations | Richard Sandiford | 2013-09-27 | 3 | -121/+73
  The backend tries to use block operations like MVC, NC, OC and XC for simple scalar operations. For correctness reasons, it rejects any case in which the regions might partially overlap. However, for performance reasons, it should also reject cases where the regions might be equal, since the instruction might then not use the fast path.

  This fixes a performance regression seen in bzip2. We may want to limit the optimisation even more in future, or even remove it entirely, but I'll try with this for now.

  llvm-svn: 191525
* [SystemZ] Improve handling of PC-relative addresses | Richard Sandiford | 2013-09-27 | 1 | -0/+35
  The backend previously folded offsets into PC-relative addresses wherever possible. That's the right thing to do when the address can be used directly in a PC-relative memory reference (using things like LRL). But if we have a register-based memory reference and need to load the PC-relative address separately, it's better to use an anchor point that could be shared with other accesses to the same area of the variable.

  Fixes a FIXME.

  llvm-svn: 191524
* [SystemZ] Add unsigned compare-and-branch instructions | Richard Sandiford | 2013-09-18 | 25 | -154/+781
  For some reason I never got around to adding these at the same time as the signed versions. No idea why.

  I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether it should just be replaced with an "is normal" flag. I'll leave that for later though.

  There are some boundary conditions that can be tweaked, such as preferring unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256", but again I'll leave those for a separate patch.

  llvm-svn: 190930
* [SystemZ] Improve extload handling | Richard Sandiford | 2013-09-16 | 1 | -2/+2
  The port originally had special patterns for extload, mapping them to the same instructions as sextload. It seemed neater to have patterns that match "an extension that is allowed to be signed" and "an extension that is allowed to be unsigned".

  This was originally meant to be a clean-up, but it does improve the handling of promoted integers a little, as shown by args-06.ll.

  llvm-svn: 190777
* [SystemZ] Try to fold shifts into TMxx | Richard Sandiford | 2013-09-13 | 2 | -0/+80
  E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4".
  llvm-svn: 190672
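The fold in r190672 rests on a simple identity: testing mask M of (x >> n) is the same as testing mask (M << n) of x, as long as the shifted mask still fits in the 16 bits that TMLL examines. A minimal Python sketch of that identity, assuming a logical right shift; `fold_srl_into_tmll` is a hypothetical helper for illustration, not the backend's code:

```python
def fold_srl_into_tmll(shift, mask):
    """Return the mask for a direct TMLL on x, or None if the fold does not apply.

    Hypothetical model of: "SRL x, shift; TMLL x, mask" => "TMLL x, mask << shift".
    """
    shifted = mask << shift
    # TMLL only examines the low 16 bits of the register, so the
    # shifted mask must be nonzero and must not spill past bit 15.
    if shifted == 0 or shifted > 0xFFFF:
        return None
    return shifted


def bits_tested_after_shift(x, shift, mask):
    """Bits selected by the original two-instruction sequence."""
    return (x >> shift) & mask


def bits_tested_directly(x, folded_mask):
    """Bits selected by the single folded instruction."""
    return x & folded_mask
```

With the example from the log, `fold_srl_into_tmll(2, 1)` gives 4, and both forms select a zero or nonzero result for exactly the same inputs.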
* [SystemZ] Add TM and TMY | Richard Sandiford | 2013-09-10 | 1 | -0/+245
  The main complication here is that TM and TMY (the memory forms) set CC differently from the register forms. When the tested bits contain some 0s and some 1s, the register forms set CC to 1 or 2 based on the value of the uppermost bit. The memory forms instead set CC to 1 regardless of the uppermost bit.

  Until now, I've tried to make it so that a branch never tests for an impossible CC value. E.g. NR only sets CC to 0 or 1, so branches on the result will only test for 0 or 1. Originally I'd tried to do the same thing for TM and TMY by using custom matching code in ISelDAGToDAG. That ended up being very ugly though, and would have meant duplicating some of the chain checks that the common isel code does.

  I've therefore gone for the simpler alternative of adding an extra operand to the TM DAG opcode to say whether a memory form would be OK. This means that the inverse of a "TM;JE" is "TM;JNE" rather than the more precise "TM;JNLE", just like the inverse of "TMLL;JE" is "TMLL;JNE". I suppose that's arguably less confusing though...

  llvm-svn: 190400
* [SystemZ] Tweak integer comparison code | Richard Sandiford | 2013-09-06 | 1 | -0/+101
  The architecture has many comparison instructions, including some that extend one of the operands. The signed comparison instructions use sign extensions and the unsigned comparison instructions use zero extensions.

  In cases where we had a free choice between signed or unsigned comparisons, we were trying to decide at lowering time which would best fit the available instructions, taking things like extension type into account. The code to do that was getting increasingly hairy and was also making some bad decisions. E.g. when comparing the result of two LLCs, it is better to use CR rather than CLR, since CR can be fused with a branch while CLR can't.

  This patch removes the lowering code and instead adds an operand to integer comparisons to say whether signed comparison is required, whether unsigned comparison is required, or whether either is OK. We can then leave the choice of instruction up to the normal isel code.

  llvm-svn: 190138
* [SystemZ] Use XC for a memset of 0 | Richard Sandiford | 2013-09-06 | 1 | -42/+26
  llvm-svn: 190130
* [SystemZ] Add NC, OC and XC | Richard Sandiford | 2013-09-05 | 3 | -0/+511
  For now these are just used to handle scalar ANDs, ORs and XORs in which all operands are memory.
  llvm-svn: 190041
* [SystemZ] Add support for TMHH, TMHL, TMLH and TMLL | Richard Sandiford | 2013-09-03 | 1 | -0/+352
  For now this just handles simple comparisons of an ANDed value with zero. The CC value provides enough information to do any comparison for a 2-bit mask, and some nonzero comparisons with more populated masks, but that's all future work.
  llvm-svn: 189819
* [SystemZ] Add support for TMHH, TMHL, TMLH and TMLL | Richard Sandiford | 2013-08-28 | 3 | -2/+294
  For now this just handles simple comparisons of an ANDed value with zero. The CC value provides enough information to do any comparison for a 2-bit mask, and some nonzero comparisons with more populated masks, but that's all future work.
  llvm-svn: 189469
* [SystemZ] Extend memcmp support to all constant lengths | Richard Sandiford | 2013-08-28 | 2 | -4/+96
  This uses the infrastructure added for memcpy and memmove in r189331.
  llvm-svn: 189458
* [SystemZ] Extend memcpy and memset support to all constant lengths | Richard Sandiford | 2013-08-27 | 5 | -29/+224
  Lengths up to a certain threshold (currently 6 * 256) use a series of MVCs. Lengths above that threshold use a loop to handle X*256 bytes followed by a single MVC to handle the excess (if any). This loop will also be needed in future when support for variable lengths is added.

  Because the same tablegen classes are used to define MVC and CLC, the patch also has the side-effect of defining a pseudo loop instruction for CLC. That instruction isn't used yet (and wouldn't be handled correctly if it were). I'm planning to use it soon though.

  llvm-svn: 189331
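The length split described in r189331 can be sketched as a small planner. This is only an illustration under stated assumptions: that one MVC copies at most 256 bytes, and that the 6 * 256 threshold is inclusive ("up to"). `plan_copy` and its return shape are hypothetical, not the backend's representation.

```python
MVC_MAX = 256                     # assumed maximum bytes handled by one MVC
INLINE_THRESHOLD = 6 * MVC_MAX    # threshold quoted in the commit message


def plan_copy(length):
    """Describe how a constant-length copy would be split up.

    Returns ("inline", [mvc lengths]) or ("loop", iterations, tail length).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    full, tail = divmod(length, MVC_MAX)
    if length <= INLINE_THRESHOLD:
        # A straight-line series of MVCs, the last one covering the remainder.
        chunks = [MVC_MAX] * full
        if tail:
            chunks.append(tail)
        return ("inline", chunks)
    # A loop over 256-byte blocks, then one MVC for the excess (if any).
    return ("loop", full, tail)
```

For example, a 600-byte copy becomes three inline MVCs of 256, 256 and 88 bytes, while a 2048-byte copy becomes a loop of eight iterations with no trailing MVC.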
* [SystemZ] Add basic prefetch support | Richard Sandiford | 2013-08-23 | 1 | -0/+87
  Just the instructions and intrinsics for now.
  llvm-svn: 189100
* [SystemZ] Try reversing comparisons whose first operand is in memory | Richard Sandiford | 2013-08-23 | 18 | -2/+430
  This allows us to make more use of the many compare reg,mem instructions.
  llvm-svn: 189099
* [SystemZ] Prefer LHI;ST... over LAY;MV... | Richard Sandiford | 2013-08-23 | 6 | -48/+49
  If we had a store of an integer to memory, and the integer and store size were suitable for a form of MV..., we used MV... no matter what. We could then have sequences like:

      lay %r2, 0(%r3,%r4)
      mvi 0(%r2), 4

  In these cases it seems better to force the constant into a register and use a normal store:

      lhi %r2, 4
      stc %r2, 0(%r3, %r4)

  since %r2 is more likely to be hoisted and is easier to rematerialize.

  llvm-svn: 189098
* Turn MipsOptimizeMathLibCalls into a target-independent scalar transform | Richard Sandiford | 2013-08-23 | 2 | -1/+31
  ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available.

  The pass is opt-in because at the moment it only handles sqrt.

  llvm-svn: 189097
* [SystemZ] Define remaining *MUL_LOHI patterns | Richard Sandiford | 2013-08-21 | 2 | -3/+63
  The initial port used MLG(R) for i64 UMUL_LOHI but left the other three combinations as not-legal-or-custom. Although 32x32->{32,32} multiplications exist, they're not as quick as doing a normal 64-bit multiplication, so it didn't seem like i32 SMUL_LOHI and UMUL_LOHI would be useful. There's also no direct instruction for i64 SMUL_LOHI, so it needs to be implemented in terms of UMUL_LOHI.

  However, not defining these patterns means that we don't convert division by a constant into multiplication, so this patch fills in the other cases. The new i64 SMUL_LOHI sequence is simpler than the one that we used previously for 64x64->128 multiplication, so int-mul-08.ll now tests the full sequence.

  llvm-svn: 188898
* [SystemZ] Use FI[EDX]BRA for codegen | Richard Sandiford | 2013-08-21 | 2 | -6/+315
  llvm-svn: 188895
* [SystemZ] Use SRST to optimize memchr | Richard Sandiford | 2013-08-20 | 2 | -0/+78
  SystemZTargetLowering::emitStringWrapper() previously loaded the character into R0 before the loop and made R0 live on entry. I'd forgotten that allocatable registers weren't allowed to be live across blocks at this stage, and it confused LiveVariables enough to cause a miscompilation of f3 in memchr-02.ll.

  This patch instead loads R0 in the loop and leaves LICM to hoist it after RA. This is actually what I'd tried originally, but I went for the manual optimisation after noticing that R0 often wasn't being hoisted. This bug forced me to go back and look at why, now fixed as r188774.

  We should also try to optimize null checks so that they test the CC result of the SRST directly. The select between null and the SRST GPR result could then usually be deleted as dead.

  llvm-svn: 188779
* Fix test typo and add usual "br %r14" test | Richard Sandiford | 2013-08-20 | 1 | -1/+2
  llvm-svn: 188775
* Fix overly pessimistic shortcut in post-RA MachineLICM | Richard Sandiford | 2013-08-20 | 1 | -0/+22
  Post-RA LICM keeps three sets of registers: PhysRegDefs, PhysRegClobbers and TermRegs. When it sees a definition of R it adds all aliases of R to the corresponding set, so that when it needs to test for membership it only needs to test a single register, rather than worrying about aliases there too. E.g. the final candidate loop just has:

      unsigned Def = Candidates[i].Def;
      if (!PhysRegClobbers.test(Def) && ...) {

  to test whether register Def is multiply defined.

  However, there was also a shortcut in ProcessMI to make sure we didn't add candidates if we already knew that they would fail the final test. This shortcut was more pessimistic than the final one because it checked whether _any alias_ of the defined register was multiply defined.

  This is too conservative for targets that define register pairs. E.g. on z, R0 and R1 are sometimes used as a pair, so there is a 128-bit register that aliases both R0 and R1. If a loop used R0 and R1 independently, and the definition of R0 came first, we would be able to hoist the R0 assignment (because that used the final test quoted above) but not the R1 assignment (because that meant we had two definitions of the paired R0/R1 register and would fail the shortcut in ProcessMI).

  This patch just uses the same check for the ProcessMI shortcut as we use in the final candidate loop.

  llvm-svn: 188774
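The R0/R1 scenario from r188774 can be replayed with a toy model. This is a simplified sketch, not MachineLICM itself: it tracks only the PhysRegDefs and PhysRegClobbers sets, ignores TermRegs and the other conditions elided as "..." in the quoted test, and uses a hypothetical three-register alias table in which "R0R1" stands for the 128-bit pair.

```python
# Hypothetical alias table: each register maps to itself plus its aliases.
ALIASES = {
    "R0": {"R0", "R0R1"},
    "R1": {"R1", "R0R1"},
    "R0R1": {"R0R1", "R0", "R1"},
}


def hoist_candidates(defs_in_order, use_old_shortcut):
    """Return the registers whose definitions survive as hoisting candidates."""
    phys_reg_defs = set()
    phys_reg_clobbers = set()
    candidates = []
    for reg in defs_in_order:
        ruled_out = False
        for alias in sorted(ALIASES[reg]):
            if alias in phys_reg_defs:
                # A second definition of this alias marks it as clobbered.
                phys_reg_clobbers.add(alias)
                if use_old_shortcut:
                    # Old behaviour: any multiply-defined alias rules reg out.
                    ruled_out = True
            phys_reg_defs.add(alias)
        if not use_old_shortcut and reg in phys_reg_clobbers:
            # New behaviour: same single-register test as the final loop.
            ruled_out = True
        if not ruled_out:
            candidates.append(reg)
    # Final candidate loop: only the defined register itself is tested.
    return [reg for reg in candidates if reg not in phys_reg_clobbers]
```

In this model, `hoist_candidates(["R0", "R1"], use_old_shortcut=True)` keeps only R0, while the same call with `use_old_shortcut=False` keeps both, matching the behaviour the commit message describes.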
* [SystemZ] Add negative integer absolute (load negative) | Richard Sandiford | 2013-08-19 | 1 | -0/+91
  For now this matches the equivalent of (neg (abs ...)), which did hit a few times in projects/test-suite. We should probably also match cases where absolute-like selects are used with reversed arguments.
  llvm-svn: 188671
* [SystemZ] Add integer absolute (load positive) | Richard Sandiford | 2013-08-19 | 1 | -0/+83
  llvm-svn: 188670
* [SystemZ] Add support for sibling calls | Richard Sandiford | 2013-08-19 | 3 | -154/+125
  This first cut is pretty conservative. The final argument register (R6) is call-saved, so we would need to make sure that the R6 argument to a sibling call is the same as the R6 argument to the calling function, which seems worth keeping as a separate patch.

  Saying that integer truncations are free means that we no longer use the extending instructions LGF and LLGF for spills in int-conv-09.ll and int-conv-10.ll. Instead we treat the registers as 64 bits wide and truncate them to 32 bits where necessary. I think it's unlikely we'd use LGF and LLGF for spills in other situations for the same reason, so I'm removing the tests rather than replacing them. The associated code is generic and applies to many more instructions than just LGF and LLGF, so there is no corresponding code removal.

  llvm-svn: 188669
* [SystemZ] Use SRST to implement strlen and strnlen | Richard Sandiford | 2013-08-16 | 2 | -0/+78
  It would also make sense to use it for memchr; I'm working on that now.
  llvm-svn: 188547
* [SystemZ] Use MVST to implement strcpy and stpcpy | Richard Sandiford | 2013-08-16 | 1 | -0/+50
  llvm-svn: 188546
* [SystemZ] Use CLST to implement strcmp | Richard Sandiford | 2013-08-16 | 2 | -0/+142
  llvm-svn: 188544
* [SystemZ] Fix handling of 64-bit memcmp results | Richard Sandiford | 2013-08-16 | 2 | -1/+136
  Generalize r188163 to cope with return types other than MVT::i32, just as the existing visitMemCmpCall code did. I've split this out into a subroutine so that it can be used for other upcoming patches.

  I also noticed that I'd used the wrong API to record the out chain. It's a load that uses DAG.getRoot() rather than getRoot(), so the out chain should go on PendingLoads. I don't have a testcase for that because we don't do any interesting scheduling on z yet.

  llvm-svn: 188540
* [SystemZ] Fix sign of integer memcmp result | Richard Sandiford | 2013-08-16 | 1 | -8/+7
  r188163 used CLC to implement memcmp. Code that compares the result directly against zero can test the CC value produced by CLC, but code that needs an integer result must use IPM. The sequence I'd used was:

      ipm <reg>
      sll <reg>, 2
      sra <reg>, 30

  but I'd forgotten that this inverts the order, so that CC==1 ("less") becomes an integer greater than zero, and CC==2 ("greater") becomes an integer less than zero. This sequence should only be used if the CLC arguments are reversed to compensate. The problem then is that the branch condition must also be reversed when testing the CLC result directly.

  Rather than do that, I went for a different sequence that works with the natural CLC order:

      ipm <reg>
      srl <reg>, 28
      rll <reg>, <reg>, 31

  One advantage of this is that it doesn't clobber CC. A disadvantage is that any sign extension to 64 bits must be done separately, rather than being folded into the shifts.

  llvm-svn: 188538
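The two sequences discussed in r188538 can be checked by simulating them on 32-bit values. The sketch below assumes that IPM zeroes the top two bits of the 32-bit word and places CC in the two bits just below them, followed by a 4-bit program mask; that layout is my reading of the architecture rather than something stated in the log, and all helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def ipm(cc, program_mask=0, old=0):
    """Assumed IPM layout: 00 | CC (2 bits) | program mask (4 bits) | old low 24 bits."""
    return ((cc & 3) << 28) | ((program_mask & 0xF) << 24) | (old & 0x00FFFFFF)


def sll(x, n):
    return (x << n) & MASK32


def srl(x, n):
    return (x & MASK32) >> n


def sra(x, n):
    # Python's >> on a negative int is an arithmetic shift.
    return (to_signed(x) >> n) & MASK32


def rll(x, n):
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32 if n else x & MASK32


def old_sequence(cc):
    """ipm; sll 2; sra 30"""
    return to_signed(sra(sll(ipm(cc), 2), 30))


def new_sequence(cc):
    """ipm; srl 28; rll 31"""
    return to_signed(rll(srl(ipm(cc), 28), 31))
```

Under these assumptions the old sequence maps CC 1 ("less") to a positive value and CC 2 ("greater") to a negative one, which is the inversion the commit message describes, while the new sequence gives a negative result for CC 1 and a positive result for CC 2.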
* [tests] Cleanup initialization of test suffixes. | Daniel Dunbar | 2013-08-16 | 1 | -2/+0
  - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
  - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED).
  - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code.
  llvm-svn: 188513
* [SystemZ] Use CLC and IPM to implement memcmp | Richard Sandiford | 2013-08-12 | 1 | -0/+134
  For now this is restricted to fixed-length comparisons with a length in the range [1, 256], as for memcpy() and MVC.
  llvm-svn: 188163
* [SystemZ] Optimize floating-point comparisons with zero | Richard Sandiford | 2013-08-07 | 1 | -0/+348
  This follows the same lines as the integer code. In the end it seemed easier to have a second 4-bit mask in TSFlags to specify the compare-like CC values. That eats one more TSFlags bit than adding a CCHasUnordered would have done, but it feels more concise.
  llvm-svn: 187883
* [SystemZ] Add floating-point load-and-test instructions | Richard Sandiford | 2013-08-07 | 3 | -0/+39
  These instructions can also be used as comparisons with zero.
  llvm-svn: 187882
* [SystemZ] Use BRCT and BRCTG to eliminate add-&-compare sequences | Richard Sandiford | 2013-08-05 | 3 | -1/+237
  This patch just uses a peephole test for "add; compare; branch" sequences within a single block. The IR optimizers already convert loops to decrement-and-branch-on-nonzero form in some cases, so even this simplistic test triggers many times during a clang bootstrap and projects/test-suite run.

  It looks like there are still cases where we need to more strongly prefer branches on nonzero though. E.g. I saw a case where a loop that started out with a check for 0 ended up with a check for -1. I'll try to look at that sometime.

  I ended up adding the Reference class because MachineInstr::readsRegister() doesn't check for subregisters (by design, as far as I could tell).

  llvm-svn: 187723
* [SystemZ] Use LOAD AND TEST to eliminate comparisons against zero | Richard Sandiford | 2013-08-05 | 1 | -0/+223
  llvm-svn: 187720
* [SystemZ] Reuse CC results for integer comparisons with zero | Richard Sandiford | 2013-08-01 | 2 | -0/+691
  This also fixes a bug in the predication of LR to LOCR: I'd forgotten that with these in-place instruction builds, the implicit operands need to be added manually. I think this was latent until now, but is tested by int-cmp-45.c. It also adds a CC valid mask to STOC, again tested by int-cmp-45.c.
  llvm-svn: 187573
* [SystemZ] Prefer comparisons with zero | Richard Sandiford | 2013-08-01 | 5 | -10/+54
  Convert >= 1 to > 0, etc. Using comparison with zero isn't a win on its own, but it exposes more opportunities for CC reuse (the next patch).
  llvm-svn: 187571
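The canonicalisation in r187571 can be illustrated with a small rewrite table. Only ">= 1 to > 0" is given in the log; the other three rows are my assumption about what "etc." covers, and they hold for signed integers. `prefer_zero` and `evaluate` are hypothetical helpers.

```python
import operator

# Signed-integer rewrites that replace a comparison against 1 or -1
# with an equivalent comparison against 0.
REWRITES = {
    (">=", 1): (">", 0),    # from the commit message
    ("<", 1): ("<=", 0),    # assumed
    (">", -1): (">=", 0),   # assumed
    ("<=", -1): ("<", 0),   # assumed
}

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def prefer_zero(op, constant):
    """Return an equivalent (op, constant) pair that compares with zero if possible."""
    return REWRITES.get((op, constant), (op, constant))


def evaluate(x, op, constant):
    """Evaluate 'x op constant' for a signed integer x."""
    return OPS[op](x, constant)
```

For any signed integer x, `evaluate(x, ">=", 1)` and `evaluate(x, *prefer_zero(">=", 1))` agree; comparisons that have no zero form, such as `(">", 5)`, are returned unchanged.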