summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [SystemZ] Use A(G)SI when spilling the target of a constant additionRichard Sandiford2013-10-151-2/+24
| | | | llvm-svn: 192681
* [SystemZ] Add comparisons of high words and memoryRichard Sandiford2013-10-011-0/+8
| | | | llvm-svn: 191777
* [SystemZ] Add comparisons of large immediates using high wordsRichard Sandiford2013-10-011-0/+8
| | | | | | | There are no corresponding patterns for small immediates because they would prevent the use of fused compare-and-branch instructions. llvm-svn: 191775
* [SystemZ] Add immediate addition involving high wordsRichard Sandiford2013-10-011-2/+50
| | | | llvm-svn: 191774
* [SystemZ] Extend test-under-mask support to high GR32sRichard Sandiford2013-10-011-0/+8
| | | | llvm-svn: 191773
* [SystemZ] Allow integer AND involving high wordsRichard Sandiford2013-10-011-33/+50
| | | | llvm-svn: 191762
* [SystemZ] Allow integer XOR involving high wordsRichard Sandiford2013-10-011-0/+4
| | | | llvm-svn: 191759
* [SystemZ] Allow integer OR involving high wordsRichard Sandiford2013-10-011-0/+12
| | | | llvm-svn: 191755
* [SystemZ] Allow integer insertions with a high-word destinationRichard Sandiford2013-10-011-0/+8
| | | | llvm-svn: 191753
* [SystemZ] Add patterns to load a constant into a high word (IIHF)Richard Sandiford2013-10-011-0/+24
| | | | | | | Similar to low words, we can use the shorter LLIHL and LLIHH if it turns out that the other half of the GR64 isn't live. llvm-svn: 191750
* [SystemZ] Add register zero extensions involving at least one high wordRichard Sandiford2013-10-011-0/+19
| | | | llvm-svn: 191746
* [SystemZ] Add truncating high-word stores (STCH and STHH)Richard Sandiford2013-10-011-0/+8
| | | | llvm-svn: 191743
* [SystemZ] Add zero-extending high-word loads (LLCH and LLHH)Richard Sandiford2013-10-011-0/+8
| | | | llvm-svn: 191742
* [SystemZ] Add sign-extending high-word loads (LBH and LHH)Richard Sandiford2013-10-011-0/+8
| | | | llvm-svn: 191740
* [SystemZ] Use upper words of GR64s for codegenRichard Sandiford2013-10-011-5/+73
| | | | | | | | | | | | | | This just adds the basics necessary for allocating the upper words to virtual registers (move, load and store). The move support is parameterised in a way that makes it easy to handle zero extensions, but the associated zero-extend patterns are added by a later patch. The easiest way of testing this seemed to be add a new "h" register constraint for high words. I don't expect the constraint to be useful in real inline asms, but it should work, so I didn't try to hide it behind an option. llvm-svn: 191739
* [SystemZ] Rename subregs and add subreg_h32Richard Sandiford2013-09-301-6/+6
| | | | | | | | | | | | | Use subreg_hNN and subreg_lNN for the high and low NN bits of a register. List the low registers first, so that subreg_l32 also means the low 32 bits of a 128-bit register. Floats are stored in the upper 32 bits of a 64-bit register, so they should use subreg_h32 rather than subreg_l32. No behavioral change intended. llvm-svn: 191659
* [SystemZ] Rein back the use of block operationsRichard Sandiford2013-09-271-4/+8
| | | | | | | | | | | | | | The backend tries to use block operations like MVC, NC, OC and XC for simple scalar operations. For correctness reasons, it rejects any case in which the regions might partially overlap. However, for performance reasons, it should also reject cases where the regions might be equal, since the instruction might then not use the fast path. This fixes a performance regression seen in bzip2. We may want to limit the optimisation even more in future, or even remove it entirely, but I'll try with this for now. llvm-svn: 191525
* [SystemZ] Define the GR64 low-word logic instructions as pseudo aliases.Richard Sandiford2013-09-251-6/+6
| | | | | | | | Another patch to avoid duplication of encoding information. Things like NILF, NILL and NILH are used as both 32-bit and 64-bit instructions. Here the 64-bit versions are defined as aliases of the 32-bit ones. llvm-svn: 191369
* [SystemZ] Use subregs for 64-bit truncating storesRichard Sandiford2013-09-251-1/+1
| | | | | | | | Another patch to reduce the duplication of encoding information. Rather than define separate patterns for truncating 64-bit stores, use the 32-bit stores with a subreg. No behavioral changed intended. llvm-svn: 191365
* [SystemZ] Add unsigned compare-and-branch instructionsRichard Sandiford2013-09-181-0/+18
| | | | | | | | | | | | | | | For some reason I never got around to adding these at the same time as the signed versions. No idea why. I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether it should just be replaced with an "is normal" flag. I'll leave that for later though. There are some boundary conditions that can be tweaked, such as preferring unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256", but again I'll leave those for a separate patch. llvm-svn: 190930
* [SystemZ] Fix handling of 64-bit memcmp resultsRichard Sandiford2013-08-161-0/+7
| | | | | | | | | | | | | Generalize r188163 to cope with return types other than MVT::i32, just as the existing visitMemCmpCall code did. I've split this out into a subroutine so that it can be used for other upcoming patches. I also noticed that I'd used the wrong API to record the out chain. It's a load that uses DAG.getRoot() rather than getRoot(), so the out chain should go on PendingLoads. I don't have a testcase for that because we don't do any interesting scheduling on z yet. llvm-svn: 188540
* [SystemZ] Fix sign of integer memcmp resultRichard Sandiford2013-08-161-29/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r188163 used CLC to implement memcmp. Code that compares the result directly against zero can test the CC value produced by CLC, but code that needs an integer result must use IPM. The sequence I'd used was: ipm <reg> sll <reg>, 2 sra <reg>, 30 but I'd forgotten that this inverts the order, so that CC==1 ("less") becomes an integer greater than zero, and CC==2 ("greater") becomes an integer less than zero. This sequence should only be used if the CLC arguments are reversed to compensate. The problem then is that the branch condition must also be reversed when testing the CLC result directly. Rather than do that, I went for a different sequence that works with the natural CLC order: ipm <reg> srl <reg>, 28 rll <reg>, <reg>, 31 One advantage of this is that it doesn't clobber CC. A disadvantage is that any sign extension to 64 bits must be done separately, rather than being folded into the shifts. llvm-svn: 188538
* [SystemZ] Use CLC and IPM to implement memcmpRichard Sandiford2013-08-121-0/+93
| | | | | | | For now this is restricted to fixed-length comparisons with a length in the range [1, 256], as for memcpy() and MVC. llvm-svn: 188163
* [SystemZ] Optimize floating-point comparisons with zeroRichard Sandiford2013-08-071-0/+3
| | | | | | | | | This follows the same lines as the integer code. In the end it seemed easier to have a second 4-bit mask in TSFlags to specify the compare-like CC values. That eats one more TSFlags bit than adding a CCHasUnordered would have done, but it feels more concise. llvm-svn: 187883
* [SystemZ] Use BRCT and BRCTG to eliminate add-&-compare sequencesRichard Sandiford2013-08-051-0/+8
| | | | | | | | | | | | | | | | This patch just uses a peephole test for "add; compare; branch" sequences within a single block. The IR optimizers already convert loops to decrement-and-branch-on-nonzero form in some cases, so even this simplistic test triggers many times during a clang bootstrap and projects/test-suite run. It looks like there are still cases where we need to more strongly prefer branches on nonzero though. E.g. I saw a case where a loop that started out with a check for 0 ended up with a check for -1. I'll try to look at that sometime. I ended up adding the Reference class because MachineInstr::readsRegister() doesn't check for subregisters (by design, as far as I could tell). llvm-svn: 187723
* [SystemZ] Use LOAD AND TEST to eliminate comparisons against zeroRichard Sandiford2013-08-051-0/+13
| | | | llvm-svn: 187720
* [SystemZ] Reuse CC results for integer comparisons with zeroRichard Sandiford2013-08-011-1/+2
| | | | | | | | | | This also fixes a bug in the predication of LR to LOCR: I'd forgotten that with these in-place instruction builds, the implicit operands need to be added manually. I think this was latent until now, but is tested by int-cmp-45.c. It also adds a CC valid mask to STOC, again tested by int-cmp-45.c. llvm-svn: 187573
* [SystemZ] Be more careful about inverting CC masksRichard Sandiford2013-07-311-22/+30
| | | | | | | | | | | | | | | | | | | | | | | | System z branches have a mask to select which of the 4 CC values should cause the branch to be taken. We can invert a branch by inverting the mask. However, not all instructions can produce all 4 CC values, so inverting the branch like this can lead to some oddities. For example, integer comparisons only produce a CC of 0 (equal), 1 (less) or 2 (greater). If an integer EQ is reversed to NE before instruction selection, the branch will test for 1 or 2. If instead the branch is reversed after instruction selection (by inverting the mask), it will test for 1, 2 or 3. Both are correct, but the second isn't really canonical. This patch therefore keeps track of which CC values are possible and uses this when inverting a mask. Although this is mostly cosmestic, it fixes undefined behavior for the CIJNLH in branch-08.ll. Another fix would have been to mask out bit 0 when generating the fused compare and branch, but the point of this patch is that we shouldn't need to do that in the first place. The patch also makes it easier to reuse CC results from other instructions. llvm-svn: 187495
* [SystemZ] Move compare-and-branch generation even laterRichard Sandiford2013-07-311-103/+0
| | | | | | | | | | | | | | | | | | | | | | | r187116 moved compare-and-branch generation from the instruction-selection pass to the peephole optimizer (via optimizeCompare). It turns out that even this is a bit too early. Fused compare-and-branch instructions don't interact well with predication, where a CC result is needed. They also make it harder to reuse the CC side-effects of earlier instructions (not yet implemented, but the subject of a later patch). Another problem was that the AnalyzeBranch family of routines weren't handling compares and branches, so we weren't able to reverse the fused form in cases where we would reverse a separate branch. This could have been fixed by extending AnalyzeBranch, but given the other problems, I've instead moved the fusing to the long-branch pass, which is also responsible for the opposite transformation: splitting out-of-range compares and branches into separate compares and long branches. I've added a test for the AnalyzeBranch problem. A test for the predication problem is included in the next patch, which fixes a bug in the choice of CC mask. llvm-svn: 187494
* [SystemZ] Postpone NI->RISBG conversion to convertToThreeAddress()Richard Sandiford2013-07-311-13/+127
| | | | | | | | | | | | | | | | | | | | | | r186399 aggressively used the RISBG instruction for immediate ANDs, both because it can handle some values that AND IMMEDIATE can't, and because it allows the destination register to be different from the source. I realized later while implementing the distinct-ops support that it would be better to leave the choice up to convertToThreeAddress() instead. The AND IMMEDIATE form is shorter and is less likely to be cracked. This is a problem for 32-bit ANDs because we assume that all 32-bit operations will leave the high word untouched, whereas RISBG used in this way will either clear the high word or copy it from the source register. The patch uses the z196 instruction RISBLG for this instead. This means that z10 will be restricted to NILL, NILH and NILF for 32-bit ANDs, but I think that should be OK for now. Although we're using z10 as the base architecture, the optimization work is going to be focused more on z196 and zEC12. llvm-svn: 187492
* [SystemZ] Rework compare and branch supportRichard Sandiford2013-07-251-0/+103
| | | | | | | | | | | | | | Before the patch we took advantage of the fact that the compare and branch are glued together in the selection DAG and fused them together (where possible) while emitting them. This seemed to work well in practice. However, fusing the compare so early makes it harder to remove redundant compares in cases where CC already has a suitable value. This patch therefore uses the peephole analyzeCompare/optimizeCompareInstr pair of functions instead. No behavioral change intended, but it paves the way for a later patch. llvm-svn: 187116
* [SystemZ] Add LOCR and LOCGRRichard Sandiford2013-07-251-0/+52
| | | | llvm-svn: 187113
* [SystemZ] Use SLLK, SRLK and SRAK for codegenRichard Sandiford2013-07-191-2/+45
| | | | | | This patch uses the instructions added in r186680 for codegen. llvm-svn: 186681
* [SystemZ] Improve spilling of LGDR and LDGRRichard Sandiford2013-07-121-1/+23
| | | | | | | If the source of these instructions is spilled we should load the destination. If the destination is spilled we should store the source. llvm-svn: 186147
* [SystemZ] Remove no-op MVCsRichard Sandiford2013-07-051-0/+25
| | | | | | | | | | | The stack coloring pass has code to delete stores and loads that become trivially dead after coloring. Extend it to cope with single instructions that copy from one frame index to another. The testcase happens to show an example of this kicking in at the moment. It did occur in Real Code too though. llvm-svn: 185705
* [SystemZ] Remove redundant frame MMOsRichard Sandiford2013-07-051-24/+4
| | | | | | | | | | This fixes foldMemoryOperandImpl() so that it doesn't create duplicated frame MMOs. I hadn't realized when writing r185434 that it was the caller's responsibility to add these. No behavioural change intended. llvm-svn: 185704
* [SystemZ] Enable the use of MVC for frame-to-frame spillsRichard Sandiford2013-07-051-10/+2
| | | | | | | | | | ...now that the problem that prompted the restriction has been fixed. The original spill-02.py was a compromise because at the time I couldn't find an example that actually failed without the two scavenging slots. The version included here did. llvm-svn: 185701
* [SystemZ] Fold more spillsRichard Sandiford2013-07-031-0/+24
| | | | | | | | | | | | | | | Add a mapping from register-based <INSN>R instructions to the corresponding memory-based <INSN>. Use it to cut down on the number of spill loads. Some instructions extend their operands from smaller fields, so this required a new TSFlags field to say how big the unextended operand is. This optimisation doesn't trigger for C(G)R and CL(G)R because in practice we always combine those instructions with a branch. Adding a test for every other case probably seems excessive, but it did catch a missed optimisation for DSGF (fixed in r185435). llvm-svn: 185529
* SystemZInstrInfo.cpp: Tweak an assertion. [-Wunused-variable]NAKAMURA Takumi2013-07-031-2/+2
| | | | llvm-svn: 185499
* SystemZ: Fold variable into assertion.Benjamin Kramer2013-07-021-2/+2
| | | | llvm-svn: 185475
* [SystemZ] Use MVC to spill loads and storesRichard Sandiford2013-07-021-1/+93
| | | | | | | | | | | | | | | | | | | | | Try to use MVC when spilling the destination of a simple load or the source of a simple store. As explained in the comment, this doesn't yet handle the case where the load or store location is also a frame index, since that could lead to two simultaneous scavenger spills, something the backend can't handle yet. spill-02.py tests that this restriction kicks in, but unfortunately I've not yet found a case that would fail without it. The volatile trick I used for other scavenger tests doesn't work here because we can't use MVC for volatile accesses anyway. I'm planning on relaxing the restriction later, hopefully with a test that does trigger the problem... Tests @f8 and @f9 also showed that L(G)RL and ST(G)RL were wrongly classified as SimpleBDX{Load,Store}. It wouldn't be easy to test for that bug separately, which is why I didn't split out the fix as a separate patch. llvm-svn: 185434
* Don't cache the instruction and register info from the TargetMachine, becauseBill Wendling2013-06-071-1/+1
| | | | | | | | the internals of TargetMachine could change. No functionality change intended. llvm-svn: 183567
* [SystemZ] Immediate compare-and-branch supportRichard Sandiford2013-05-291-1/+8
| | | | | | This patch adds support for the CIJ and CGIJ instructions. llvm-svn: 182846
* [SystemZ] Register compare-and-branch supportRichard Sandiford2013-05-281-2/+27
| | | | | | | | | | | | | | This patch adds support for the CRJ and CGRJ instructions. Support for the immediate forms will be a separate patch. The architecture has a large number of comparison instructions. I think it's generally better to concentrate on using the "best" comparison instruction first and foremost, then only use something like CRJ if CR really was the natual choice of comparison instruction. The patch therefore opportunistically converts separate CR and BRC instructions into a single CRJ while emitting instructions in ISelLowering. llvm-svn: 182764
* [SystemZ] Tweak SystemZInstrInfo::isBranch() interfaceRichard Sandiford2013-05-281-26/+18
| | | | | | | This is needed for the upcoming compare-and-branch patch. No functional change intended. llvm-svn: 182762
* [SystemZ] Add long branch passRichard Sandiford2013-05-201-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this change, the SystemZ backend would use BRCL for all branches and only consider shortening them to BRC when generating an object file. E.g. a branch on equal would use the JGE alias of BRCL in assembly output, but might be shortened to the JE alias of BRC in ELF output. This was a useful first step, but it had two problems: (1) The z assembler isn't traditionally supposed to perform branch shortening or branch relaxation. We followed this rule by not relaxing branches in assembler input, but that meant that generating assembly code and then assembling it would not produce the same result as going directly to object code; the former would give long branches everywhere, whereas the latter would use short branches where possible. (2) Other useful branches, like COMPARE AND BRANCH, do not have long forms. We would need to do something else before supporting them. (Although COMPARE AND BRANCH does not change the condition codes, the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction during codegen, so that we can safely lower it to a separate compare and long branch where necessary. This is not a valid transformation for the assembler proper to make.) This patch therefore moves branch relaxation to a pre-emit pass. For now, calls are still shortened from BRASL to BRAS by the assembler, although this too is not really the traditional behaviour. The first test takes about 1.5s to run, and there are likely to be more tests in this vein once further branch types are added. The feeling on IRC was that 1.5s is a bit much for a single test, so I've restricted it to SystemZ hosts for now. The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests. A later patch will remove the {{g}}s from that directory. llvm-svn: 182274
* [SystemZ] Add back endUlrich Weigand2013-05-061-0/+444
| | | | | | | | | | | | | | This adds the actual lib/Target/SystemZ target files necessary to implement the SystemZ target. Note that at this point, the target cannot yet be built since the configure bits are missing. Those will be provided shortly by a follow-on patch. This version of the patch incorporates feedback from reviews by Chris Lattner and Anton Korobeynikov. Thanks to all reviewers! Patch by Richard Sandiford. llvm-svn: 181203
* Remove the SystemZ backend.Dan Gohman2011-10-241-439/+0
| | | | llvm-svn: 142878
* Move TargetRegistry and TargetSelect from Target to Support where they belong.Evan Cheng2011-08-241-1/+1
| | | | | | These are strictly utilities for registering targets and components. llvm-svn: 138450
* Next round of MC refactoring. This patch factor MC table instantiations, MCEvan Cheng2011-07-141-12/+0
| | | | | | registeration and creation code into XXXMCDesc libraries. llvm-svn: 135184
OpenPOWER on IntegriCloud