bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[SystemZ] Provide basic TargetTransformInfo implementation	Ulrich Weigand	2015-03-31	2	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \|	This hooks up the TargetTransformInfo machinery for SystemZ, and provides an implementation of getIntImmCost. In addition, the patch adds the isLegalICmpImmediate and isLegalAddImmediate TargetLowering overrides, and updates a couple of test cases where we now generate slightly better code. llvm-svn: 233688
*	[SystemZ] Fix LLVM crash on unoptimized code	Ulrich Weigand	2015-03-30	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Compiling the following function with -O0 would crash, since LLVM would hit an assertion in getTestUnderMaskCond: int test(unsigned long x) { return x >= 0 && x <= 15; } Fixed by detecting the case in the caller of getTestUnderMaskCond. llvm-svn: 233541
*	Change SystemZ large tests to use the existing long_tests property	David Blaikie	2015-03-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	(this is already used in Clang for a couple of tests) Reviewers: uweigand Differential Revision: http://reviews.llvm.org/D7965 llvm-svn: 230998
*	Update SystemZ/Large test generators to handle new gep IR syntax	David Blaikie	2015-02-27	14	-27/+27
\| \| \| \|	llvm-svn: 230810
*	Update SystemZ/Large test generators to handle new load IR syntax	David Blaikie	2015-02-27	14	-27/+27
\| \| \| \|	llvm-svn: 230809
*	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction	David Blaikie	2015-02-27	204	-2806/+2806
	Essentially the same as the GEP change in r230786.

	A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278)

	    import fileinput
	    import sys
	    import re

	    pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

	    for line in sys.stdin:
	      sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

	Reviewers: rafael, dexonsmith, grosser
	Differential Revision: http://reviews.llvm.org/D7649
	llvm-svn: 230794
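As a rough illustration of what the load migration does to a single line, the sketch below applies the same kind of substitution to two made-up test-case lines. The pattern is a reconstruction of the one quoted in the commit message (the page rendering dropped some characters), and `migrate_load` and the sample IR lines are hypothetical names chosen for this example.

```python
import re

# Reconstructed load-migration pattern (an assumption: the rendered commit
# message lost its '*' characters and escaped parentheses).
PAT = re.compile(
    r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))"
    r"(| addrspace\(\d+\) *)\*"
    r"($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast"
    r"|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)"
)


def migrate_load(line: str) -> str:
    """Insert the explicit result type in front of the pointer operand."""
    return PAT.sub(r"\1, \2\3*\4", line)


# Hypothetical old-syntax lines; lines without a load are left untouched.
print(migrate_load("  %v = load i32* %p\n"), end="")
print(migrate_load("  %w = load volatile float addrspace(1)* @g, align 4\n"), end="")
print(migrate_load("  ret void\n"), end="")
```

The address space stays attached to the pointer operand and only the pointee type is duplicated in front of it.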
*	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction	David Blaikie	2015-02-27	152	-1308/+1308
	One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type.

	This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions.

	* This doesn't modify gep operators, only instructions (operators will be handled separately)
	* Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes.
	* geps of vectors are transformed as:
	      getelementptr <4 x float*> %x, ...
	    ->getelementptr float, <4 x float*> %x, ...
	  Then, once the opaque pointer type is introduced, this will ultimately look like:
	      getelementptr float, <4 x ptr> %x
	  with the unambiguous interpretation that it is a vector of pointers to float.
	* address spaces remain on the pointer, not the type:
	      getelementptr float addrspace(1)* %x
	    ->getelementptr float, float addrspace(1)* %x
	  Then, eventually:
	      getelementptr float, ptr addrspace(1) %x

	Importantly, the massive amount of test case churn has been automated by some crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout; I then wrapped that in a shell script to handle replacing files, then used the usual find+xargs to migrate all the files.

	update.py:

	    import fileinput
	    import sys
	    import re

	    ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
	    normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

	    def conv(match, line):
	      if not match:
	        return line
	      line = match.groups()[0]
	      if len(match.groups()[5]) == 0:
	        line += match.groups()[2]
	      line += match.groups()[3]
	      line += ", "
	      line += match.groups()[1]
	      line += "\n"
	      return line

	    for line in sys.stdin:
	      if line.find("getelementptr ") == line.find("getelementptr inbounds"):
	        if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
	          line = conv(re.match(ibrep, line), line)
	      elif line.find("getelementptr ") != line.find("getelementptr ("):
	        line = conv(re.match(normrep, line), line)
	      sys.stdout.write(line)

	apply.sh:

	    for name in "$@"
	    do
	      python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
	      rm -f "$name.tmp"
	    done

	The actual commands:

	From llvm/src:
	    find test/ -name *.ll | xargs ./apply.sh
	From llvm/src/tools/clang:
	    find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
	From llvm/src/tools/polly:
	    find test/ -name *.ll | xargs ./apply.sh

	After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out).

	The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases.

	Reviewers: rafael, dexonsmith, grosser
	Differential Revision: http://reviews.llvm.org/D7636
	llvm-svn: 230786
*	[SystemZ] Support all TLS access models - CodeGen part	Ulrich Weigand	2015-02-18	7	-3/+120
	The current SystemZ back-end only supports the local-exec TLS access model. This patch adds all required CodeGen support for the other TLS models, which means in particular:
	- Expand initial-exec TLS accesses by loading TLS offsets from the GOT using @indntpoff relocations.
	- Expand general-dynamic and local-dynamic accesses by generating the appropriate calls to __tls_get_offset. Note that this routine has a non-standard ABI and requires loading the GOT pointer into %r12, so the patch also adds support for the GLOBAL_OFFSET_TABLE ISD node.
	- Add a new platform-specific optimization pass to remove redundant __tls_get_offset calls in the local-dynamic model (modeled after the corresponding X86 pass).
	- Add test cases verifying all access models and optimizations.
	llvm-svn: 229654
*	Use the integrated assembler as default on SystemZ	Ulrich Weigand	2015-01-13	22	-22/+22
\| \| \| \| \| \| \| \| \| \|	This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases deliberately using an invalid instruction in inline asm now have to use -no-integrated-as. llvm-svn: 225820
*	IR: Make metadata typeless in assembly	Duncan P. N. Exon Smith	2014-12-15	3	-13/+13
	Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802.
	- Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`.
	- Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics.

	So, assembly like this:

	    define @foo(i32 %v) {
	      call void @llvm.foo(metadata !{i32 %v}, metadata !0)
	      call void @llvm.foo(metadata !{i32 7}, metadata !0)
	      call void @llvm.foo(metadata !1, metadata !0)
	      call void @llvm.foo(metadata !3, metadata !0)
	      call void @llvm.foo(metadata !{metadata !3}, metadata !0)
	      ret void, !bar !2
	    }
	    !0 = metadata !{metadata !2}
	    !1 = metadata !{i32* @global}
	    !2 = metadata !{metadata !3}
	    !3 = metadata !{}

	turns into this:

	    define @foo(i32 %v) {
	      call void @llvm.foo(metadata i32 %v, metadata !0)
	      call void @llvm.foo(metadata i32 7, metadata !0)
	      call void @llvm.foo(metadata i32* @global, metadata !0)
	      call void @llvm.foo(metadata !3, metadata !0)
	      call void @llvm.foo(metadata !{!3}, metadata !0)
	      ret void, !bar !2
	    }
	    !0 = !{!2}
	    !1 = !{i32* @global}
	    !2 = !{!3}
	    !3 = !{}

	I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases.

	This is part of PR21532.
	llvm-svn: 224257
*	IR: add "cmpxchg weak" variant to support permitted failure.	Tim Northover	2014-06-13	4	-25/+50
	This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should.

	As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded.

	At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg.

	By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen.

	Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme.

	Summary for out of tree users:
	------------------------------
	+ Legacy Bitcode files are upgraded during read.
	+ Legacy assembly IR files will be invalid.
	+ Front-ends must adapt to different type for "cmpxchg".
	+ Backends should be unaffected by default.
	llvm-svn: 210903
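The value-plus-flag result described in this message can be modelled in a few lines. The following is only a toy, single-threaded sketch: `Cell`, `cmpxchg`, `atomic_add_weak` and the `spurious` parameter are hypothetical names invented for illustration, not LLVM APIs. It shows why a caller of the weak form needs a retry loop.

```python
from typing import Iterator, Tuple


class Cell:
    """Toy model of one memory location (hypothetical helper)."""

    def __init__(self, value: int) -> None:
        self.value = value


def cmpxchg(cell: Cell, expected: int, new: int,
            weak: bool = False, spurious: bool = False) -> Tuple[int, bool]:
    """Return (loaded value, success flag), mirroring the "{ iN, i1 }" result."""
    loaded = cell.value
    if loaded != expected:
        return loaded, False          # comparison failed: no store
    if weak and spurious:
        return loaded, False          # weak form may fail despite a match
    cell.value = new
    return loaded, True


def atomic_add_weak(cell: Cell, delta: int, failures: Iterator[bool]) -> int:
    """Retry a weak cmpxchg until it succeeds; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        old = cell.value
        _, ok = cmpxchg(cell, old, old + delta,
                        weak=True, spurious=next(failures, False))
        if ok:
            return attempts


cell = Cell(5)
print(cmpxchg(cell, 5, 7))                         # strong form, matching value
print(atomic_add_weak(cell, 1, iter([True, True])), cell.value)
```

The `failures` iterator makes the spurious failures deterministic so the sketch is reproducible; real hardware gives no such control.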
*	Reduce verbiage of lit.local.cfg files	Alp Toker	2014-06-09	2	-4/+2
\| \| \| \| \| \|	We can just split targets_to_build in one place and make it immutable. llvm-svn: 210496
*	Reenable use of TBAA during CodeGen	Hal Finkel	2014-04-12	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had disabled use of TBAA during CodeGen (even when otherwise using AA) because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to miss basic type punning that it should catch (and, thus, we'd fail to override TBAA when we should). However, when AA is in use during CodeGen, CGP now uses normal GEPs and bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a result, BasicAA should be able to make us do the right thing in the face of type-punning, and it seems safe to enable use of TBAA again. self-hosting seems fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle. Note: We still don't update TBAA when merging stack slots, although because BasicAA should now catch all such cases, this is no longer a blocking issue. Nevertheless, I plan to commit code to deal with this properly in the near future. llvm-svn: 206093
*	[SystemZ] Add support for z196 float<->unsigned conversions	Richard Sandiford	2014-03-21	6	-8/+135
\| \| \| \| \| \|	These complement the older float<->signed instructions. llvm-svn: 204451
*	IR: add a second ordering operand to cmpxchg for failure	Tim Northover	2014-03-11	4	-25/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559
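The two constraints on the failure ordering can be written as a small check. This is a sketch only: the function name is hypothetical, and treating ordering strength as a single linear ranking is an assumption made for this example rather than something the commit message states.

```python
# Assumed linear strength ranking of the orderings usable with cmpxchg.
_STRENGTH = {
    "monotonic": 0,
    "acquire": 1,
    "release": 2,
    "acq_rel": 3,
    "seq_cst": 4,
}


def is_valid_failure_ordering(success: str, failure: str) -> bool:
    """Check the failure ordering against the rules in the commit message."""
    if failure in ("release", "acq_rel"):
        return False                  # no store happens on the failure path
    return _STRENGTH[failure] <= _STRENGTH[success]


# The example from the message: cmpxchg ... acquire monotonic
print(is_valid_failure_ordering("acquire", "monotonic"))
print(is_valid_failure_ordering("monotonic", "acquire"))
print(is_valid_failure_ordering("seq_cst", "release"))
```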
*	Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call	Daniel Sanders	2014-02-13	1	-0/+15
	Summary:
	AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.

	The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.

	All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.

	Changes since review (and last commit attempt):
	- Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others).
	- Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll)
	- mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots)
	- Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots)

	Reviewers: rafael
	Reviewed By: rafael
	CC: llvm-commits
	Differential Revision: http://llvm-reviews.chandlerc.com/D2686
	llvm-svn: 201333
*	XFAIL test/CodeGen/SystemZ/alias-01.ll which requires CodeGen TBAA	Hal Finkel	2014-01-25	1	-0/+3
\| \| \| \|	llvm-svn: 200094
*	Fix known typos	Alp Toker	2014-01-24	4	-5/+5
\| \| \| \| \| \| \|	Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018
*	[SystemZ] Flesh out stackrestore test (frame-11.ll)	Richard Sandiford	2014-01-13	1	-3/+10
\| \| \| \| \| \|	...so that it does something vaguely sensible. llvm-svn: 199117
*	[SystemZ] Improve risbg-01.ll test	Richard Sandiford	2014-01-13	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old mask in f24 wasn't well chosen because the lshr would always be zero. CodeGen didn't detect this but InstCombine would. The new mask ensures that both shifts are needed. f26 is specifically testing for a wrap-around mask. The AND can be applied to just the shift left, either before or after the shift. Again, CodeGen kept it in the original form but InstCombine would mask after the shift instead. The exact choice of NILF isn't important for the test so I just dropped it and kept the rotate. llvm-svn: 199115
*	[SystemZ] Optimize (sext (ashr (shl ...), ...))	Richard Sandiford	2014-01-13	1	-3/+16
\| \| \| \| \| \| \| \| \| \|	...into (ashr (shl (anyext X), ...), ...), which requires one fewer instruction. The (anyext X) can sometimes be simplified too. I didn't do this in DAGCombiner because widening shifts isn't a win on all targets. llvm-svn: 199114
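The equivalence behind this rewrite can be checked numerically. In the sketch below, the widened form adds 32 to both shift amounts for an i32-to-i64 extension; those adjusted amounts are an assumption of this example (the commit message elides them), and the helper names are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def narrow_then_extend(x: int, shl: int, ashr: int) -> int:
    """(sext (ashr (shl X, shl), ashr)) with X an i32, result as an i64 value."""
    shifted = to_signed(x << shl, 32)
    return shifted >> ashr            # sign extension does not change the value


def widen_then_shift(x_any: int, shl: int, ashr: int) -> int:
    """(ashr (shl (anyext X), shl + 32), ashr + 32) computed on an i64."""
    shifted = to_signed(x_any << (shl + 32), 64)
    return shifted >> (ashr + 32)


x = 0x0000F0F0
junk = 0xDEADBEEF                     # arbitrary upper bits from the anyext
print(narrow_then_extend(x, 16, 24) == widen_then_shift((junk << 32) | x, 16, 24))
```

Because the widened left shift pushes the upper 32 bits out entirely, their contents never matter, which is why an any-extension suffices.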
*	[SystemZ] Fix RNSBG bug introduced by r197802	Richard Sandiford	2014-01-09	3	-0/+33
\| \| \| \| \| \| \| \|	The zext handling added in r197802 wasn't right for RNSBG. This patch restricts it to ROSBG, RXSBG and RISBG. (The tests for RISBG were added in r197802 since RISBG was the motivating example.) llvm-svn: 198862
*	Handle masked rotate amounts	Richard Sandiford	2014-01-09	1	-0/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	At the moment we expect rotates to have the form: (or (shl X, Y), (shr X, Z)) where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C (including 0, the most natural choice). llvm-svn: 198861
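A quick way to see that the masked form still describes a rotate is to compare it with a reference rotate for every amount. The sketch below does that for 32-bit values; the function names are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1


def rotl_reference(x: int, y: int) -> int:
    """Rotate a 32-bit value left by y (0 <= y < 32), special-casing y == 0."""
    if y == 0:
        return x & MASK
    return ((x << y) | (x >> (BITS - y))) & MASK


def rotl_masked(x: int, y: int, c: int = 0) -> int:
    """(or (shl X, Y), (shr X, Z)) with Z == (C * bitsize(X) - Y) & (bitsize(X) - 1)."""
    z = (c * BITS - y) & (BITS - 1)   # always in range, even for y == 0
    return ((x << y) | (x >> z)) & MASK


print(hex(rotl_masked(0x80000001, 4)))
print(rotl_masked(0xDEADBEEF, 0) == 0xDEADBEEF)
```

With the mask applied, both shift amounts stay below the bit width, so the zero-amount case no longer relies on an out-of-range shift.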
*	Match the InstCombine form of rotates by X+C	Richard Sandiford	2014-01-09	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X), so a rotate left of a 32-bit Y by X+C could appear as either: (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C)))) without InstCombine or: (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X))) with it. We already matched the first form. This patch handles the second too. llvm-svn: 198860
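The two shift-right amounts above are the same value in 32-bit modular arithmetic, which is what lets both forms be matched as one rotate. A minimal check, with hypothetical function names:

```python
M32 = (1 << 32) - 1


def shr_amount_original(x: int, c: int) -> int:
    """(sub 32, (add X, C)) evaluated with 32-bit wraparound."""
    return (32 - ((x + c) & M32)) & M32


def shr_amount_instcombined(x: int, c: int) -> int:
    """(sub 32-C, X) evaluated with 32-bit wraparound."""
    return (((32 - c) & M32) - x) & M32


print(shr_amount_original(5, 3), shr_amount_instcombined(5, 3))
```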
*	[SystemZ] Use interlocked-access 1 instructions for CodeGen	Richard Sandiford	2013-12-24	10	-0/+650
	...namely LOAD AND ADD, LOAD AND AND, LOAD AND OR and LOAD AND EXCLUSIVE OR. LOAD AND ADD LOGICAL isn't really separately useful for LLVM. I'll look at reusing the CC results in the new year. llvm-svn: 197985
*	[SystemZ] Optimize comparisons with truncated extended loads	Richard Sandiford	2013-12-20	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \|	If the extension of a loaded value is compared against zero and used in other arithmetic, InstCombine will change the comparison to use the unextended load. It's also possible that the comparison could be against the unextended load from the outset. In DAG form this becomes a truncation of an extending load. We want to strip the truncation if possible so that we can use load-and-test instructions. llvm-svn: 197804
*	[SystemZ] Extend RISBG optimization	Richard Sandiford	2013-12-20	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \|	The handling of ANY_EXTEND and ZERO_EXTEND was too strict. In this context we can treat ZERO_EXTEND in much the same way as an AND and then also handle outermost ZERO_EXTENDs. I couldn't find a test that benefited from the ANY_EXTEND change, but it's more obvious to write it this way once SIGN_EXTEND and ZERO_EXTEND are handled differently. llvm-svn: 197802
*	Revert "Add -mcpu=z10 to SystemZ tests."	Andrew Trick	2013-12-18	4	-4/+4
\| \| \| \| \| \| \| \| \| \|	This reverts commit r197466. The MachineCSE fix that required the -mcpu flag has been disabled until more work can be done to fix downstream issues. Adding -mcpu wasn't the right workaround anyway. llvm-svn: 197624
*	Add -mcpu=z10 to SystemZ tests.	Andrew Trick	2013-12-17	4	-4/+4
\| \| \| \|	llvm-svn: 197466
*	[SystemZ] Optimize X [!=]= Y in cases where X - Y or Y - X is also computed	Richard Sandiford	2013-12-13	1	-0/+20
\| \| \| \| \| \| \|	In those cases it's better to compare the result of the subtraction against zero. llvm-svn: 197239
*	[SystemZ] Make more use of TMHH	Richard Sandiford	2013-12-13	1	-0/+109
\| \| \| \| \| \| \| \| \| \|	This originally came about after noticing that InstCombine turns some of the TMHH (icmp (and...), ...) tests into plain comparisons. Since there is no instruction to compare with a 64-bit immediate, TMHH is generally better than an ordered comparison for the cases that it can handle. llvm-svn: 197238
*	[SystemZ] Extend integer absolute selection	Richard Sandiford	2013-12-13	2	-0/+197
\| \| \| \| \| \| \| \|	This patch makes more use of LPGFR and LNGFR. It builds on top of the LTGFR selection from r197234. Most of the tests are motivated by what InstCombine would produce. llvm-svn: 197236
*	[SystemZ] Make more use of LTGFR	Richard Sandiford	2013-12-13	1	-0/+48
	InstCombine turns (sext (trunc)) into (ashr (shl)), then converts any comparison of the ashr against zero into a comparison of the shl against zero. This makes sense in itself, but we want to undo it for z, since the sign-extension instruction has a CC-setting form. I've included tests for both the original and InstCombined variants, but the former already worked. The patch fixes the latter. llvm-svn: 197234
*	[SystemZ] Optimize fcmp X, 0 in cases where X is also negated	Richard Sandiford	2013-12-11	1	-0/+40
\| \| \| \| \| \| \|	In such cases it's often better to test the result of the negation instead, since the negation also sets CC. llvm-svn: 197032
*	Extend (truncate (load)) folding	Richard Sandiford	2013-12-11	1	-0/+14
\| \| \| \| \| \| \| \| \|	DAGCombiner could fold (truncate (load)) -> smaller load if the original load was the width of the truncation result or wider. This patch extends it to handle cases where the original load was narrower (and so the extension type stays the same). llvm-svn: 197030
*	Add TargetLowering::prepareVolatileOrAtomicLoad	Richard Sandiford	2013-12-10	12	-46/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196906
*	Add TargetLowering::prepareVolatileOrAtomicLoad	Richard Sandiford	2013-12-10	14	-47/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196905
*	[SystemZ] Use LOAD AND TEST for comparisons with -0	Richard Sandiford	2013-12-06	1	-0/+19
	...since it is equivalent to comparison with +0. llvm-svn: 196580
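The claim rests on IEEE 754 comparison semantics: the two zeros have different bit patterns but compare equal, so every ordered comparison gives the same answer for either. A small sketch (helper names are hypothetical):

```python
import struct


def bits(value: float) -> int:
    """Return the raw IEEE 754 double bit pattern of value."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def compare_all(value: float, zero: float) -> tuple:
    """Results of <, == and > against the given zero."""
    return (value < zero, value == zero, value > zero)


print(hex(bits(0.0)), hex(bits(-0.0)))      # distinct encodings
print(compare_all(-1.5, -0.0) == compare_all(-1.5, 0.0))
```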
*	[SystemZ] Extend the use of C(L)GFR	Richard Sandiford	2013-12-06	2	-4/+43
\| \| \| \| \| \| \|	instcombine prefers to put extended operands first, so this patch handles that case for C(L)GFR. llvm-svn: 196579
*	[SystemZ] Optimize selects between 0 and -1	Richard Sandiford	2013-12-06	3	-0/+543
\| \| \| \| \| \| \| \| \| \| \| \| \|	Since z has no setcc instruction as such, the choice of setBooleanContents is a bit arbitrary. Currently it's set to ZeroOrOneBooleanContent, so we produced a branch-free form when selecting between 0 and 1, but not when selecting between 0 and -1. This patch handles the latter case too. At some point I'd like to measure whether it's better to use conditional moves for constant selects on z196, but that's future work. llvm-svn: 196578
*	[SystemZ] Fix choice of known-zero mask in insertion optimization	Richard Sandiford	2013-12-03	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The backend converts 64-bit ORs into subreg moves if the upper 32 bits of one operand and the low 32 bits of the other are known to be zero. It then tries to peel away redundant ANDs from the upper 32 bits. Since AND masks are canonicalized to exclude known-zero bits, the test ORs the mask and the known-zero bits together before checking for redundancy. The problem was that it was using the wrong node when checking for known-zero bits, so could drop ANDs that were still needed. llvm-svn: 196267
*	[SystemZ] Fix incorrect use of RISBG for a zero-extended right shift	Richard Sandiford	2013-11-26	1	-0/+14
\| \| \| \| \| \| \| \| \|	We would wrongly transform the testcase into the equivalent of an AND with 1. The problem was that, when testing whether the shifted-in bits of the right shift were significant, we used the width of the final zero-extended result rather than the width of the shifted value. llvm-svn: 195731
*	[SystemZ] Fix TMHH and TMHL usage for z10 with -O0	Richard Sandiford	2013-11-22	2	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \|	I've no idea why I decided to handle TMxx differently from all the other high/low logic operations, but it was a stupid thing to do. The high registers aren't available as separate 32-bit registers on z10, so subreg_h32 can't be used on a GR64 there. I've normally been testing with z196 and with -O3 and so hadn't noticed this until now. llvm-svn: 195473
*	[SystemZ] Automatically detect zEC12 and z196 hosts	Richard Sandiford	2013-10-31	20	-42/+72
	As on other hosts, the CPU identification instruction is privileged, so we need to look through /proc/cpuinfo. I copied the PowerPC way of handling "generic". Several tests were implicitly assuming z10 and so failed on z196. llvm-svn: 193742
*	[SystemZ] Set useAA to true	Richard Sandiford	2013-10-28	5	-18/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here. The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that that optimization works top down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store). I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test. llvm-svn: 193521
*	[DAGCombiner] Respect volatility when checking for aliases	Richard Sandiford	2013-10-28	1	-1/+2
\| \| \| \| \| \| \| \|	Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518
*	Keep TBAA info when rewriting SelectionDAG loads and stores	Richard Sandiford	2013-10-28	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517
*	Replace sra with srl if a single sign bit is required	Richard Sandiford	2013-10-17	1	-0/+12
\| \| \| \| \| \|	E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2). llvm-svn: 192884
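The example can be verified directly: after the arithmetic shift by 31, every bit equals the sign bit, so selecting bit 1 is the same as logically shifting the sign bit down to position 1. A short check with hypothetical helper names:

```python
M32 = (1 << 32) - 1


def sra32(x: int, n: int) -> int:
    """Arithmetic right shift of a 32-bit value, result as unsigned 32 bits."""
    x &= M32
    signed = x - (1 << 32) if x >> 31 else x
    return (signed >> n) & M32


def srl32(x: int, n: int) -> int:
    """Logical right shift of a 32-bit value."""
    return (x & M32) >> n


x = 0x80000000
print(sra32(x, 31) & 2, srl32(x, 30) & 2)
```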
*	[SystemZ] Handle extensions in RxSBG optimizations	Richard Sandiford	2013-10-16	1	-3/+2
\| \| \| \| \| \| \|	The input to an RxSBG operation can be narrower as long as the upper bits are don't care. This fixes a FIXME added in r192783. llvm-svn: 192790
*	[SystemZ] Improve handling of SETCC	Richard Sandiford	2013-10-16	3	-16/+260
\| \| \| \| \| \| \| \|	We previously used the default expansion to SELECT_CC, which in turn would expand to "LHI; BRC; LHI". In most cases it's better to use an IPM-based sequence instead. llvm-svn: 192784