path: root/llvm/test
Commit message | Author | Age | Files | Lines
* Add a triple to this test. | Chad Rosier | 2012-12-11 | 1 | -1/+1
  llvm-svn: 169803
* Fix a miscompile in the DAG combiner. Previously, we would incorrectly | Chandler Carruth | 2012-12-11 | 1 | -2/+23
  try to reduce the width of this load, and would end up transforming:

    (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
  to
    (truncate (zextload i32 <ptr+4> as i64) to i32)

  We lost the sext attached to the load while building the narrower i32 load, and replaced it with a zext because lshr always zero-extends its result. Instead, bail out of this combine when there is a conflict between a sextload and a zext narrowing. The rest of the DAG combiner still optimizes the code down to the proper single instruction:

    movswl 6(...),%eax

  Which is exactly what we wanted. Previously we read past the end *and* missed the sign extension:

    movl 6(...), %eax

  llvm-svn: 169802
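To make the miscompile concrete, here is a small Python model of the three forms. It is only a sketch: the byte values, the little-endian layout, and the helper names (`sext`, `original_form`, `bad_narrowing`, `good_narrowing`) are assumptions chosen for illustration and do not come from the patch.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def original_form(mem, ptr):
    # (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
    raw = int.from_bytes(mem[ptr:ptr + 6], "little")
    wide = sext(raw, 48) & 0xFFFFFFFFFFFFFFFF
    return (wide >> 32) & 0xFFFFFFFF


def bad_narrowing(mem, ptr):
    # (truncate (zextload i32 <ptr+4> as i64) to i32)
    # Reads bytes ptr+4 .. ptr+7, two of which lie outside the i48.
    return int.from_bytes(mem[ptr + 4:ptr + 8], "little")


def good_narrowing(mem, ptr):
    # A sign-extending 16-bit load of the top two bytes (what movswl does).
    half = int.from_bytes(mem[ptr + 4:ptr + 6], "little")
    return sext(half, 16) & 0xFFFFFFFF


# Six bytes of a negative i48 followed by two unrelated bytes.
MEM = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x80, 0xAA, 0xBB])
```

With this memory image the original expression and the sign-extending 16-bit load agree, while the zero-extending 32-bit load picks up the two unrelated bytes and drops the sign.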
* move X86-specific test | Paul Redmond | 2012-12-11 | 1 | -1/+1
  This test case uses -mcpu=corei7, so it belongs in CodeGen/X86.

  Reviewed by: Nadav
  llvm-svn: 169801
* Fall back to the selection dag isel to select tail calls. | Chad Rosier | 2012-12-11 | 1 | -3/+2
  This shouldn't affect codegen for -O0 compiles as tail call markers are not emitted in unoptimized compiles. Testing with the external/internal nightly test suite reveals no change in compile time performance. Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time or execution-time failures. All tests were performed on my x86 machine. I'll monitor our arm testers to ensure no regressions occur there.

  In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While it's theoretically true that this is just an optimization, it's an optimization that we very much want to happen even at -O0, or else ARC applications become substantially harder to debug.

  Part of rdar://12553082
  llvm-svn: 169796
* Refactor out the abbreviation handling into a separate class that | Eric Christopher | 2012-12-10 | 1 | -1/+1
  controls each of the abbreviation sets (only a single one at the moment) and computes offsets separately as well for each set of DIEs.

  No real functional change; the ordering of abbreviations for the skeleton CU changed, but only because we're computing in a separate order. Fix the testcase not to care.
  llvm-svn: 169793
* Some enhancements for memcpy / memset inline expansion. | Evan Cheng | 2012-12-10 | 6 | -39/+143
  1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies.
  2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM.
  3. When memcpy from a constant string, do *not* replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (requires a new target hook: TLI.isIntImmLegal()).
  4. Use unaligned load / stores more aggressively if target hooks indicate they are "fast".
  5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy).

  This significantly improves Dhrystone, up to 50% on ARM iOS devices.

  rdar://12760078
  llvm-svn: 169791
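Item 1 above can be illustrated with a short Python sketch. It models a 16-byte vector move as a slice copy; the function name `copy_with_overlap` and the `chunk` parameter are hypothetical and only stand in for the pair of unaligned loads and stores the commit describes.

```python
def copy_with_overlap(dst, src, n, chunk=16):
    """Copy n bytes (chunk <= n <= 2 * chunk) with exactly two chunk-sized moves.

    The second move is anchored at the end of the range, so it overlaps
    the first one instead of falling back to byte-sized tail copies.
    """
    if not chunk <= n <= 2 * chunk:
        raise ValueError("n must be between chunk and 2 * chunk")
    dst[0:chunk] = src[0:chunk]          # first load/store pair
    dst[n - chunk:n] = src[n - chunk:n]  # second pair, overlapping the first


source = bytearray(range(40))
target = bytearray(40)
copy_with_overlap(target, source, 23)
```

For a 23-byte copy the two moves cover bytes 0..15 and 7..22; the overlap is written twice with the same data, which is harmless.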
* Optimistically analyse Phi cycles | Arnold Schwaighofer | 2012-12-10 | 1 | -2/+63
  Analyse Phis under the starting assumption that they are NoAlias. Recursively look at their inputs. If they MayAlias/MustAlias there must be an input that makes them so.

  Addresses bug 14351.
  llvm-svn: 169788
* Add a test for explicitly exercising the mc-relax-all flag. | Eli Bendersky | 2012-12-10 | 1 | -0/+19
  llvm-svn: 169764
* Use the somewhat more semantic term "split dwarf"; it better matches what's | Eric Christopher | 2012-12-10 | 1 | -1/+1
  going on and makes a lot of the terminology in comments make more sense.
  llvm-svn: 169758
* Add support for reverse induction variables. For example: | Nadav Rotem | 2012-12-10 | 1 | -4/+2
    while (i--) sum+=A[i];

  llvm-svn: 169752
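As a rough picture of what vectorizing such a loop means, the Python sketch below compares the scalar loop with a version that walks the array backwards in fixed-width chunks. The function names and the default `width` of 4 are assumptions for illustration; this is not how the loop vectorizer is implemented.

```python
def reverse_sum_scalar(a):
    """Mirror of: while (i--) sum += A[i];"""
    i = len(a)
    total = 0
    while i:
        i -= 1
        total += a[i]
    return total


def reverse_sum_chunked(a, width=4):
    """Same reduction, consuming `width` elements per step from the end."""
    i = len(a)
    lanes = [0] * width
    while i >= width:
        i -= width
        block = a[i:i + width][::-1]  # reversed to match the scalar visiting order
        lanes = [lane + x for lane, x in zip(lanes, block)]
    total = sum(lanes)
    while i:  # scalar remainder for the leftover elements
        i -= 1
        total += a[i]
    return total
```

Both functions return the same sum for integer inputs; the chunked one just keeps one partial sum per lane and combines them at the end.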
* Use GetUnderlyingObjects in misched | Hal Finkel | 2012-12-10 | 1 | -0/+101
  misched used GetUnderlyingObject in order to break false load/store dependencies, and the -enable-aa-sched-mi feature similarly relied on GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis. Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so (especially due to LSR) all of these mechanisms failed for induction-variable-dependent loads and stores inside loops.

  This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects (which will recurse through phi and select instructions) in misched.

  Andy reviewed, tested and simplified this patch; Thanks!
  llvm-svn: 169744
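The difference between the two helpers can be sketched with a toy value graph in Python. Everything here (the `Value` class, its `kind` strings, and the two functions) is a hypothetical model written for this note; it does not reproduce the LLVM API.

```python
class Value:
    """Toy IR value: kind is "object", "gep", "phi", or "select"."""

    def __init__(self, name, kind="object", operands=None):
        self.name = name
        self.kind = kind
        self.operands = operands if operands is not None else []


def underlying_object(v):
    """Strip address arithmetic only; stops when it reaches a phi or select."""
    while v.kind == "gep":
        v = v.operands[0]
    return v


def underlying_objects(v):
    """Also look through phi and select, collecting every base object."""
    seen, found, work = set(), [], [v]
    while work:
        cur = underlying_object(work.pop())
        if cur.name in seen:
            continue  # already visited; this also breaks phi cycles
        seen.add(cur.name)
        if cur.kind in ("phi", "select"):
            work.extend(cur.operands)
        else:
            found.append(cur)
    return found


# A loop-carried pointer: p = phi [a, p.next], p.next = gep p, addr = gep p
a = Value("a")
p = Value("p", "phi")
p_next = Value("p.next", "gep", [p])
p.operands = [a, p_next]
addr = Value("addr", "gep", [p])
```

In this model the single-object helper gives up at the phi `p`, while the plural helper walks through it and identifies `a` as the only base object, which is the situation the commit describes for induction-variable-dependent accesses.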
* Teach DAG combine to handle vector add/sub with vectors of all 0s. | Craig Topper | 2012-12-10 | 2 | -5/+5
  llvm-svn: 169727
* Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores. | Chandler Carruth | 2012-12-10 | 2 | -7/+31
  When SROA was evaluating a mixture of i1 and i8 loads and stores, in just a particular case, it would tickle a latent bug where we compared bits to bytes rather than bits to bits. As a consequence of the latent bug, we would allow integers through which were not byte-size multiples, a situation the later rewriting code was never intended to handle.

  In release builds this could trigger all manner of oddities, but the reported issue in PR14548 was forming invalid bitcast instructions.

  The only downside of this fix is that it makes it more clear that SROA in its current form is not capable of handling mixed i1 and i8 loads and stores. Sometimes with the previous code this would work by luck, but usually it would crash, so I'm not terribly worried. I'll watch the LNT numbers just to be sure.
  llvm-svn: 169719
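The kind of unit mix-up described above is easy to reproduce in a few lines of Python. The log does not quote the actual check, so the predicate below is a hypothetical stand-in: it rejects an integer type when its bit width is smaller than its store size, once with the store size in bytes (the mistake) and once in bits.

```python
def store_size_bytes(bit_width):
    """Bytes needed to store an integer of the given bit width."""
    return (bit_width + 7) // 8


def accepts_buggy(bit_width):
    # Compares bits against bytes, so i1 (1 bit, 1 byte) slips through.
    return not (bit_width < store_size_bytes(bit_width))


def accepts_fixed(bit_width):
    # Compares bits against bits: only byte-size multiples are accepted.
    return not (bit_width < store_size_bytes(bit_width) * 8)
```

With these definitions `i8` is accepted by both versions, but `i1` is accepted only by the buggy one, matching the description of non-byte-multiple integers being let through.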
* LoopVectorize: support vectorizing intrinsic calls | Paul Redmond | 2012-12-09 | 1 | -0/+851
  - added function to VectorTargetTransformInfo to query cost of intrinsics
  - vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

  Reviewed by: Nadav
  llvm-svn: 169711
* Drop the address space limit for tests in the makefile build. | Benjamin Kramer | 2012-12-09 | 1 | -2/+4
  The limit seems to break newer pythons (see PR13598) so just drop it for now. Eventually lit should learn to set limits for its children instead of a global limit in the makefile.

  If some PPC bots fail after this change: That's a good thing, they actually run clang tests now.
  llvm-svn: 169695
* - Re-enable population count loop idiom recognition | Shuxin Yang | 2012-12-09 | 2 | -0/+126
  - Fix a bug which caused a segfault.
  - Add two test cases which were causing crashes.
  llvm-svn: 169687
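For readers unfamiliar with the idiom, the loop such a pass looks for is usually written as the classic "clear the lowest set bit" loop. The Python sketch below shows that loop next to a direct bit count; it assumes this is the shape of loop being recognized, since the log does not show the pattern itself.

```python
def popcount_loop(x):
    """Count set bits of a non-negative integer, one set bit per iteration."""
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count


def popcount_direct(x):
    """What the loop can be replaced by once the idiom is recognized."""
    return bin(x).count("1")
```

The two functions agree for every non-negative input, which is what makes replacing the loop with a single population-count operation legal.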
* Teach DAG combine to handle vector logical operations with vectors of all 1s | Craig Topper | 2012-12-08 | 3 | -23/+21
  or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined.
  llvm-svn: 169684
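The folds in question rest on ordinary bitwise identities applied lane by lane. The Python sketch below lists a few of them for 32-bit lanes; the `fold` function and the exact set of cases are assumptions for illustration and are not taken from the DAG combiner.

```python
ALL_ONES = 0xFFFFFFFF


def lanewise(op, lhs, rhs):
    """Reference semantics: apply a 32-bit bitwise op to each pair of lanes."""
    ops = {
        "and": lambda x, y: x & y,
        "or": lambda x, y: x | y,
        "xor": lambda x, y: x ^ y,
    }
    return [ops[op](x, y) & ALL_ONES for x, y in zip(lhs, rhs)]


def fold(op, lhs, rhs):
    """Return a simplified result when rhs is all 0s or all 1s, else None."""
    all_zero = all(x == 0 for x in rhs)
    all_one = all(x == ALL_ONES for x in rhs)
    if op == "and":
        if all_one:
            return list(lhs)  # x & -1 == x
        if all_zero:
            return list(rhs)  # x & 0 == 0
    if op == "or":
        if all_zero:
            return list(lhs)  # x | 0 == x
        if all_one:
            return list(rhs)  # x | -1 == -1
    if op == "xor" and all_zero:
        return list(lhs)      # x ^ 0 == x
    return None
```

Whenever `fold` returns a vector, it equals what `lanewise` computes, so the operation can be dropped without changing the result.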
* Revert the patches adding a popcount loop idiom recognition pass. | Chandler Carruth | 2012-12-08 | 2 | -82/+0
  There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread.

  This reverts the commits:
    r169671: Fix a logic error.
    r169604: Move the popcnt tests to an X86 subdirectory.
    r168931: Initial commit adding the pass.
  llvm-svn: 169683
* When we use the BLEND instruction that uses the MSB as a mask, we can remove | Nadav Rotem | 2012-12-07 | 2 | -2/+2
  the VSRI instruction before it since it does not affect the MSB.

  Thanks Craig Topper for suggesting this.
  llvm-svn: 169638
* In hexagon convertToHardwareLoop, don't deref end() iterator | Matthew Curtis | 2012-12-07 | 1 | -1/+1
  In particular, check if MachineBasicBlock::iterator is end() before using it to call getDebugLoc().

  See also this thread on llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html
  llvm-svn: 169634
* X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. | Nadav Rotem | 2012-12-07 | 3 | -5/+5
  llvm-svn: 169624
* Add separate statistics for Data and Inst fragments emitted during relaxation. | Eli Bendersky | 2012-12-07 | 1 | -4/+1
  Also fixes a test that was overly-sensitive to the exact order of statistics emitted.
  llvm-svn: 169619
* Added Mapping Symbols for ARM ELF | Tim Northover | 2012-12-07 | 11 | -19/+274
  Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).

  Patch based on work by Greg Fitzgerald.
  llvm-svn: 169609
* The test unconditionally assumes a particular cpu has a backend built in the | David Tweed | 2012-12-07 | 2 | -0/+6
  target. Buildbots for some hosts may choose to build only their own backend in order to maximise testing-turnaround time. Move the test into a prefixed directory so lit's standard "backend specific" suppression can be done.
  llvm-svn: 169604
* Add support to ValueTracking for determining that a pointer is non-null | Chandler Carruth | 2012-12-07 | 1 | -0/+40
  by virtue of inbounds GEPs that preclude a null pointer.

  This is a very common pattern in the code generated by std::vector and other standard library routines which use allocators that test for null pervasively. This is one step closer to teaching Clang+LLVM to be able to produce an empty function for:

    void f() {
      std::vector<int> v;
      v.push_back(1);
      v.push_back(2);
      v.push_back(3);
      v.push_back(4);
    }

  Which is related to getting them to completely fold SmallVector push_back sequences into constants when inlining and other optimizations make that a possibility.
  llvm-svn: 169573
* Fix typos in CHECK lines. | Dmitri Gribenko | 2012-12-06 | 4 | -5/+5
  Patch by Alexander Zinenko.
  llvm-svn: 169547
* Fix a bug in the code that merges consecutive stores. Previously we did not | Nadav Rotem | 2012-12-06 | 1 | -0/+23
  check if loads that happen in between stores alias with the first store in the chain, only with the second store onwards.
  llvm-svn: 169516
* [msan] Do not store origin for clean values. | Evgeniy Stepanov | 2012-12-06 | 1 | -0/+26
  Instead of unconditionally storing origin with every application store, only do this when the shadow of the stored value is != 0. This change also delays instrumentation of stores until after the walk over the function's instructions, because adding new basic blocks confuses InstVisitor.

  We only keep 1 origin value per 4 bytes of application memory. This change fixes the bug where a store of a single clean byte wiped the origin for the whole 4-byte area.

  Since stores of uninitialized values are relatively uncommon, this change improves performance of track-origins mode by 5% median and by up to 47% on specs.
  llvm-svn: 169490
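The bug with the shared 4-byte origin slot can be reproduced with a toy shadow/origin model in Python. The class `OriginModel`, its dictionaries, and the origin id `7` are all assumptions made for this sketch; the real instrumentation works on shadow memory, not Python dictionaries.

```python
class OriginModel:
    """Toy model: one shadow byte per address, one origin per 4-byte group."""

    def __init__(self, store_origin_for_clean):
        self.store_origin_for_clean = store_origin_for_clean
        self.shadow = {}  # byte address -> shadow byte (0 means initialized)
        self.origin = {}  # address // 4 -> origin id

    def store_byte(self, addr, shadow, origin):
        self.shadow[addr] = shadow
        # New behaviour: only record the origin when the value is poisoned.
        if shadow != 0 or self.store_origin_for_clean:
            self.origin[addr // 4] = origin


def run(store_origin_for_clean):
    m = OriginModel(store_origin_for_clean)
    m.store_byte(0, shadow=0xFF, origin=7)  # uninitialized byte, origin 7
    m.store_byte(1, shadow=0x00, origin=0)  # clean byte in the same 4-byte group
    return m.origin[0]
```

Under the old behaviour (`run(True)`) the clean store overwrites the group's origin, so the poisoned byte at address 0 can no longer be traced; under the new behaviour (`run(False)`) origin 7 survives.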
* Handle non-default array bounds. | Bill Wendling | 2012-12-06 | 1 | -0/+48
  Some languages, e.g. Ada and Pascal, allow you to specify that the array bounds are different from the default (1 in these cases). If we have a lower bound that's non-default, then we emit the lower bound. We also calculate the correct upper bound in those cases.
  llvm-svn: 169484
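The arithmetic behind this is small enough to write down. The Python sketch below assumes the array is described by an element count and a lower bound, and that the upper bound is inclusive; the function name and the dictionary keys are hypothetical and only loosely modelled on the debug-info attributes.

```python
def subrange_bounds(count, lower_bound, default_lower_bound):
    """Compute which bounds to emit for an array with `count` elements."""
    emitted = {"upper_bound": lower_bound + count - 1}  # inclusive upper bound
    if lower_bound != default_lower_bound:
        emitted["lower_bound"] = lower_bound  # only emitted when non-default
    return emitted
```

For a ten-element array in a language whose default lower bound is 1, declaring it as `1..10` yields only an upper bound of 10, while declaring it as `5..14` yields both a lower bound of 5 and an upper bound of 14.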
* Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing | Craig Topper | 2012-12-06 | 1 | -1/+4
  to the normal instructions.
  llvm-svn: 169482
* Properly fix the test. | Evan Cheng | 2012-12-06 | 1 | -2/+1
  llvm-svn: 169464
* llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak by adding -O0. I | NAKAMURA Takumi | 2012-12-06 | 1 | -1/+1
  guess Chad expects fastisel here.
  llvm-svn: 169463
* [arm fast-isel] Make the fast-isel implementation of memcpy respect alignment. | Chad Rosier | 2012-12-06 | 1 | -3/+94
  rdar://12821569
  llvm-svn: 169460
* Let targets provide hooks that compute known zero and ones for any_extend | Evan Cheng | 2012-12-06 | 1 | -0/+27
  and extloads. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g.

    define void @foo(i16* %ptr, i32 %a) nounwind {
    entry:
      %tmp1 = icmp ult i32 %a, 100
      br i1 %tmp1, label %bb1, label %bb2
    bb1:
      %tmp2 = load i16* %ptr, align 2
      br label %bb2
    bb2:
      %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
      %cmp = icmp ult i16 %tmp3, 24
      br i1 %cmp, label %bb3, label %exit
    bb3:
      call void @bar() nounwind
      br label %exit
    exit:
      ret void
    }

  This compiled to the following before:

            push    {lr}
            mov     r2, #0
            cmp     r1, #99
            bhi     LBB0_2
    @ BB#1:                                 @ %bb1
            ldrh    r2, [r0]
    LBB0_2:                                 @ %bb2
            uxth    r0, r2
            cmp     r0, #23
            bhi     LBB0_4
    @ BB#3:                                 @ %bb3
            bl      _bar
    LBB0_4:                                 @ %exit
            pop     {lr}
            bx      lr

  The uxth is not needed since ldrh implicitly zero-extends the high bits. With this change it's eliminated.

  rdar://12771555
  llvm-svn: 169459
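The reasoning that makes the `uxth` removable can be phrased in terms of known-zero bit masks. The Python sketch below is a simplified model with hypothetical helper names; it only captures the idea that a zero-extend is a no-op once all bits above the source width are already known to be zero.

```python
def known_zero_after_zext_load(loaded_bits, reg_bits=32):
    """Mask of register bits known to be zero after a zero-extending load."""
    full = (1 << reg_bits) - 1
    return full & ~((1 << loaded_bits) - 1)


def zext_is_redundant(known_zero, from_bits, reg_bits=32):
    """A zero-extend from `from_bits` changes nothing if the high bits are known zero."""
    full = (1 << reg_bits) - 1
    high = full & ~((1 << from_bits) - 1)
    return (known_zero & high) == high


# ldrh loads 16 bits and clears the upper 16 bits of the register.
after_ldrh = known_zero_after_zext_load(16)
```

In the example above both values reaching `%bb2` (the constant 0 and the `ldrh` result) have their upper 16 bits known zero, so a following 16-bit zero-extend is redundant in this model.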
* PR10867: Analogue of r169441 for when using external 'sh'. And actually run | Richard Smith | 2012-12-05 | 1 | -0/+1
  the test!
  llvm-svn: 169446
* PR10867. lit would interpret | Richard Smith | 2012-12-05 | 1 | -0/+9
    RUN: a
    RUN: b || true
  as "a && (b || true)" in Tcl mode, and as "(a && b) || true" in sh mode. Everyone seems to (quite reasonably) write tests assuming the Tcl behavior, so use that in sh mode too.
  llvm-svn: 169441
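The practical difference between the two groupings shows up when the first RUN line fails. The Python sketch below models command success as booleans; the function names are made up for this note.

```python
def tcl_grouping(a_ok, b_ok):
    """a && (b || true): the test fails whenever `a` fails."""
    return a_ok and (b_ok or True)


def sh_grouping(a_ok, b_ok):
    """(a && b) || true: the trailing `|| true` swallows every failure."""
    return (a_ok and b_ok) or True
```

With the sh grouping a failing `a` is silently ignored and the test always passes, which is why the Tcl interpretation is the one test authors expect.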
* RegisterPressureTracker: fix findUseBetween to handle DebugValue | Andrew Trick | 2012-12-05 | 1 | -0/+49
  llvm-svn: 169427
* RegisterPressureTracker: Track live physical register by unit. | Andrew Trick | 2012-12-05 | 1 | -0/+30
  This is much simpler to reason about, more efficient, and fixes some corner cases involving implicit super-register defs. Fixed rdar://12797931.
  llvm-svn: 169425
* Cost Model: change the default cost of control flow instructions (br / ret / | Nadav Rotem | 2012-12-05 | 6 | -9/+9
  ...) to zero.
  llvm-svn: 169423
* Correct ARM NOP encoding | David Sehr | 2012-12-05 | 1 | -1/+1
  The encoding of NOP in ARMAsmBackend.cpp is missing a trailing zero, which causes the emission of a coprocessor instruction rather than "mov r0, r0" as indicated in the comment. The test also checks for the wrong encoding.

  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/157919.html
  llvm-svn: 169420
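A quick way to see why one missing hex digit changes the instruction class is to decode the top bits of both words. The log does not quote the constants, so the Python sketch below assumes `0xE1A00000` (the standard A32 encoding of `mov r0, r0`) for the intended word and `0x0E1A0000` for the word with the trailing zero dropped; the `classify` function is a deliberately coarse decoder written for this note.

```python
INTENDED = 0xE1A00000  # mov r0, r0 (assumed intended NOP encoding)
TRUNCATED = 0xE1A0000  # same digits with one trailing zero missing


def classify(word):
    """Return (condition field, coarse instruction class) for an A32 word."""
    cond = (word >> 28) & 0xF
    bits_27_24 = (word >> 24) & 0xF
    if bits_27_24 == 0b1110:
        kind = "coprocessor"
    elif ((word >> 26) & 0b11) == 0b00:
        kind = "data-processing"
    else:
        kind = "other"
    return cond, kind
```

Dropping the zero shifts every field down by four bits: the condition field becomes 0 and bits 27..24 become `1110`, which is the coprocessor space, consistent with the commit message.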
* [NVPTX] Fix crash with unnamed struct arguments | Justin Holewinski | 2012-12-05 | 1 | -0/+5
  Patch by Eric Holk
  llvm-svn: 169418
* Add dump of Win64 EH unwind data. | Michael J. Spencer | 2012-12-05 | 1 | -0/+106
  The new command line option -unwind-info dumps the Win64 EH unwind data to the console. This is a nice feature if you need to debug generated EH data (e.g. from LLVM).

  Includes a test case. Initial patch by João Matos, extensions and rework by Kai Nacke.
  llvm-svn: 169415
* Test commit. | David Sehr | 2012-12-05 | 1 | -1/+1
  llvm-svn: 169410
* Use multiclass to define store instructions with base+immediate offset | Jyotsna Verma | 2012-12-05 | 2 | -4/+3
  addressing mode and immediate stored value.
  llvm-svn: 169408
* Added an option to the disassembler to print immediates as hex. | Kevin Enderby | 2012-12-05 | 2 | -0/+15
  This is for the lldb team, so most but not all of the values are to be printed as hex with this option. Some small values, like the scale in an X86 address, were requested to be printed in decimal without the leading 0x.

  There may be some tweaks needed in places that may still be in decimal that they want in hex, especially for ARM. I made my best guess. Any tweaks from here should be simple.

  I also did the best I know now, with help from the C++ gurus, creating the cleanest formatImm() utility function and containing the changes. But if someone has a better idea to make something cleaner I'm all ears and game for changing the implementation.

  rdar://8109283
  llvm-svn: 169393
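The behaviour of such a helper is easy to picture. The Python function below is a hypothetical analogue of the `formatImm()` utility mentioned above, not a transcription of it; in particular, rendering negative values as `-0x...` is an assumption of this sketch.

```python
def format_imm(value, print_hex):
    """Format an immediate as decimal, or as hex with a 0x prefix."""
    if not print_hex:
        return str(value)
    if value < 0:
        return f"-0x{-value:x}"  # assumed rendering for negative immediates
    return f"0x{value:x}"
```

A caller that wants a field to stay decimal regardless of the option, such as the address scale mentioned in the message, would simply pass `print_hex=False` for that field.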
* [msan] Instrument bswap intrinsic. | Evgeniy Stepanov | 2012-12-05 | 1 | -0/+16
  llvm-svn: 169383
* [msan] Change linkage type of __msan_track_origins. | Evgeniy Stepanov | 2012-12-05 | 1 | -0/+3
  LinkOnceODRLinkage globals may be removed in GlobalOpt if not used in the current module.
  llvm-svn: 169377
* Simplified BLEND pattern matching for shuffles. | Elena Demikhovsky | 2012-12-05 | 2 | -6/+53
  Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.
  llvm-svn: 169366
* fix a typo | Shuxin Yang | 2012-12-05 | 1 | -1/+1
  llvm-svn: 169345
* Add x86 isel lowering logic to form bit test with inverted condition. e.g. | Evan Cheng | 2012-12-05 | 1 | -3/+97
  x ^ -1.

  Patch by David Majnemer.
  rdar://12755626
  llvm-svn: 169339
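The identity behind this lowering is that testing a bit of `x ^ -1` (that is, `~x`) is the same as testing the bit of `x` with the condition flipped. The Python sketch below checks that equivalence on fixed-width values; the function names and the 32-bit default width are assumptions for illustration.

```python
def bit_set_in_inverted(x, n, width=32):
    """Is bit n set in (x ^ -1), evaluated at the given width?"""
    mask = (1 << width) - 1
    return (((x ^ -1) & mask) >> n) & 1 == 1


def bit_clear_in_original(x, n):
    """Is bit n clear in x? This is the bit test with the inverted condition."""
    return ((x >> n) & 1) == 0
```

Since the two predicates agree for every `x` and every in-range `n`, the inversion can be folded into the condition of a single bit test instead of being materialized as a separate `xor`.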