path: root/llvm
Commit message | Author | Age | Files | Lines
* [ValueTracking] Teach isKnownNonZero a new trick | James Molloy | 2015-09-24 | 2 | -0/+31
  If the shifter operand is a constant, and all of the bits shifted out are known to be zero, then if X is known non-zero at least one non-zero bit must remain.
  llvm-svn: 248508
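The reasoning in this commit message can be checked exhaustively on a small bit width. The sketch below is a standalone illustration over 8-bit values for the left-shift case only; it is not LLVM's ValueTracking code, and the helper name `shlKeepsNonZero` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: for a constant shift amount C, checks every non-zero
// 8-bit X whose top C bits are zero (the bits a left shift by C discards)
// and confirms that X << C is still non-zero.
bool shlKeepsNonZero(unsigned C) {
  assert(C < 8 && "shift amount must be smaller than the bit width");
  for (unsigned X = 1; X < 256; ++X) {
    unsigned ShiftedOut = X >> (8 - C); // the top C bits of X
    if (ShiftedOut != 0)
      continue; // precondition not met: a discarded bit may be set
    uint8_t Result = static_cast<uint8_t>(X << C);
    if (Result == 0)
      return false; // would contradict the claim
  }
  return true;
}
```

Calling `shlKeepsNonZero` for each shift amount from 0 to 7 covers every 8-bit case of the rule.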
* [objdump] Make iterator operator* return a reference. | Benjamin Kramer | 2015-09-24 | 1 | -1/+1
  This is closer to the expected behavior of an iterator and avoids awkward warnings from clang's -Wrange-loop-analysis below.
  llvm-svn: 248497
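As a rough illustration of why a reference-returning `operator*` matters, the sketch below defines a minimal iterator whose dereference yields `int &`, so a range-based `for` loop can bind a reference to each element instead of a temporary copy. `IntIter`, `IntRange`, and `doubleAll` are hypothetical names and are unrelated to the objdump code touched by this commit.

```cpp
#include <cassert>

// Hypothetical minimal iterator over a contiguous int array.
class IntIter {
  int *Ptr;

public:
  explicit IntIter(int *P) : Ptr(P) {}
  int &operator*() const { return *Ptr; } // a reference, not a copy
  IntIter &operator++() {
    ++Ptr;
    return *this;
  }
  bool operator!=(const IntIter &Other) const { return Ptr != Other.Ptr; }
};

// Hypothetical half-open range [First, Last) usable in a range-based for.
struct IntRange {
  int *First;
  int *Last;
  IntIter begin() const { return IntIter(First); }
  IntIter end() const { return IntIter(Last); }
};

// Doubles every element in place; this only works because operator*
// hands back a reference to the underlying element.
void doubleAll(IntRange R) {
  assert(R.First <= R.Last && "range must not be reversed");
  for (int &V : R)
    V *= 2;
}
```

If `operator*` returned `int` by value, the `int &V` binding above would not compile, and a `const auto &V` loop variable would silently bind to a temporary copy.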
* Regression Test: Deletes redundant/invalid test. | Mohammad Shahid | 2015-09-24 | 1 | -242/+0
  Removes absdiff_expand.ll regression test file which is invalid.
  Differential Revision: http://reviews.llvm.org/D11678
  llvm-svn: 248493
* [mips] Use PredicateControl for the MSA ASE instructions. NFC. | Daniel Sanders | 2015-09-24 | 3 | -22/+23
  Reviewers: vkalintiris
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D13092
  llvm-svn: 248486
* Codegen: Fix llvm.*absdiff semantic. | Mohammad Shahid | 2015-09-24 | 4 | -26/+245
  Fixes the overflow case of the llvm.*absdiff intrinsic; also updates the tests and LangRef.rst accordingly.
  Differential Revision: http://reviews.llvm.org/D11678
  llvm-svn: 248483
* [InstCombine] Recognize another bswap idiom. | Charlie Turner | 2015-09-24 | 2 | -6/+22
  Summary: The byte-swap recognizer can now notice that this
  ```
  uint32_t bswap(uint32_t x) {
    x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16;
    x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8;
    return x;
  }
  ```
  is a bswap. Fixes PR23863.
  Reviewers: nlewycky, hfinkel, hans, jmolloy, rengolin
  Subscribers: majnemer, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D12637
  llvm-svn: 248482
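To see that the two-step idiom quoted in this commit really is a byte swap, it can be compared against a straightforward reference implementation. This is a standalone sketch; `bswapIdiom`, `bswapReference`, and `idiomMatchesReference` are names chosen here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// The idiom from the commit message: swap the 16-bit halves, then swap
// the bytes inside each half.
uint32_t bswapIdiom(uint32_t x) {
  x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16;
  x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8;
  return x;
}

// Reference byte swap that moves each byte to its mirrored position.
uint32_t bswapReference(uint32_t x) {
  return (x << 24) | ((x & 0x0000FF00u) << 8) |
         ((x & 0x00FF0000u) >> 8) | (x >> 24);
}

// Convenience check used to compare both forms on one input.
bool idiomMatchesReference(uint32_t x) {
  return bswapIdiom(x) == bswapReference(x);
}
```

For example, both functions map `0x11223344` to `0x44332211`.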
* Introduce target hook for optimizing register copies | Matt Arsenault | 2015-09-24 | 14 | -158/+281
  Allow a target to do something other than search for copies that will avoid cross register bank copies. Implement for SI by only rewriting the most basic copies, so it should look through anything like a subregister extract.
  I'm not entirely satisfied with this because it seems like eliminating a reg_sequence that isn't fully used should work generically for all targets without them having to override something. However, it seems to be tricky to have a simple implementation of this without rewriting to invalid kinds of subregister copies on some targets. I'm not sure if there is currently a generic way to easily check if a subregister index would be valid for the current use. The current set of TargetRegisterInfo::get*Class functions don't quite behave like I would expect (e.g. getSubClassWithSubReg returns the maximal register class rather than the minimal), so I'm not sure how to make the generic test keep searching if SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg.
  Making the default implementation check for simple copies breaks a variety of ARM and x86 tests by producing illegal subregister uses.
  The ARM tests are not actually changed since it should still be using the same sharesSameRegisterFile implementation, this just relaxes them to not check for specific registers.
  llvm-svn: 248478
* AMDGPU: Return after instruction is processed. | Matt Arsenault | 2015-09-24 | 1 | -0/+4
  llvm-svn: 248476
* AMDGPU: Remove another unnecessary check from commuteInstruction | Matt Arsenault | 2015-09-24 | 1 | -5/+0
  llvm-svn: 248475
* AMDGPU: Add readonly to InstrMapping functions | Matt Arsenault | 2015-09-24 | 1 | -1/+15
  llvm-svn: 248474
* TableGen: Add LLVM_READONLY to generated InstrMapping functions | Matt Arsenault | 2015-09-24 | 1 | -1/+1
  These just read from a generated table.
  llvm-svn: 248473
* AMDGPU: Fix printing trailing whitespace for mubuf atomics | Matt Arsenault | 2015-09-24 | 2 | -11/+11
  llvm-svn: 248472
* Remove dead declaration | Matt Arsenault | 2015-09-24 | 1 | -1/+0
  llvm-svn: 248471
* Use new TokenFactor chain when merging stores | Matt Arsenault | 2015-09-24 | 2 | -5/+82
  If the stores are storing values from loads which partially alias the stores, we could end up placing the merged loads and stores on the same chain which has the potential to break. Each store may have a different chain dependency on only some of the original loads. Create a new TokenFactor to capture all of the required dependencies of the stores rather than assuming all stores can use the same chain.
  The testcase is a situation where this happens, although it does not have an observable change from this. The DAG nodes just happened to not be reordered before despite this missing chain dependency.
  This is based on an off-list report for an out of tree target which regressed due to r246307 and I haven't managed to find a case where the nodes do end up reordered with an in tree target.
  llvm-svn: 248468
* AMDGPU: Reduce number of copies emitted | Matt Arsenault | 2015-09-24 | 4 | -21/+19
  Instead of always inserting a copy in case the super register is itself a subregister, only extract to the super reg class if this is actually the case.
  This shouldn't really change codegen, but makes the output of SIFixSGPRCopies easier to read.
  llvm-svn: 248467
* Fix a think-o in which functions these should surround | Justin Bogner | 2015-09-24 | 1 | -2/+2
  llvm-svn: 248465
* Add some NDEBUG checks I accidentally dropped in r248462 | Justin Bogner | 2015-09-24 | 1 | -0/+2
  llvm-svn: 248464
* BasicAA: Move BasicAAResult::alias out-of-line. NFC | Justin Bogner | 2015-09-24 | 2 | -41/+42
  This makes the header more readable and cleans up some unnecessary header differences between NDEBUG and !NDEBUG.
  llvm-svn: 248462
* Add CFG Simplification pass after Loop Unswitching. | Michael Zolotukhin | 2015-09-24 | 1 | -0/+1
  Loop unswitching produces conditional branches with constant condition, and it's beneficial for later passes to clean this up with simplify-cfg. We do this after the second invocation of loop-unswitch, but not after the first one. Not doing so might cause problems for passes like LoopUnroll, whose estimate of loop body size would be less accurate.
  Reviewers: hfinkel
  Differential Revision: http://reviews.llvm.org/D13064
  llvm-svn: 248460
* [safestack] Fix compiler crash in the presence of stack restores. | Evgeniy Stepanov | 2015-09-24 | 2 | -7/+37
  A use can be emitted before def in a function with stack restore points but no static allocas.
  llvm-svn: 248455
* [IR] Teach `llvm::User` to co-allocate a descriptor. | Sanjoy Das | 2015-09-24 | 3 | -4/+83
  Summary: With this change, subclasses of `llvm::User` will be able to co-allocate a variable number of bytes (called a "descriptor") with the `llvm::User` instance. The co-allocated descriptor can later be accessed using `llvm::User::getDescriptor`. This will be used in later changes to implement operand bundles.
  This change steals one bit from `NumUserOperands`, but given that it is still 28 bits wide I don't think this will be a practical issue.
  This change does not allow allocating hung off uses with descriptors. This is only for simplicity, not for any fundamental reason; and we can easily add this functionality later if needed.
  Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet
  Subscribers: pete, sanjoy, llvm-commits
  Differential Revision: http://reviews.llvm.org/D12455
  llvm-svn: 248453
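The general idea of co-allocating a variable-sized byte region with an object can be sketched as below. This is only an illustration of the technique under assumptions made here: the class `Node`, its members, and the memory layout are hypothetical and do not reflect how `llvm::User` actually stores its descriptor or its operands.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class that places a caller-sized byte "descriptor" in the
// same heap allocation, immediately before the object itself.
class Node {
  std::size_t DescBytes; // descriptor size requested by the caller
  std::size_t Offset;    // distance from the allocation start to the object

  Node(std::size_t D, std::size_t O) : DescBytes(D), Offset(O) {}

public:
  static Node *create(std::size_t DescBytes) {
    // Round the descriptor size up so the object after it stays aligned.
    const std::size_t A = alignof(std::max_align_t);
    const std::size_t Offset = (DescBytes + A - 1) / A * A;
    void *Raw = ::operator new(Offset + sizeof(Node));
    return new (static_cast<unsigned char *>(Raw) + Offset)
        Node(DescBytes, Offset);
  }

  static void destroy(Node *N) {
    assert(N && "cannot destroy a null node");
    // Recover the start of the allocation before ending the lifetime.
    void *Raw = reinterpret_cast<unsigned char *>(N) - N->Offset;
    N->~Node();
    ::operator delete(Raw);
  }

  // The descriptor begins at the start of the allocation.
  unsigned char *descriptor() {
    return reinterpret_cast<unsigned char *>(this) - Offset;
  }
  std::size_t descriptorSize() const { return DescBytes; }
};
```

A single allocation keeps the descriptor and the object adjacent in memory, at the cost of requiring a matching custom `destroy` instead of a plain `delete`.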
* Add REQUIRES: default_triple to these testcases. | Adrian Prantl | 2015-09-24 | 2 | -1/+3
  llvm-svn: 248452
* Remove iterator_range::end. | Rui Ueyama | 2015-09-24 | 1 | -1/+0
  Because the current proposal does not include that member function, and we are trying to keep in line with that.
  llvm-svn: 248451
* Add iterator_range::end() predicate. | Rui Ueyama | 2015-09-23 | 1 | -0/+1
  llvm-svn: 248447
* [Unroll] When completely unrolling the loop, replace conditional branches with unconditional. | Michael Zolotukhin | 2015-09-23 | 1 | -2/+3
  Nothing is expected to change, except we do less redundant work in clean-up.
  Reviewers: hfinkel
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D12951
  llvm-svn: 248444
* Put profile variables of COMDAT functions to its own COMDAT group. | Wei Mi | 2015-09-23 | 2 | -9/+13
  In -fprofile-instr-generate compilation, to remove the redundant profile variables for the COMDAT functions, these variables are placed in the same COMDAT group as their associated function. This way when the COMDAT function is not picked by the linker, those profile variables will also not be output in the final binary. This may cause a warning when linking a mix of objects built with and without -fprofile-instr-generate.
  This patch puts the profile variables for COMDAT functions into their own COMDAT group to avoid the problem.
  Patch by xur.
  Differential Revision: http://reviews.llvm.org/D12248
  llvm-svn: 248440
* set div/rem default values to 'expensive' in TargetTransformInfo's cost model | Sanjay Patel | 2015-09-23 | 2 | -0/+35
  ...because that's what the cost model was intended to do.
  As discussed in D12882, this fix has a temporary unintended consequence for SimplifyCFG: it causes us to not speculate an fdiv. However, two wrongs make PR24818 right, and two wrongs make PR24343 act right even though it's really still wrong.
  I intend to correct SimplifyCFG and add to CodeGenPrepare to account for this cost model change and preserve the righteousness for the bug report cases.
  https://llvm.org/bugs/show_bug.cgi?id=24818
  https://llvm.org/bugs/show_bug.cgi?id=24343
  Differential Revision: http://reviews.llvm.org/D12882
  llvm-svn: 248439
* ARM: fix folding stack adjustment (again again again...) | Tim Northover | 2015-09-23 | 2 | -2/+4
  This time, the issue is that we weren't accounting for the possibility that aligned DPRs could have been stored after the final "push" in a prologue. When that happened we effectively moved a "sub sp, #N" from below the aligned stores to above them, and everything went to pot.
  To make it worse, I'd actually committed something testing that we produced wrong code, so the test update is tiny.
  llvm-svn: 248437
* dsymutil: Don't prune forward declarations inside a module definition. | Adrian Prantl | 2015-09-23 | 5 | -9/+17
  llvm-svn: 248428
* Fix this dsymutil testcase by not passing in a path to the modulemap file, | Adrian Prantl | 2015-09-23 | 2 | -3/+1
  so the lookup works as expected after prepending the oso-prepend-path. This manifested only on Windows, because "/" is not a relative path there.
  llvm-svn: 248423
* Remove handling of AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets | Philip Reames | 2015-09-23 | 2 | -2/+29
  Patch by: simoncook
  Unlike BitCasts, AddrSpaceCasts do not always produce an output the same size as their input, which was previously assumed. This fixes cases where two address spaces do not have the same size pointer, as an assertion failure would occur when trying to prove dereferenceability.
  LoopUnswitch is used in the particular test, but LICM also exhibits the same problem.
  Differential Revision: http://reviews.llvm.org/D13008
  llvm-svn: 248422
* Swap loop invariant GEP with loop variant GEP to allow more LICM. | Lawrence Hu | 2015-09-23 | 2 | -8/+183
  This patch changes the order of GEPs generated by the Splitting GEPs pass, especially when one of the GEPs has a constant and the base is loop invariant; then we will generate the GEP with the constant first when beneficial, to expose more cases for LICM.
  If originally Splitting GEP generates the following:
    do.body.i:
      %idxprom.i = sext i32 %shr.i to i64
      %2 = bitcast %typeD* %s to i8*
      %3 = shl i64 %idxprom.i, 2
      %uglygep = getelementptr i8, i8* %2, i64 %3
      %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032
      ...
  Now it generates:
    do.body.i:
      %idxprom.i = sext i32 %shr.i to i64
      %2 = bitcast %typeD* %s to i8*
      %3 = shl i64 %idxprom.i, 2
      %uglygep = getelementptr i8, i8* %2, i64 1032
      %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3
      ...
  For no-loop cases, the original way of generating GEPs seems to expose more CSE cases, so we don't change the logic for no-loop cases, and only limit our change to the specific case we are interested in.
  llvm-svn: 248420
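A source-level analogue of this reordering may make the benefit easier to see. In the sketch below, which uses hypothetical function names and plain byte pointers rather than LLVM IR, both functions read the same addresses; the second applies the constant offset to the loop-invariant base first, so that part of the address is computed once outside the loop, which is the kind of hoisting LICM performs.

```cpp
#include <cassert>
#include <cstddef>

// Original order: the loop-variant offset is applied first and the constant
// 1032 last, so the whole address is recomputed on every iteration.
int sumVariantFirst(const unsigned char *Base, const int *Idx, std::size_t N) {
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I) {
    const unsigned char *P = Base + static_cast<std::ptrdiff_t>(Idx[I]) * 4;
    Sum += *(P + 1032);
  }
  return Sum;
}

// New order: Base + 1032 does not depend on the loop, so it is computed once
// and only the variant offset remains inside the loop body.
int sumConstantFirst(const unsigned char *Base, const int *Idx,
                     std::size_t N) {
  assert(Base && "base pointer must be valid");
  const unsigned char *Hoisted = Base + 1032; // loop invariant
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += *(Hoisted + static_cast<std::ptrdiff_t>(Idx[I]) * 4);
  return Sum;
}
```

Both functions return the same value for the same inputs; only the placement of the invariant address computation differs.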
* [InstCombine] Preserve metadata when merging loads that are phi | Akira Hatanaka | 2015-09-23 | 2 | -6/+88
  arguments.
  Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following metadata:
    MD_tbaa
    MD_alias_scope
    MD_noalias
    MD_invariant_load
    MD_nonnull
    MD_range
  rdar://problem/17617709
  Differential Revision: http://reviews.llvm.org/D12710
  llvm-svn: 248419
* [docs] Update DominatorTree docs to clarify expectations around unreachable blocks | Philip Reames | 2015-09-23 | 2 | -2/+20
  Note: I am not trying to describe what "should be"; I'm only describing what is true today.
  This came out of my recent question to llvm-dev titled: When can the dominator tree not contain a node for a basic block?
  Differential Revision: http://reviews.llvm.org/D13078
  llvm-svn: 248417
* [x86] replace integer 'xor' ops with packed SSE FP 'xor' ops when operating on FP scalars | Sanjay Patel | 2015-09-23 | 2 | -4/+4
  Turn this:
    movd %xmm0, %eax
    movd %xmm1, %ecx
    xorl %eax, %ecx
    movd %ecx, %xmm0
  into this:
    xorps %xmm1, %xmm0
  This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428
  This is an extension of: http://reviews.llvm.org/rL248395
  llvm-svn: 248415
* [x86] replace integer 'or' ops with packed SSE FP 'or' ops when operating on FP scalars | Sanjay Patel | 2015-09-23 | 2 | -4/+4
  Turn this:
    movd %xmm0, %eax
    movd %xmm1, %ecx
    orl %eax, %ecx
    movd %ecx, %xmm0
  into this:
    orps %xmm1, %xmm0
  This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428
  This is an extension of: http://reviews.llvm.org/rL248395
  llvm-svn: 248409
* Fix the order of operations. | Adrian Prantl | 2015-09-23 | 1 | -1/+1
  llvm-svn: 248406
* Android support for SafeStack. | Evgeniy Stepanov | 2015-09-23 | 13 | -43/+190
  Add two new ways of accessing the unsafe stack pointer:
  * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988.
  * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls).
  This is a re-commit of a change in r248357 that was reverted in r248358.
  llvm-svn: 248405
* move call to convertIntLogicToFPLogic up; NFCI | Sanjay Patel | 2015-09-23 | 1 | -3/+3
  The BEXTR comments didn't make sense before, we may want to extend the FP logic transform to work on vectors, and this way is more beautiful.
  llvm-svn: 248404
* Temporarily make testcase more verbose to debug a msvc buildbot failure. | Adrian Prantl | 2015-09-23 | 1 | -1/+3
  llvm-svn: 248403
* [Bug 24848] Use range metadata to constant fold comparisons with constant values | Chen Li | 2015-09-23 | 2 | -2/+89
  Summary: This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848.
  When range metadata is provided, it should be used to constant fold comparisons with constant values.
  Reviewers: sanjoy, hfinkel
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D12988
  llvm-svn: 248402
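The kind of folding described here can be sketched for a single predicate. The example below assumes a value known to lie in a non-wrapping half-open range `[Lo, Hi)` and decides an unsigned less-than comparison against a constant when the range alone settles it. The `Fold` enum and `foldULT` are hypothetical names; this is not the code added by the patch, and LLVM's range metadata can also describe wrapping ranges, which this sketch ignores.

```cpp
#include <cassert>
#include <cstdint>

// Three-way result: the comparison is always false, always true, or cannot
// be decided from the range alone.
enum class Fold { False, True, Unknown };

// Hypothetical helper: V is known to satisfy Lo <= V < Hi. Returns the
// constant result of the unsigned comparison V < C where one exists.
Fold foldULT(uint64_t Lo, uint64_t Hi, uint64_t C) {
  assert(Lo < Hi && "sketch handles only non-empty, non-wrapping ranges");
  if (Hi - 1 < C)
    return Fold::True; // even the largest possible value is below C
  if (Lo >= C)
    return Fold::False; // even the smallest possible value is not below C
  return Fold::Unknown; // the range straddles C
}
```

For instance, a value known to be in `[0, 10)` compared with `< 20` folds to true, while the same comparison for a value in `[0, 30)` stays undecided.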
* [x86] move code for converting int logic to FP logic to a helper function; NFCI | Sanjay Patel | 2015-09-23 | 1 | -17/+37
  This is a follow-on to: http://reviews.llvm.org/rL248395
  so we can add the call to the or/xor combines too.
  llvm-svn: 248399
* dsymutil: Resolve forward decls for types defined in clang modules. | Adrian Prantl | 2015-09-23 | 3 | -44/+122
  This patch extends llvm-dsymutil's ODR type uniquing machinery to also resolve forward decls for types defined in clang modules.
  http://reviews.llvm.org/D13038
  llvm-svn: 248398
* dsymutil: print a warning when there is a module hash mismatch. | Adrian Prantl | 2015-09-23 | 8 | -14/+63
  This also updates the module binaries in the test directory because their module hash mismatched.
  llvm-svn: 248396
* [x86] replace integer 'and' ops with packed SSE FP 'and' ops when operating on FP scalars | Sanjay Patel | 2015-09-23 | 2 | -11/+22
  Turn this:
    movd %xmm0, %eax
    movd %xmm1, %ecx
    andl %eax, %ecx
    movd %ecx, %xmm0
  into this:
    andps %xmm1, %xmm0
  This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428
  Differential Revision: http://reviews.llvm.org/D13065
  llvm-svn: 248395
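Source code of roughly the following shape is one way such a pattern can arise: a float is reinterpreted as an integer, combined with a bitwise operator, and reinterpreted back. The helpers below are a hypothetical sketch (the names `toBits`, `fromBits`, `fpAnd`, `fpOr`, and `fpXor` are chosen here), and whether a given compiler version emits `andps`, `orps`, or `xorps` for them is not guaranteed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Reinterpret a float's bit pattern as an integer and back, using memcpy to
// stay within defined behaviour.
static uint32_t toBits(float F) {
  uint32_t B;
  std::memcpy(&B, &F, sizeof B);
  return B;
}
static float fromBits(uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof F);
  return F;
}

// Bitwise logic on the bit patterns of two scalar floats.
float fpAnd(float A, float B) { return fromBits(toBits(A) & toBits(B)); }
float fpOr(float A, float B) { return fromBits(toBits(A) | toBits(B)); }
float fpXor(float A, float B) { return fromBits(toBits(A) ^ toBits(B)); }
```

A common use is sign manipulation: XOR-ing a value with `-0.0f` flips only its sign bit.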
* [WebAssembly] Fix hasAddr64 being used before being initialized. | Dan Gohman | 2015-09-23 | 1 | -20/+36
  This reverts r248388 and fixes the underlying bug: hasAddr64 was initialized in runOnMachineFunction, but runOnMachineFunction isn't ever called in CodeGen/WebAssembly/global.ll since that testcase has no functions. The fix here is to use AsmPrinter's getPointerSize() as needed to determine the pointer size instead.
  llvm-svn: 248394
* [Inline] Use AssumptionCache from the right Function | Vedant Kumar | 2015-09-23 | 2 | -1/+32
  This changes the behavior of AddAlignmentAssumptions to match its comment, i.e., prove the asserted alignment in the context of the caller, not the callee.
  Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko who also saw a fix for the issue.
  rdar://22521387
  Differential Revision: http://reviews.llvm.org/D12997
  llvm-svn: 248390
* Fix CodeGen/WebAssembly/global.ll test under ASAN. | Alexander Kornienko | 2015-09-23 | 1 | -1/+1
  llvm-svn: 248388
* [DeadArgElim] Split the invoke successor edge | David Majnemer | 2015-09-23 | 2 | -5/+29
  Invoking a function which returns an aggregate can sometimes be transformed to return a scalar value. However, this means that we need to create an insertvalue instruction(s) to recreate the correct aggregate type. We achieved this by inserting an insertvalue instruction at the invoke's normal successor.
  However, this is not feasible if the normal successor uses the invoke's return value inside a PHI node.
  Instead, split the edge between the invoke and the unwind successor and create the insertvalue instruction in the new basic block. The new basic block's successor will be the old invoke successor which leaves us with IR which is well behaved.
  This fixes PR24906.
  llvm-svn: 248387
* [AArch64] Refactor pre- and post-index merge functions into a single function. NFC. | Chad Rosier | 2015-09-23 | 1 | -59/+16
  llvm-svn: 248377