summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Addition of interfaces the BE to conform to Table A-2 of ELF V2 ABI V1.1Nemanja Ivanovic2015-09-292-31/+43
| | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D13191 Back end portion of the fifth round of additions to altivec.h. llvm-svn: 248809
* [AArch64] Scale offsets by the size of the memory operation. NFC.Chad Rosier2015-09-291-17/+21
| | | | | | | | | The immediate in the load/store should be scaled by the size of the memory operation, not the size of the register being loaded/stored. This change gets us one step closer to forming LDPSW instructions. This change also enables pre- and post-indexing for halfword and byte loads and stores. llvm-svn: 248804
* [ValueTracking] Lower dom-conditions-dom-blocks and dom-conditions-max-uses ↵Igor Laevsky2015-09-291-2/+2
| | | | | | | | | | thresholds On some of our benchmarks this change shows about 50% compile time improvement without any noticeable performance difference. Differential Revision: http://reviews.llvm.org/D13248 llvm-svn: 248801
* [AArch64] Remove some redundant cases. NFC.Chad Rosier2015-09-291-23/+15
| | | | llvm-svn: 248800
* [ValueTracking] Teach isKnownNonZero about monotonically increasing PHIsJames Molloy2015-09-291-0/+20
| | | | | | | | If a PHI starts at a non-negative constant, monotonically increases (only adds of a constant are supported at the moment) and that add does not wrap, then the PHI is known never to be zero. llvm-svn: 248796
* Arguments spilled on the stack before a function call may haveJeroen Ketema2015-09-293-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alignment requirements, for example in the case of vectors. These requirements are exploited by the code generator by using move instructions that have similar alignment requirements, e.g., movaps on x86. Although the code generator properly aligns the arguments with respect to the displacement of the stack pointer it computes, the displacement itself may cause misalignment. For example if we have %3 = load <16 x float>, <16 x float>* %1, align 64 call void @bar(<16 x float> %3, i32 0) the x86 back-end emits: movaps 32(%ecx), %xmm2 movaps (%ecx), %xmm0 movaps 16(%ecx), %xmm1 movaps 48(%ecx), %xmm3 subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such. movl $0, 16(%esp) calll __bar To solve this, we need to make sure that the computed value with which the stack pointer is changed is a multiple af the maximal alignment seen during its computation. With this change we get proper alignment: subl $32, %esp movaps %xmm3, (%esp) Differential Revision: http://reviews.llvm.org/D12337 llvm-svn: 248786
* [InstCombine] Improve Vector Demanded Bits Through BitcastsSimon Pilgrim2015-09-291-35/+34
| | | | | | | | | | | | Currently SimplifyDemandedVectorElts can only peek through bitcasts if the vectors have the same number of elements. This patch fixes and enables some existing (disabled) code to support bitcasting to vectors with more/fewer elements. It currently only accepts cases when vectors alias cleanly (i.e. number of elements are an exact multiple of the other vector). This was added to improve the demanded vector elements support for SSE vector shifts which require the __m128i (<2 x i64>) argument type to be bitcast to the vector type for the builtin shift. I've added extra tests for various additional bitcasts. Differential Revision: http://reviews.llvm.org/D12935 llvm-svn: 248784
* [LoopUnswitch] Add block frequency analysis to recognize hot/cold regionsChen Li2015-09-291-0/+48
| | | | | | | | | | | | Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default. Reviewers: broune, silvas, dnovillo, reames Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D11605 llvm-svn: 248777
* [CMake] X86AsmParser: Prune redundant LINK_LIBS.NAKAMURA Takumi2015-09-291-3/+0
| | | | | | It is described in LLVMBuild.txt. llvm-svn: 248771
* Move dbg.declare intrinsics when merging and replacing allocas.Evgeniy Stepanov2015-09-292-4/+12
| | | | | | | | | | | | | | Place new and update dbg.declare calls immediately after the corresponding alloca. Current code in replaceDbgDeclareForAlloca puts the new dbg.declare at the end of the basic block. LLVM codegen has problems emitting debug info in a situation when dbg.declare appears after all uses of the variable. This usually kinda works for inlining and ASan (two users of this function) but not for SafeStack (see the pending change in http://reviews.llvm.org/D13178). llvm-svn: 248769
* RegisterPressure: LiveRegSet tracks register units not physregsMatthias Braun2015-09-291-1/+1
| | | | | | | There are always more physical registers and register units so the previous behaviour was correct but we can do with less memory. llvm-svn: 248767
* [WinEH] Fix ip2state table emission with funcletsReid Kleckner2015-09-285-56/+88
| | | | | | | Previously we were hijacking the old LandingPadInfo data structures to communicate our state numbers. Now we don't need that anymore. llvm-svn: 248763
* Fix unused variable warning in non-debug builds.Richard Trieu2015-09-281-1/+2
| | | | llvm-svn: 248754
* tidy up comments; NFCSanjay Patel2015-09-281-7/+7
| | | | llvm-svn: 248750
* add a FIXME for a CPU model check that should have an attribute insteadSanjay Patel2015-09-281-1/+2
| | | | llvm-svn: 248746
* move one-use check under the comment that describes it; NFCISanjay Patel2015-09-281-3/+2
| | | | llvm-svn: 248745
* [SCEV] Don't crash on pointer comparisonsSanjoy Das2015-09-281-2/+1
| | | | | | | | | | | | `ScalarEvolution::isImpliedCondOperandsViaNoOverflow` tries to cast the operand type of the comparison it is given to an `IntegerType`. This is incorrect because it could actually be simplifying a comparison between two pointers. Switch it to using `getTypeSizeInBits` instead, which does the right thing for both pointers and integers. Fixed PR24956. llvm-svn: 248743
* AMDGPU: Factor switch into separate functionMatt Arsenault2015-09-282-21/+30
| | | | llvm-svn: 248742
* AMDGPU: Fix splitting x16 SMRD loadsMatt Arsenault2015-09-281-2/+2
| | | | | | | | When used recursively, this would set the kill flag on the intermediate step from first splitting x16 to x8. llvm-svn: 248741
* AMDGPU: Fix moving SMRD loads with literal offsets on CIMatt Arsenault2015-09-281-3/+9
| | | | llvm-svn: 248740
* AMDGPU: Fix splitting SMRD with large offsetMatt Arsenault2015-09-281-1/+1
| | | | | | | | | | | | | The splitting of > 4 dword SMRD instructions if using an offset in an SGPR instead of an immediate was not setting the destination register, resulting an an instruction missing an operand which would assert later. Test will be included in a following commit which fixes a related issue. llvm-svn: 248739
* Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor2015-09-2813-174/+340
| | | | | | | | | | mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
* [GlobalOpt] Sort members of llvm.used deterministicallySean Silva2015-09-281-1/+2
| | | | | | | | | | | | | | | | | Patch by Jake VanAdrighem! Summary: Fix the way we sort the llvm.used and llvm.compiler.used members. This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr. Reviewers: silvas Subscribers: silvas, llvm-commits Differential Revision: http://reviews.llvm.org/D12851 llvm-svn: 248728
* Improve performance of SimplifyInstructionsInBlockFiona Glaser2015-09-281-12/+60
| | | | | | | | | | | | | | | | | | | | 1. Use a worklist, not a recursive approach, to avoid needless revisitation and being repeatedly forced to jump back to the start of the BB if a handle is invalidated. 2. Only insert operands to the worklist if they become unused after a dead instruction is removed, so we don’t have to visit them again in most cases. 3. Use a SmallSetVector to track the worklist. 4. Instead of pre-initting the SmallSetVector like in DeadCodeEliminationPass, only put things into the worklist if they have to be revisited after the first run-through. This minimizes how much the actual SmallSetVector gets used, which saves a lot of time. llvm-svn: 248727
* [mips][p5600] Added P5600 processor and initial scheduler.Daniel Sanders2015-09-284-0/+405
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The P5600 is an out-of-order, superscalar implementation of the MIPS32R5 architecture. The scheduler has a few missing details (see the 'Tricky Instructions' section and some quirks of the P5600 are deliberately omitted due to implementation difficulty and low chance of significant benefit (e.g. the predicate on P5600WriteEitherALU). However, testing on SingleSource is showing significant performance benefits on some apps (seven in the 10-30% range) and only one significant regression (12%) when -pre-RA-sched=linearize is given. Without -pre-RA-sched=linearize the results are more variable. Some do even better (up to 55% improvement) but increased numbers of copies are slowing others down (up to 12%). Overall, the scheduler as it currently stands is a 2.4% win with -pre-RA-sched=linearize and a 2.7% win without -pre-RA-sched=linearize. I'm sure we can improve on this further. For completeness, the FPGA this was tested on shows some failures with and without the P5600 scheduler. These appear to be scheduling related since the two test runs have fairly different sets of failing tests even after accounting for other factors (e.g. spurious connection failures) however it's not P5600 specific since we also get some for the generic scheduler. Reviewers: vkalintiris Subscribers: mpf, llvm-commits, atrick, vkalintiris Differential Revision: http://reviews.llvm.org/D12193 llvm-svn: 248725
* Introduce !align metadata for load instructionArtur Pilipenko2015-09-282-0/+9
| | | | | | | | Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D12853 llvm-svn: 248721
* [InstSimplify] Fold simple known implications to truePhilip Reames2015-09-281-0/+47
| | | | | | | | | | This was split off of http://reviews.llvm.org/D13040 to make it easier to test the correctness of the implication logic. For the moment, this only handles a single easy case which shows up when eliminating and combining range checks. In the (near) future, I plan to extend this for other cases which show up in range checks, but I wanted to make those changes incrementally once the framework was in place. At the moment, the implication logic will be used by three places. One in InstSimplify (this review) and two in SimplifyCFG (http://reviews.llvm.org/D13040 & http://reviews.llvm.org/D13070). Can anyone think of other locations this style of reasoning would make sense? Differential Revision: http://reviews.llvm.org/D13074 llvm-svn: 248719
* [LoopReroll] Ignore debug intrinsicsWeiming Zhao2015-09-281-1/+20
| | | | | | | | | Originally, debug intrinsics and annotation intrinsics may prevent the loop to be rerolled, now they are ignored. Differential Revision: http://reviews.llvm.org/D13150 llvm-svn: 248718
* [WebAssembly] Support for direct call and call_indirect.Dan Gohman2015-09-281-6/+8
| | | | llvm-svn: 248716
* [mips] Handling of immediates bigger than 16 bitsZoran Jovanovic2015-09-282-0/+115
| | | | | | Differential Revision: http://reviews.llvm.org/D10539 llvm-svn: 248706
* [ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall()Artyom Skrobov2015-09-282-25/+25
| | | | | | | | | | | | | | | supportsTailCall() has two callers. Both of them double-check isThumb1Only(), and refuse to proceed with tail-calling in that case. Therefore, it makes sense to move this check to ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized; and to eliminate the extra checks at the call sites. Following a review comment, added an "assert(supportsTailCall())" in IsEligibleForTailCall. NFC. llvm-svn: 248703
* [DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walkingHal Finkel2015-09-281-0/+2
| | | | | | | | | | | | | | | | When AA is being used, non-aliasing stores are canonicalized to use the same chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of this by looking only as users of a store's chain operand. However, user iteration is not result-number specific, we need to check that the use is as a chain operand, and not via some other operand. It is certainly possible to have another potentially-aliasing store, which shares the first's base pointer, and uses the first's chain's node via some other operand. Failure to catch this situation caused, at least in the included test case, an assert later because the relative sequence-number ordering caused later replacement to create a cycle in the DAG. llvm-svn: 248698
* Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFCCraig Topper2015-09-284-9/+9
| | | | llvm-svn: 248693
* AsmWriter: Print the argument names in declarations while debuggingJustin Bogner2015-09-271-23/+31
| | | | | | | | | | | | | When llvm declarations have argument names, it's helpful to actually print those names when debugging. Arguably, it'd be nice to print them all the time, but that would mean the IR we output wouldn't round trip through bitcode, which doesn't store the names. Make the varous print() methods in AsmWriter optionally print "for debug" and set that flag in the dump() methods. The only thing this does differently for now is print the argument names in declarations. llvm-svn: 248692
* Silence clang warning: variable ‘Status’ set but not used.Yaron Keren2015-09-271-1/+1
| | | | llvm-svn: 248691
* [SCEV] identical instructions don't compute equal valuesSanjoy Das2015-09-271-1/+8
| | | | | | | | | | | | Before this change `HasSameValue` would return true for distinct `alloca` instructions if they happened to be allocating the same type (`alloca` instructions are not specified as reading memory). This change adds an explicit whitelist of instruction types for which "identical" instructions compute the same value. Fixes PR24952. llvm-svn: 248690
* [InstCombine] fold zexts and constants into a phi (PR24766)Sanjay Patel2015-09-272-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is one step towards solving PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766 We were not producing the same IR for these two C functions because the store to the temp bool causes extra zexts: #include <stdbool.h> bool switchy(char x1, char x2, char condition) { bool conditionMet = false; switch (condition) { case 0: conditionMet = (x1 == x2); break; case 1: conditionMet = (x1 <= x2); break; } return conditionMet; } bool switchy2(char x1, char x2, char condition) { switch (condition) { case 0: return (x1 == x2); case 1: return (x1 <= x2); } return false; } As noted in the code comments, this test case manages to avoid the more general existing phi optimizations where there are only 2 phi inputs or where there are no constant phi args mixed in with the casts ops. It seems like a corner case, but if we don't catch it, then I don't think we can get SimplifyCFG to further optimize towards the canonical form for this function shown in the bug report. Differential Revision: http://reviews.llvm.org/D12866 llvm-svn: 248689
* [EH] Create removeUnwindEdge utilityJoseph Tremoulet2015-09-273-99/+69
| | | | | | | | | | | | | | | | | Summary: Factor the code that rewrites invokes to calls and rewrites WinEH terminators to their "unwind to caller" equivalents into a helper in Utils/Local, and use it in the three places I'm aware of that need to do this. Reviewers: andrew.w.kaylor, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13152 llvm-svn: 248677
* [BranchProbability] Manually round the floating point output.Benjamin Kramer2015-09-261-28/+7
| | | | | | | | | | | llvm::format compiles down to snprintf which has no defined rounding for floating point arguments, and MSVC has implemented it differently from what the BSD libcs and glibc do. Try to emulate the glibc rounding behavior to avoid changing tests. While there simplify code a bit and move trivial methods inline. llvm-svn: 248665
* AMDGPU: Remove hasPostISelHook from most instructionsMatt Arsenault2015-09-262-12/+19
| | | | | | | Since this is only needed for VOP3 and a few other special case instructions, stop setting it on everything. llvm-svn: 248657
* AMDGPU: Switch over reg class size instead of checking all super classesMatt Arsenault2015-09-262-22/+36
| | | | | | This gets isSGPRClass out of my profile of SIFixSGPRCopies. llvm-svn: 248656
* AMDGPU: Don't handle invalid reg classes in helper functionsMatt Arsenault2015-09-261-6/+0
| | | | | | | No tests hit these and it would be better to have checks like this explicit where they are used. llvm-svn: 248655
* AMDGPU: address -Winconsistent-missing-overrideSaleem Abdulrasool2015-09-261-1/+1
| | | | | | Add missing override. NFC. llvm-svn: 248652
* AMDGPU: Set CopyCost of register classesMatt Arsenault2015-09-261-8/+32
| | | | | | | | These require multiple mov instructions to copy, but the default value is that 1 instruction is needed. I'm not sure if this actually changes anything. llvm-svn: 248651
* [Bug 24848] Use range metadata to constant fold comparisons between two valuesChen Li2015-09-261-0/+26
| | | | | | | | | | | | | | | Summary: This is the second part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. If both operands of a comparison have range metadata, they should be used to constant fold the comparison. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13177 llvm-svn: 248650
* AMDGPU: VOP3b definition cleanupsMatt Arsenault2015-09-262-18/+25
| | | | llvm-svn: 248647
* AMDGPU: Fix sched model for VOP2b instructionsMatt Arsenault2015-09-261-6/+7
| | | | | | | | Trying to use the version with the explicit output operand would complain because of the missing WriteSALU. I'm not sure why it doesn't complain about this with the implicit VCC def. llvm-svn: 248646
* [WebAssembly] Rename several functions and types according to the new spec.Dan Gohman2015-09-2610-158/+158
| | | | llvm-svn: 248644
* [ARM] Don't generate clrex for pre-v7 targets.Ahmed Bougacha2015-09-261-0/+2
| | | | | | Since r248294, we emit clrex, but it doesn't exist on v6. llvm-svn: 248640
* [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts'Sanjoy Das2015-09-251-0/+16
| | | | | | | | | | | | | | | | | | | | | | Summary: If the trip count of a specific backedge is `N`, then we know that backedge is effectively guarded by the condition `{0,+,1} u< N`. This change teaches SCEV to use this condition to prove things in `isLoopBackedgeGuardedByCond`. Depends on D12948 Depends on D12949 The original checkin, r248608 had to be backed out due to an issue with a ObjCXX unit test. That issue is now fixed, so re-landing. Reviewers: atrick, reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12950 llvm-svn: 248638
OpenPOWER on IntegriCloud