|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| | single attribute in the future.
llvm-svn: 170502 | 
| | 
| 
| 
| 
| 
| 
| | load / store pair. It's not legal to use a wider load than the size of
the remaining bytes if it's the first pair of load / store.
llvm-svn: 170018 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | mention the inline memcpy / memset expansion code is a mess?
This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.
llvm-svn: 169959 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.
llvm-svn: 169954 | 
| | 
| 
| 
| 
| 
| | f64 load / store on non-SSE2 x86 targets.
llvm-svn: 169944 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.
llvm-svn: 169929 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | 1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
llvm-svn: 169791 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).
rdar://12771555
llvm-svn: 169536 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.
define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}
This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr
The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.
rdar://12771555
llvm-svn: 169459 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | r165941: Resubmit the changes to llvm core to update the functions to
         support different pointer sizes on a per address space basis.
Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.
However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.
In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.
In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.
This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.
llvm-svn: 167222 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.
These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.
Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)
After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.
Summary of reverted revisions:
r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | checks to avoid performing compile-time arithmetic on PPCDoubleDouble.
Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.
llvm-svn: 166958 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.
llvm-svn: 166578 | 
| | 
| 
| 
| 
| 
| | different pointer sizes on a per address space basis.
llvm-svn: 165941 | 
| | 
| 
| 
| | llvm-svn: 165747 | 
| | 
| 
| 
| 
| 
| | per address space pointer sizes to be optimized correctly.
llvm-svn: 165726 | 
| | 
| 
| 
| 
| 
| 
| | We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
llvm-svn: 165488 | 
| | 
| 
| 
| | llvm-svn: 165402 | 
| | 
| 
| 
| 
| 
| | No functionality change.
llvm-svn: 164924 | 
| | 
| 
| 
| 
| 
| | See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768 | 
| | 
| 
| 
| | llvm-svn: 164767 | 
| | 
| 
| 
| 
| 
| 
| | The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.
llvm-svn: 164725 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.
llvm-svn: 163743 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | behaviour (converting NaN values between float and double).
SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget);
should not be used when Val is not a simple constant (as the comment in
SelectionDAG.h indicates). This patch avoids using this function
when folding an unknown constant through a bitcast, where it cannot be
guaranteed that Val will be a simple constant.
llvm-svn: 163703 | 
| | 
| 
| 
| 
| 
| | This folding happens as early as possible for performance reasons, and to make sure it isn't foiled by other transforms (e.g. forming FMAs).
llvm-svn: 163519 | 
| | 
| 
| 
| | llvm-svn: 163518 | 
| | 
| 
| 
| 
| 
| | concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types.
llvm-svn: 163511 | 
| | 
| 
| 
| 
| 
| 
| 
| | allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).
llvm-svn: 163299 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.
The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.
This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.
llvm-svn: 162733 | 
| | 
| 
| 
| 
| 
| | Reviewed offline by chandlerc.
llvm-svn: 162623 | 
| | 
| 
| 
| 
| 
| | various rounding modes.  Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807 | 
| | 
| 
| 
| 
| 
| 
| | This adds support for TargetIndex operands during isel. The meaning of
these (index, offset, flags) operands is entirely defined by the target.
llvm-svn: 161453 | 
| | 
| 
| 
| 
| 
| | loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987 | 
| | 
| 
| 
| 
| 
| | No functionality change.
llvm-svn: 160501 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160305 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160229 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.
The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.
llvm-svn: 159312 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
NV_CONTRIB
llvm-svn: 157479 | 
| | 
| 
| 
| | llvm-svn: 157195 | 
| | 
| 
| 
| | llvm-svn: 155957 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.
This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.
DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.
llvm-svn: 155248 | 
| | 
| 
| 
| 
| 
| 
| 
| | one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point.
Zap a testcase that this allows us to completely fold away.
llvm-svn: 154447 | 
| | 
| 
| 
| 
| 
| | during instruction selection.
llvm-svn: 154113 | 
| | 
| 
| 
| 
| 
| 
| 
| | This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
llvm-svn: 154011 | 
| | 
| 
| 
| 
| 
| 
| 
| | This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.
llvm-svn: 153817 | 
| | 
| 
| 
| 
| 
| 
| 
| | Type legalization can zero-extend the elements of the build_vector node, so,
for example, we may have an <8 x i8> with i32 elements of value 255. That
should return 'true' for the vector being all ones.
llvm-svn: 153203 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | for workaround of g++-4.4's miscompilation.
It caused MSP430DAGToDAGISel::SelectIndexedBinOp() to be miscompiled.
When two ReplaceUses()'s are expanded as inline, vtable in base class is stored to latter (ISelUpdater)ISU.
llvm-svn: 152877 | 
| | 
| 
| 
| | llvm-svn: 152614 | 
| | 
| 
| 
| | llvm-svn: 152613 |