| Commit message | Author | Age | Files | Lines |
llvm-svn: 169907
truncation is now done on scalars.
llvm-svn: 169904
because that method is only getting called for MCInstFragment. These
fragments aren't even generated when RelaxAll is set, which is why the
flag reference here is superfluous. Removing it simplifies the code
with no harmful effects.
An assertion is added higher up to make sure this path is never
reached.
llvm-svn: 169886
llvm-svn: 169881
llvm-svn: 169880
Use explicitly aligned store and load instructions to deal with argument and
retval shadow. This matters when an argument's alignment is higher than the
__msan_param_tls alignment (which is the case with __m128i).
llvm-svn: 169859
llvm-svn: 169854
instead of the instruction. I've left a forwarding wrapper for the
instruction so users with the instruction don't need to create
a GEPOperator themselves.
This lets us remove the copy of this code in instsimplify.
I've looked at most of the other copies of similar code, and this is the
only one I've found that is actually exactly the same. The one in
InlineCost is very close, but it requires re-mapping non-constant
indices through the cost analysis value simplification map. I could add
direct support for this to the generic routine, but it seems overly
specific.
llvm-svn: 169853
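For illustration, a rough sketch of what such a constant-offset accumulation computes. The layout table, element sizes, and function name are invented for the example; LLVM's real routine works through GEPOperator and DataLayout.

```python
# Toy constant-offset accumulation over GEP-like index steps.
# Layouts and sizes below are assumptions for the example only.
STRUCT_FIELD_OFFSETS = {"S": [0, 8, 12]}   # byte offset of each field of a toy struct S
ELEM_SIZE = {"i32": 4, "i64": 8, "S": 16}  # byte size of each toy type

def accumulate_constant_offset(steps):
    """steps: (kind, type, index) tuples; index is None when non-constant.
    Returns the total byte offset, or None if any index is non-constant."""
    offset = 0
    for kind, ty, idx in steps:
        if idx is None:           # non-constant index: bail out
            return None
        if kind == "struct":      # struct field access: fixed field offset
            offset += STRUCT_FIELD_OFFSETS[ty][idx]
        else:                     # array/pointer step: scale by element size
            offset += ELEM_SIZE[ty] * idx
    return offset
```

For example, stepping over one 16-byte `S` and then selecting field 2 (at offset 12) accumulates to 28 bytes.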
the GEP instruction class.
This is part of the continued refactoring and cleaning of the
infrastructure used by SROA. This particular operation is also done in
a few other places which I'll try to refactor to share this
implementation.
llvm-svn: 169852
instead of EVTs.
llvm-svn: 169851
MVTs, instead of EVTs.
Accordingly, add bitsLT (and similar) to MVT.
llvm-svn: 169850
from EVT.
llvm-svn: 169849
EVTs.
llvm-svn: 169848
EVTs.
llvm-svn: 169847
of EVT.
llvm-svn: 169845
instead of EVTs.
llvm-svn: 169844
llvm-svn: 169843
EVT.
llvm-svn: 169842
llvm-svn: 169841
llvm-svn: 169840
llvm-svn: 169839
EVT.
Accordingly, change RegDefIter to contain MVTs instead of EVTs.
llvm-svn: 169838
Accordingly, add helper functions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first in a series of patches.
llvm-svn: 169837
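A toy model of the MVT/EVT relationship this series relies on. The classes and the type set are invented for illustration and are not LLVM's API; only the idea that getSimpleValueType asserts on non-simple types follows the text.

```python
# MVT is a closed set of simple machine types; an EVT may also be an
# extended type. getSimpleValueType is only valid when the EVT is simple.
MVTS = {"i1", "i8", "i16", "i32", "i64", "f32", "f64"}

class EVT:
    def __init__(self, name):
        self.name = name
    def is_simple(self):
        return self.name in MVTS
    def get_simple_vt(self):
        # Callers are expected to know the type is simple, as in the
        # interfaces this series migrates from EVT to MVT.
        assert self.is_simple(), "expected a simple value type"
        return self.name
```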
llvm-svn: 169819
llvm-svn: 169814
llvm-svn: 169813
llvm-svn: 169811
try to reduce the width of this load, and would end up transforming:
  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)
We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zero-extends the
result. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimizes the code down to the proper single instruction:
  movswl 6(...),%eax
which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:
  movl 6(...), %eax
llvm-svn: 169802
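The arithmetic behind the miscompile can be modeled directly. The widths (i48/i64/i32) come from the expressions above; addressing is omitted, so the buggy version only shows the lost sign extension, not the out-of-bounds read.

```python
def sext(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide value to to_bits bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

def combined_correctly(v48):
    # (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
    v64 = sext(v48, 48, 64)
    return (v64 >> 32) & 0xFFFFFFFF   # lshr is a logical shift on the i64

def combined_buggy(v48):
    # (truncate (zextload i32 <ptr+4> as i64) to i32): the top 16 bits of
    # the i48 end up zero-extended, losing the sign.
    return (v48 >> 32) & 0xFFFF
```

The two agree for non-negative values, which is why the combine looked safe; they diverge exactly when the i48 is negative.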
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles. Testing with the external/internal nightly
test suite reveals no change in compile time performance. Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures. All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.
In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.
Part of rdar://12553082
llvm-svn: 169796
controls each of the abbreviation sets (only a single one at the
moment) and computes offsets separately as well for each set
of DIEs.
No real functional change; the ordering of abbreviations for the skeleton
CU changed, but only because we're computing them in a separate order. Fix
the testcase not to care.
llvm-svn: 169793
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When memcpy'ing from a constant string, do *not* replace the load with a
constant if it's not possible to materialize an integer immediate with a single
instruction (requires a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicate they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
llvm-svn: 169791
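Point 1 can be sketched as follows. The 16-byte chunk and the 17-31 byte range follow the text; the code itself is an illustrative toy, not the SelectionDAG lowering.

```python
# Cover a 17-31 byte copy with two 16-byte chunks whose ranges may
# overlap, instead of a scalar loop for the trailing bytes.
CHUNK = 16

def copy_with_overlap(dst, src, n):
    assert CHUNK < n <= 2 * CHUNK - 1    # 17..31 bytes
    dst[0:CHUNK] = src[0:CHUNK]          # first 16-byte chunk ("movups")
    dst[n - CHUNK:n] = src[n - CHUNK:n]  # second chunk, overlapping the first
```

Redundantly copying the overlapped middle bytes is harmless for memcpy, and two wide operations beat a byte-by-byte tail.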
Analyse Phis under the starting assumption that they are NoAlias, and
recursively look at their inputs. If they MayAlias/MustAlias, there must be
an input pair that makes them so.
Addresses bug 14351.
llvm-svn: 169788
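A toy version of that recursion: assume the phi pair is NoAlias, then check every input; any input that may alias refutes the assumption. The value representation and return strings are invented; BasicAA's real query operates on LLVM values.

```python
def alias(a, b, assumed=frozenset()):
    if (a, b) in assumed or (b, a) in assumed:
        return "NoAlias"                  # the starting assumption
    if isinstance(a, tuple) and a[0] == "phi":
        # Recurse into each input under the NoAlias assumption.
        results = {alias(x, b, assumed | {(a, b)}) for x in a[1]}
        return "NoAlias" if results == {"NoAlias"} else "MayAlias"
    if isinstance(b, tuple) and b[0] == "phi":
        return alias(b, a, assumed)
    return "MustAlias" if a == b else "NoAlias"
```

A phi of %x and %y is NoAlias with an unrelated %z, but an input pair (%x, %x) makes it MayAlias with %x.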
InitSections is called before the MCContext is initialized, it could cause
duplicate temporary symbols to be emitted later (after context initialization
resets the temporary label counter).
llvm-svn: 169785
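The failure mode can be reproduced with a toy counter. The class is invented; only the `.Ltmp` naming follows MC's temporary-label convention.

```python
# Temporary-label names come from a counter owned by the context, so a
# label emitted before the context is initialized collides with one
# emitted afterwards.
class Context:
    def __init__(self):
        self.tmp_counter = 0
    def create_temp_symbol(self):
        name = f".Ltmp{self.tmp_counter}"
        self.tmp_counter += 1
        return name

ctx = Context()
early = ctx.create_temp_symbol()   # emitted by the too-early InitSections
ctx = Context()                    # context initialization resets the counter
late = ctx.create_temp_symbol()    # same name again: a duplicate symbol
```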
llvm-svn: 169780
llvm-svn: 169779
llvm-svn: 169776
llvm-svn: 169774
llvm-svn: 169773
llvm-svn: 169772
llvm-svn: 169771
llvm-svn: 169762
getMipsRegisterNumbering and use MCRegisterInfo::getEncodingValue instead.
llvm-svn: 169760
going on and makes a lot of the terminology in comments make more sense.
llvm-svn: 169758
llvm-svn: 169757
llvm-svn: 169756
The `-mno-red-zone' flag wasn't being propagated to the functions that code
coverage generates. This let some of them use the red zone when that wasn't
allowed.
<rdar://problem/12843084>
llvm-svn: 169754
  while (i--)
    sum += A[i];
llvm-svn: 169752
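For reference, a Python model of the C loop above. `while (i--)` tests i and then decrements it, so the body sees i = len(A)-1 down to 0: the loop simply sums the array in reverse order.

```python
def reverse_sum(A):
    i = len(A)
    total = 0
    while i:          # C: while (i--)
        i -= 1        # the decrement happens before the body runs
        total += A[i]
    return total
```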
the assembler. This is useful in order to know how the numbers add up,
since in particular the Align fragments account for a non-trivial
portion of the emitted fragments (especially at -O0, which sets
relax-all).
llvm-svn: 169747
misched used GetUnderlyingObject in order to break false load/store
dependencies, and the -enable-aa-sched-mi feature similarly relied on
GetUnderlyingObject in order to ensure it is safe to use the alias analysis.
Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so
(especially due to LSR) all of these mechanisms failed for
induction-variable-dependent loads and stores inside loops.
This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects
(which will recurse through phi and select instructions) in misched.
Andy reviewed, tested and simplified this patch; thanks!
llvm-svn: 169744
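The difference can be sketched over a toy IR: stopping at a phi (as the singular GetUnderlyingObject does) loses information, while recursing through phi and select collects every possible underlying object. Node shapes like `("gep", base)` are invented for illustration.

```python
def underlying_objects(v, out=None):
    if out is None:
        out = set()
    kind = v[0]
    if kind == "gep":                 # ("gep", base): look through the GEP
        underlying_objects(v[1], out)
    elif kind == "phi":               # ("phi", inputs): recurse into every input
        for inp in v[1]:
            underlying_objects(inp, out)
    elif kind == "select":            # ("select", tval, fval): both arms
        underlying_objects(v[1], out)
        underlying_objects(v[2], out)
    else:                             # identified object, e.g. ("alloca", name)
        out.add(v)
    return out
```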
Accidental commit... git svn betrayed me. Sorry for the noise.
llvm-svn: 169741