| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
No functionality change.
Reviewed by Tim Northover.
llvm-svn: 197172
|
| |
|
|
|
|
| |
Reviewed by Richard Sandiford.
llvm-svn: 197170
|
| |
|
|
|
|
| |
It means exactly the same and is just a bit shorter.
llvm-svn: 197169
|
| |
|
|
|
|
|
| |
GCC 4.7 changed the MinGW ABI. On the LLVM side it means that sret functions
don't pop the stack.
llvm-svn: 197163
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Copy patterns with float/double types are enough.
- Fix typos in test case names that were using v1fx.
- No ACLE intrinsic uses the v1f32 type, and there is no conflict between
NEON and non-NEON overlapped operations with this type, so there is no need to
support operations with this type.
- Remove v1f32 from FPR32 register and disallow v1f32 as a legal type for
operations.
Patch by Ana Pazos!
llvm-svn: 197159
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a vector packed single/double fp operation followed by a vector insert.
The effect is that the backend converts the packed fp instruction
followed by a vector insert into an SSE or AVX scalar fp instruction.
For example, given the following code:
__m128 foo(__m128 A, __m128 B) {
__m128 C = A + B;
return (__m128) {C[0], A[1], A[2], A[3]};
}
previously we generated:
addps %xmm0, %xmm1
movss %xmm1, %xmm0
we now generate:
addss %xmm1, %xmm0
llvm-svn: 197145
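The equivalence this pattern relies on can be checked with a small C++ sketch that models a four-lane vector as a plain array. `Vec4`, `packedAddThenInsert`, and `scalarAdd` are hypothetical names used only for illustration; they are not LLVM or SSE APIs.

```cpp
#include <array>

// Hypothetical stand-in for a 4 x float vector register.
using Vec4 = std::array<float, 4>;

// Models the old sequence: add every lane, then keep only lane 0 of the
// sum and take the remaining lanes from A.
Vec4 packedAddThenInsert(const Vec4 &A, const Vec4 &B) {
  Vec4 C;
  for (int I = 0; I < 4; ++I)
    C[I] = A[I] + B[I];
  return {C[0], A[1], A[2], A[3]};
}

// Models the new sequence: only lane 0 is added, and the upper lanes of A
// pass through unchanged.
Vec4 scalarAdd(const Vec4 &A, const Vec4 &B) {
  Vec4 R = A;
  R[0] = A[0] + B[0];
  return R;
}
```

Both functions return the same vector for any inputs, which is why a single scalar instruction can stand in for the packed operation followed by the insert.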
|
| |
|
|
| |
llvm-svn: 197136
|
| |
|
|
|
|
| |
scalar_to_vector of vector types having more than one element.
llvm-svn: 197135
|
| |
|
|
|
|
|
| |
I don't know why this did not show up earlier. This code has been
around for ages.
llvm-svn: 197119
|
| |
|
|
|
|
| |
x) -> __exp10(x)
llvm-svn: 197109
|
| |
|
|
| |
llvm-svn: 197100
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Aside from a few minor latency corrections, the major change here is a new
hazard recognizer which focuses on better dispatch-group formation on the
POWER7. As with the PPC970's hazard recognizer, the most important thing it
does is avoid load-after-store hazards within the same dispatch group. It uses
the POWER7's special dispatch-group-terminating nop instruction (instead of
inserting multiple regular nop instructions). This new hazard recognizer makes
use of the scheduling dependency graph itself, built using AA information, to
robustly detect the possibility of load-after-store hazards.
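The core idea of avoiding a load-after-store hazard within one dispatch group can be sketched roughly as follows. This is a simplified illustration, not the recognizer from the patch: `Instr`, `StoreDep`, `groupBreaks`, and the `MaxGroup` parameter are all hypothetical, and the precomputed `StoreDep` field stands in for the information the real code takes from the scheduling dependency graph.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction record. StoreDep is the index of an earlier
// store this load may read from, or -1 if there is none.
struct Instr {
  bool IsLoad;
  int StoreDep;
};

// Returns the indices before which a group-terminating nop would be placed.
// MaxGroup is an assumed dispatch-group capacity, supplied by the caller.
std::vector<std::size_t> groupBreaks(const std::vector<Instr> &Seq,
                                     std::size_t MaxGroup) {
  std::vector<std::size_t> Breaks;
  std::size_t GroupStart = 0;
  for (std::size_t I = 0; I < Seq.size(); ++I) {
    // A full group ends on its own, so no nop is needed.
    if (I - GroupStart == MaxGroup)
      GroupStart = I;
    const Instr &In = Seq[I];
    // Hazard: the load depends on a store that is in the current group.
    if (In.IsLoad && In.StoreDep >= 0 &&
        static_cast<std::size_t>(In.StoreDep) >= GroupStart) {
      Breaks.push_back(I);
      GroupStart = I; // the load starts a fresh group
    }
  }
  return Breaks;
}
```

In this sketch a single break per hazard corresponds to the single group-terminating nop mentioned above, rather than a run of regular nops.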
significant test-suite performance changes (the error bars are 99.5% confidence
intervals based on 5 test-suite runs both with and without the change --
speedups are negative):
speedups:
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
-0.55171% +/- 0.333168%
MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl
-17.5576% +/- 14.598%
MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl
-29.5708% +/- 7.09058%
MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt
-34.9471% +/- 11.4391%
SingleSource/Benchmarks/BenchmarkGame/puzzle
-25.1347% +/- 11.0104%
SingleSource/Benchmarks/Misc/flops-8
-17.7297% +/- 9.79061%
SingleSource/Benchmarks/Shootout-C++/ary3
-35.5018% +/- 23.9458%
SingleSource/Regression/C/uint64_to_float
-56.3165% +/- 25.4234%
SingleSource/UnitTests/Vectorizer/gcc-loops
-18.5309% +/- 6.8496%
regressions:
MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
18.351% +/- 12.156%
SingleSource/Benchmarks/Shootout-C++/methcall
27.3086% +/- 14.4733%
llvm-svn: 197099
|
| |
|
|
|
|
| |
intrinsics to use f32 types, rather than their vector equivalents.
llvm-svn: 197090
|
| |
|
|
|
|
|
|
|
| |
For one predicate to subsume another, they must both check the same condition
register. Failure to check this prerequisite was causing miscompiles.
Fixes PR18003.
llvm-svn: 197089
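The prerequisite can be illustrated with a minimal C++ sketch. The `Cond` enum, the `BranchPred` struct, and the `subsumes` function are hypothetical and are not the names used in the patch; the point is only that the register comparison has to come before any reasoning about the conditions themselves.

```cpp
// Hypothetical comparison outcomes recorded in a condition register.
enum class Cond { LT, LE, EQ, GE, GT, NE };

// Hypothetical predicate: one condition tested against one condition register.
struct BranchPred {
  Cond C;
  unsigned CondReg;
};

// True when every outcome that satisfies Inner also satisfies Outer
// (for example, EQ implies both LE and GE).
static bool condImplies(Cond Inner, Cond Outer) {
  if (Inner == Outer)
    return true;
  switch (Inner) {
  case Cond::LT: return Outer == Cond::LE || Outer == Cond::NE;
  case Cond::GT: return Outer == Cond::GE || Outer == Cond::NE;
  case Cond::EQ: return Outer == Cond::LE || Outer == Cond::GE;
  default:       return false;
  }
}

// Outer subsumes Inner only if both test the same condition register.
bool subsumes(const BranchPred &Outer, const BranchPred &Inner) {
  if (Outer.CondReg != Inner.CondReg)
    return false; // different registers: the two conditions are unrelated
  return condImplies(Inner.C, Outer.C);
}
```

Without the register check, a predicate on one register would be treated as implying a predicate on another, which is the kind of unsound conclusion that leads to a miscompile.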
|
| |
|
|
|
|
| |
use f32/f64 types, rather than their vector equivalents.
llvm-svn: 197068
|
| |
|
|
|
|
|
| |
floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.
llvm-svn: 197067
|
| |
|
|
|
|
|
|
| |
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.
llvm-svn: 197066
|
| |
|
|
| |
llvm-svn: 197064
|
| |
|
|
|
|
|
| |
This makes it a little easier to read.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197058
|
| |
|
|
|
|
|
| |
This enables -print-before-all to dump MachineInstrs after it is run.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197057
|
| |
|
|
|
|
|
| |
This enables -print-before-all to dump MachineInstrs after it is run.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197056
|
| |
|
|
| |
llvm-svn: 197052
|
| |
|
|
|
|
|
|
| |
The tests were no longer using fast-isel at all (MachO needs an "ios" rather
than "darwin" triple at the moment and Linux needs ARM mode). Once that was
corrected, the verifier complained about a t2ADDri created for the alloca.
llvm-svn: 197046
|
| |
|
|
|
|
|
|
|
| |
incompatible with GCC.
I moved a test from avx512-vbroadcast-crash.ll to avx512-vbroadcast.ll.
I defined the HasAVX512 predicate as an AssemblerPredicate, so llvm-mc must be invoked with "-mcpu=knl" to get the encoding for AVX-512 instructions. This lets AsmMatcher choose different encodings for AVX and AVX-512 instructions that have the same mnemonic and operands (all scalar instructions).
llvm-svn: 197041
|
| |
|
|
|
|
|
| |
In such cases it's often better to test the result of the negation instead,
since the negation also sets CC.
llvm-svn: 197032
|
| |
|
|
| |
llvm-svn: 196999
|
| |
|
|
| |
llvm-svn: 196998
|
| |
|
|
| |
llvm-svn: 196996
|
| |
|
|
| |
llvm-svn: 196990
|
| |
|
|
| |
llvm-svn: 196988
|
| |
|
|
| |
llvm-svn: 196987
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.
ASan inserts empty inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.
XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.
llvm-svn: 196986
|
| |
|
|
| |
llvm-svn: 196976
|
| |
|
|
| |
llvm-svn: 196971
|
| |
|
|
|
|
| |
.weak_def_can_be_hidden was not yet supported by the system assembler
llvm-svn: 196970
|
| |
|
|
|
|
| |
intrinsic to use f32/f64 types, rather than their vector equivalents.
llvm-svn: 196965
|
| |
|
|
|
|
|
|
| |
fixed-point LLVM AArch64 intrinsics to use f32/f64, rather than their vector
equivalents.
llvm-svn: 196964
|
| |
|
|
|
|
| |
and fixed-point convert to floating-point LLVM AArch64 intrinsics.
llvm-svn: 196963
|
| |
|
|
|
|
| |
LLVM AArch64 intrinsics.
llvm-svn: 196962
|
| |
|
|
|
|
|
|
|
|
|
| |
This re-lands commit r196876, which was reverted in r196879.
The tests have been fixed to pass on platforms with a stack alignment
larger than 4.
Updates to the clang-side tests will land shortly.
llvm-svn: 196939
|
| |
|
|
|
|
|
|
|
|
|
| |
Most users would be surprised if "isCOFF" and "isMachO" were simultaneously
true, unless they'd put the compiler in a box with a gun attached to a photon
detector.
This makes sure precisely one of the three formats is true for any triple and
simplifies some target logic based on that.
llvm-svn: 196934
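One way to guarantee that exactly one format predicate holds is to derive all of them from a single enum value. The sketch below is an assumption-laden illustration, not the code from this commit: `ObjectFormat`, `TripleSketch`, and `formatsTrue` are hypothetical names, and the third format is assumed to be ELF.

```cpp
// Hypothetical model: one enum value per object format, so the three
// predicates cannot disagree with each other.
enum class ObjectFormat { ELF, COFF, MachO };

struct TripleSketch {
  ObjectFormat Format;
  constexpr bool isELF() const { return Format == ObjectFormat::ELF; }
  constexpr bool isCOFF() const { return Format == ObjectFormat::COFF; }
  constexpr bool isMachO() const { return Format == ObjectFormat::MachO; }
};

// Counts how many of the three predicates hold for a given triple.
constexpr int formatsTrue(TripleSketch T) {
  return T.isELF() + T.isCOFF() + T.isMachO();
}

// Checked at compile time: exactly one predicate is true for each format.
static_assert(formatsTrue({ObjectFormat::ELF}) == 1, "exactly one format");
static_assert(formatsTrue({ObjectFormat::COFF}) == 1, "exactly one format");
static_assert(formatsTrue({ObjectFormat::MachO}) == 1, "exactly one format");
```

Target logic can then branch on one predicate and treat the remaining cases as mutually exclusive, without guarding against two formats being reported at once.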
|
| |
|
|
|
|
| |
that they use float/double rather than the vector equivalents when appropriate.
llvm-svn: 196930
|
| |
|
|
|
|
| |
Specifically, reuse the ARM intrinsics when possible.
llvm-svn: 196926
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediately after SSE scalar fp instructions like addss or mulss.
Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.
For example, given the following code:
__m128 foo(__m128 A, __m128 B) {
A[0] += B[0];
return A;
}
previously we generated:
addss %xmm0, %xmm1
movss %xmm1, %xmm0
now we generate:
addss %xmm1, %xmm0
llvm-svn: 196925
|
| |
|
|
| |
llvm-svn: 196923
|
| |
|
|
| |
llvm-svn: 196922
|
| |
|
|
|
|
|
|
| |
Save S2 (reg 18) only when we are calling floating point stubs that
have a return value of float or complex. More work is needed to improve
this, but this is the first step.
llvm-svn: 196921
|
| |
|
|
| |
llvm-svn: 196918
|
| |
|
|
| |
llvm-svn: 196914
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The result register of these instructions is also the first operand.
Reviewers: jacksprat, dsanders
Reviewed By: dsanders
Differential Revision: http://llvm-reviews.chandlerc.com/D2362
Differential Revision: http://llvm-reviews.chandlerc.com/D2363
llvm-svn: 196910
|