| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
We missed a handful of .mir tests that existed outside the
test/CodeGen/MIR directory.
Also fix the three powerpc .mir tests that nobody noticed were broken.
llvm-svn: 265350
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
eliminateCallFramePseudoInstr (PR27140)"
The original commit miscompiled things on 32-bit Windows, e.g. a Clang
boostrap. It turns out that mergeSPUpdates() was a bit too generous in
what it interpreted as a stack adjustment, causing the following code:
addl $12, %esp
leal -4(%ebp), %esp
To be "optimized" into simply:
addl $8, %esp
This commit tightens up mergeSPUpdates() and includes a new test
(test14 in movtopush.ll) for this situation.
llvm-svn: 265345
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds some dllexport tests to verify that:
- Variables in bss are exported appropriately
- Non-dllexport symbols aliased to dllexport symbols are not exported
- Symbols declared as dllexport but are not defined are not exported
We plan to enable dllimport/dllexport support for the PS4, and these
additional tests are for points we noticed in our internal testing.
Patch by Warren Ristow!
Differential Revision: http://reviews.llvm.org/D18682
llvm-svn: 265333
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.
This situation happens with calling conventions like preserver_mostcc or
cxx_fast_tls. It was explicitely handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.
Related to rdar://24207743
Differential Revision: http://reviews.llvm.org/D18680
llvm-svn: 265329
|
| |
|
|
|
|
| |
investigate.
llvm-svn: 265317
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.
Reviewers: qcolombet
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18525
llvm-svn: 265313
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
time issue. The patch is to solve PR17409 and its duplicates.
analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.
To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.
Differential Revision: http://reviews.llvm.org/D15302
llvm-svn: 265309
|
| |
|
|
|
|
|
|
|
|
|
| |
A cross-thread sequentially consistent fence should be lowered into
z/Architecture's BCR serialization instruction, instead of causing a
fatal error in the back-end.
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18644
llvm-svn: 265292
|
| |
|
|
|
|
|
|
|
|
|
| |
Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which
previously would cause the back-end to crash. Currently, only a
frame count of zero is supported.
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18514
llvm-svn: 265291
|
| |
|
|
|
|
|
|
|
| |
Implemented truncstore for KNL and skylake-avx512.
Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes.
Differential Revision: http://reviews.llvm.org/D18740
llvm-svn: 265283
|
| |
|
|
| |
llvm-svn: 265267
|
| |
|
|
|
|
|
|
|
|
| |
Implemented load+{sign|zero}_extend for i1 vectors
Fixed failures in i1 vector load.
Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX.
Differential Revision: http://reviews.llvm.org/D18737
llvm-svn: 265259
|
| |
|
|
|
|
|
| |
Commit r264245 was the reason for failing tests in LLVM test suite.
Commit r264248 depends on the first one.
llvm-svn: 265249
|
| |
|
|
|
|
| |
More examples of PR22603, poor vector splitting for AVX512F targets as well as missing uses of PACKSS/MOVMSK
llvm-svn: 265248
|
| |
|
|
| |
llvm-svn: 265247
|
| |
|
|
| |
llvm-svn: 265222
|
| |
|
|
|
|
|
|
| |
We were producing ORR, which actually defines a GPR32sp rather than a GPR32.
Should fix PR23209.
llvm-svn: 265198
|
| |
|
|
| |
llvm-svn: 265192
|
| |
|
|
|
|
| |
float2double
llvm-svn: 265186
|
| |
|
|
| |
llvm-svn: 265185
|
| |
|
|
| |
llvm-svn: 265184
|
| |
|
|
| |
llvm-svn: 265183
|
| |
|
|
| |
llvm-svn: 265179
|
| |
|
|
| |
llvm-svn: 265173
|
| |
|
|
|
|
|
|
|
| |
Was there really no other way to splat a byte in SSE2?
punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
pshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7]
pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1]
llvm-svn: 265172
|
| |
|
|
| |
llvm-svn: 265171
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.
32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Patch by: Vedran Miletić
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: jvesely, scchan, kanarayan, arsenm
Differential Revision: http://reviews.llvm.org/D17280
llvm-svn: 265170
|
| |
|
|
|
|
| |
Added SSE + AVX1 tests as well as AVX2
llvm-svn: 265169
|
| |
|
|
|
|
|
|
| |
Note however that this is identical to the existing SSE2 run.
What we really want is yet another run for an SSE2 machine that
also has fast unaligned 16-byte accesses.
llvm-svn: 265167
|
| |
|
|
| |
llvm-svn: 265164
|
| |
|
|
|
|
| |
Replaced lots of dodgy greps with actual codegen
llvm-svn: 265163
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 -
where we noticed that an intermediate splat was being generated for memsets of
non-zero chars.
That was because we told getMemsetStores() to use a 32-bit vector element type,
and it happily obliged by producing that constant using an integer multiply.
The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2
(no splats, just a vector load), but we have PR27141 to track that splat difference.
Note that the SSE1 path is not changed in this patch. That can be a follow-up.
This patch should resolve PR27100.
llvm-svn: 265161
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Follow-up to D18566 - where we noticed that an intermediate splat was being
generated for memsets of non-zero chars.
That was because we told getMemsetStores() to use a 32-bit vector element type,
and it happily obliged by producing that constant using an integer multiply.
The tests that were added in the last patch are now equivalent for AVX1 and AVX2
(no splats, just a vector load), but we have PR27141 to track that splat difference.
In the new tests, the splat via shuffling looks ok to me, but there might be some
room for improvement depending on uarch there.
Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up.
This patch should resolve PR27100.
Differential Revision: http://reviews.llvm.org/D18676
llvm-svn: 265148
|
| |
|
|
| |
llvm-svn: 265135
|
| |
|
|
|
|
|
|
| |
Add a new Intel MCU CPU Lakemont, which doesn't support X87.
Differential Revision: http://reviews.llvm.org/D18650
llvm-svn: 265128
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This changes some dllexport tests, to verify that some symbols that
should not be exported are not, in a way that improves the robustness
of CHECK-SAME interaction with CHECK-NOT.
We plan to enable dllimport/dllexport support for the PS4, and these
changes are for points we noticed in our internal testing.
Patch by Warren Ristow!
llvm-svn: 265106
|
| |
|
|
|
|
|
|
|
| |
Re-enable an assertion enabled by Justin Lebar in rL265092. rL265092
was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc
does not like non-i64 return types. Change the test case to not do
that.
llvm-svn: 265099
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, HandleLastUse would delete RegRef information for sub-registers
if they were dead even if their corresponding super-register were still live.
If the super-register were later renamed, then the definitions of the
sub-register would not be updated appropriately. This patch alters the
behavior so that RegInfo information for sub-registers is only deleted when
the sub-register and super-register are both dead.
This resolves PR26775. This is the mirror image of Hal's r227311 commit.
Author: Tom Jablin (tjablin)
Reviewers: kbarton uweigand nemanjai hfinkel
http://reviews.llvm.org/D18448
llvm-svn: 265097
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously the NVVMReflect pass would read its configuration from
command-line flags or a static configuration given to the pass at
instantiation time.
This doesn't quite work for clang's use-case. It needs to pass a value
for __CUDA_FTZ down on a per-module basis. We use a module flag for
this, so the NVVMReflect pass needs to be updated to read said flag.
Reviewers: tra, rnk
Subscribers: cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D18672
llvm-svn: 265090
|
| |
|
|
| |
llvm-svn: 265081
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder
into DICompileUnit. DIBuilder is not the right place for this enum to live
in — a metadata consumer should not have to include DIBuilder.h.
I also added a Verifier check that checks that the emission kind of a
DICompileUnit is actually legal.
http://reviews.llvm.org/D18612
<rdar://problem/25427165>
llvm-svn: 265077
|
| |
|
|
| |
llvm-svn: 265067
|
| |
|
|
|
|
|
|
|
|
| |
Print aliases in topological order, that is, for any alias a = b,
b must be printed before a. This is because on some targets (e.g. PowerPC)
linker expects aliases in such an order to generate correct TOC information.
GCC also prints aliases in topological order.
llvm-svn: 265064
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change will allow loads with imp-def to be clustered in machine-scheduler pass.
areMemAccessesTriviallyDisjoint() can also handle loads with imp-def.
Reviewers: mcrosier, jmolloy, t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18665
llvm-svn: 265051
|
| |
|
|
|
|
| |
Removing unnecessary attributes and metadata...
llvm-svn: 265049
|
| |
|
|
| |
llvm-svn: 265048
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Chapter 3 of the QPX manual states that, "Scalar floating-point load
instructions, defined in the Power ISA, cause a replication of the source data
across all elements of the target register." Thus, if we have a load followed
by a QPX splat (from the first lane), the splat is redundant. This adds a late
MI-level pass to remove the redundant splats in some of these cases
(specifically when both occur in the same basic block).
This optimization is scheduled just prior to post-RA scheduling. It can't happen
before anything that might replace the load with some already-computed quantity
(i.e. store-to-load forwarding).
llvm-svn: 265047
|
| |
|
|
|
|
|
|
|
|
| |
eliminateCallFramePseudoInstr (PR27140)"
I think it might have caused these build breakages:
http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/7234/steps/build%20stage%202/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19566/steps/run%20tests/logs/stdio
llvm-svn: 265046
|
| |
|
|
|
|
| |
We don't really support non-constant shuffle masks, but these tests are for cases where BUILD_VECTOR is made up from vector extracts (as well as undef/zero scalars).
llvm-svn: 265045
|
| |
|
|
|
|
|
| |
The default is legal, which results in 'Cannot select' errors. This is
triggered during selfhost due to a recent cost model change.
llvm-svn: 265040
|