| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
Add support for TLS access for Windows on ARM. This generates a similar access
to MSVC for ARM.
The changes to the tablegen data is needed to support loading an external symbol
global that is not for a call. The adjustments to the DAG to DAG transforms are
needed to preserve the 32-bit move.
llvm-svn: 259676
|
| |
|
|
| |
llvm-svn: 259675
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to git bisect, this is the root cause of a miscompile for Regex in
libLLVMSupport. I am still working on reducing a test case.
The actual bug may be elsewhere and this commit just exposed it.
Anyway, at the moment, to reproduce, follow these steps:
1. Build clang and libLTO in release mode.
2. Create a new build directory <stage2> and cd into it.
3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts
using -O2 -flto
4. Run llvm-extract -ralias '.*bar' -S test/Other/extract-alias.ll
Result:
program doesn't contain global named '.*bar'!
Expected result:
@a0a0bar = alias void ()* @bar
@a0bar = alias void ()* @bar
declare void @bar()
Note: In step #3, if you don't use lto or asserts, the miscompile disappears.
llvm-svn: 259674
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recommited, after some fixing with test cases.
Updated test cases:
test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
test/CodeGen/AArch64/tailcall_misched_graph.ll
Temporarily disabled test cases:
test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll
test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated)
test/CodeGen/PowerPC/vsx-fma-m.ll
test/CodeGen/PowerPC/vsx-fma-sp.ll
http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.
llvm-svn: 259673
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.
This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.
Differential Revision: http://reviews.llvm.org/D12090
llvm-svn: 259662
|
| |
|
|
|
|
|
|
|
|
| |
The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores
which happens to be a lot faster than __divsi3/__modsi3 when the core
has hardware divide instructions. Do the same here.
Fixes PR26450.
llvm-svn: 259657
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: MatzeB, qcolombet, jmolloy, mcrosier
Subscribers: llvm-commits, mcrosier
Differential Revision: http://reviews.llvm.org/D16806
llvm-svn: 259656
|
| |
|
|
| |
llvm-svn: 259655
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This regresses a test in LoopVectorize, so I'll need to go away and think about how to solve this in a way that isn't broken.
From the writeup in PR26071:
What's happening is that ComputeKnownZeroes is telling us that all bits except the LSB are zero. We're then deciding that only the LSB needs to be demanded from the icmp's inputs.
This is where we're wrong - we're assuming that after simplification the bits that were known zero will continue to be known zero. But they're not - during trivialization the upper bits get changed (because an XOR isn't shrunk), so the icmp fails.
The fault is in demandedbits - its contract does clearly state that a non-demanded bit may either be zero or one.
llvm-svn: 259649
|
| |
|
|
|
|
| |
Simple fix - Constant values were not being sign extended in FastIsel.
llvm-svn: 259645
|
| |
|
|
|
|
|
|
|
|
| |
MIPS ABI states that .sbss and .sdata sections must have SHF_MIPS_GPREL
flag. See Figure 4–7 on page 69 in the following document:
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf.
Differential Revision: http://reviews.llvm.org/D15740
llvm-svn: 259641
|
| |
|
|
|
|
|
|
|
|
|
|
| |
EltsFromConsecutiveLoads
Follow up to D16217 and D16729
This change uncovered an odd pattern where VZEXT_LOAD v4i64 was being lowered to a load of the lower v2i64 (so the 2nd i64 destination element wasn't being zeroed), I can't find any use/reason for this and have removed the pattern and replaced it so only the 1st i64 element is loaded and the upper bits all zeroed. This matches the description for X86ISD::VZEXT_LOAD
Differential Revision: http://reviews.llvm.org/D16768
llvm-svn: 259635
|
| |
|
|
| |
llvm-svn: 259631
|
| |
|
|
| |
llvm-svn: 259630
|
| |
|
|
|
|
|
|
|
|
| |
With this patch, the profile summary data will be available in indexed
profile data file so that profiler reader/compiler optimizer can start
to make use of.
Differential Revision: http://reviews.llvm.org/D16258
llvm-svn: 259626
|
| |
|
|
|
|
| |
is unused.
llvm-svn: 259625
|
| |
|
|
| |
llvm-svn: 259623
|
| |
|
|
| |
llvm-svn: 259621
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of PPCVSXFMAMutate is to elide copies by changing FMA forms
on PPC.
%vreg6<def> = COPY %vreg96
%vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg7
;v6 = v6 + v5 * v7
is replaced by
%vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg7, %vreg96
;v5 = v5 * v7 + v96
This was broken in the case where the target register was also used as a
multiplicand. Fix this case by checking for it and replacing both uses
with the copied register.
%vreg6<def> = COPY %vreg96
%vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg6
;v6 = v6 + v5 * v6
is replaced by
%vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg96, %vreg96
;v5 = v5 * v96 + v96
llvm-svn: 259617
|
| |
|
|
|
|
| |
Will re-implement based on review feedback.
llvm-svn: 259615
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The register coalescer can rematerialize constants that define
more of a register than the copy it is going to replace was going
to do.
This is valid in the case the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.
Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose as
pass to allow it to be unit-tested.
I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.
(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)
The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.
Reviewers: hfinkel, ashutosh.nema, sbaranga
Subscribers: zzheng, mssimpso, llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D16612
llvm-svn: 259610
|
| |
|
|
| |
llvm-svn: 259602
|
| |
|
|
|
|
|
| |
Strictly speaking, this is not an improvement in functionality per se
but a usability improvement to those debugging codeview.
llvm-svn: 259601
|
| |
|
|
| |
llvm-svn: 259600
|
| |
|
|
| |
llvm-svn: 259599
|
| |
|
|
|
|
|
|
|
| |
Please see include/llvm/Transforms/Utils/MemorySSA.h for a description
of MemorySSA, and what it does.
Differential Revision: http://reviews.llvm.org/D7864
llvm-svn: 259595
|
| |
|
|
|
|
| |
Due to staleness in a patch I committed yesterday, the debug output was reporting overdefined cases as being undefined. Confusing to say the least. The mistake appears to have only effected the debug output thankfully.
llvm-svn: 259594
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15625
llvm-svn: 259586
|
| |
|
|
|
|
| |
I introduced a declaration in 259583 to keep the diff readable. This change just moves the definition up to remove the declaration again.
llvm-svn: 259585
|
| |
|
|
|
|
|
|
| |
This patch uses the newly introduced 'intersect' utility (from 259461: [LVI] Introduce an intersect operation on lattice values) to simplify existing code in LVI.
While not introducing any new concepts, this change is probably not NFC. The common 'intersect' function is more powerful that the ad-hoc implementations we'd had in a couple of places. Given that, we may see optimizations triggering a bit more often.
llvm-svn: 259583
|
| |
|
|
|
|
|
|
| |
See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.
Original patch by: Sean Silva
llvm-svn: 259576
|
| |
|
|
|
|
|
|
| |
If we can't assume the pointer value isn't within the bounds
of the object, it seems risky to try to replace the pointer
calculations.
llvm-svn: 259573
|
| |
|
|
|
|
| |
Also add missing tests for the others.
llvm-svn: 259558
|
| |
|
|
|
|
|
|
|
| |
When the merging was involving LEAs, we were taking the wrong immediate
from the list of operands.
rdar://problem/24446069
llvm-svn: 259553
|
| |
|
|
| |
llvm-svn: 259552
|
| |
|
|
| |
llvm-svn: 259551
|
| |
|
|
|
|
| |
Mostly convert to use range loops.
llvm-svn: 259550
|
| |
|
|
|
|
|
|
|
| |
GenericIndirectStubsInfo.
This will allow architecture support classes for other architectures to re-use
this code.
llvm-svn: 259549
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
CodeView requires us to accurately describe the extent of the inlined
code. We did this by grabbing the next debug location in source order
and using *that* to denote where we stopped inlining. However, this is
not sufficient or correct in instances where there is no next debug
location or the next debug location belongs to the start of another
function.
To get this correct, use the end symbol of the function to denote the
last possible place the inlining could have stopped at.
llvm-svn: 259548
|
| |
|
|
| |
llvm-svn: 259547
|
| |
|
|
|
|
|
| |
We shouldn't crash on unhandled intrinsics.
Also simplify failure handling in loop.
llvm-svn: 259546
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When promoting allocas to LDS, we know we are indexing
into a specific area just created, and the calculation
will also never overflow.
Also emit some of the muls as nsw nuw, because instcombine
infers this already from the range metadata. I think
putting this on the other adds and muls might be OK too,
but I'm not 100% sure.
llvm-svn: 259545
|
| |
|
|
|
|
| |
Differential revision: http://reviews.llvm.org/D16793
llvm-svn: 259539
|
| |
|
|
|
|
|
|
| |
This directive emits the binary annotations that describe line and code
deltas in inlined call sites. Single-stepping through inlined frames in
windbg now works.
llvm-svn: 259535
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Enables eip-based addressing, e.g.,
lea constant(%eip), %rax
lea constant(%eip), %eax
in MC, (used for the x32 ABI). EIP-base addressing is also valid in x86_64,
it is left enabled for that architecture as well.
Patch by João Porto
Differential Revision: http://reviews.llvm.org/D16581
llvm-svn: 259528
|
| |
|
|
| |
llvm-svn: 259515
|
| |
|
|
| |
llvm-svn: 259510
|
| |
|
|
|
|
| |
The 3 programs used __attribute__((mode(?))) on enum, which clang r259497 fixed.
llvm-svn: 259508
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-commit of r258951 after fixing layering violation.
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.
There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.
llvm-svn: 259498
|