| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 240835
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10708
llvm-svn: 240832
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10757
llvm-svn: 240831
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10706
llvm-svn: 240830
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This way the function symbol points to the start of amd_kernel_code_t
rather than the start of the function.
Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10705
llvm-svn: 240829
|
| |
|
|
|
|
|
|
|
|
| |
If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate.
Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code. The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it. This is intentional and important because we want the inline cost analysis results to be easily cachable themselves. We're not currently doing so, but initial results on LTO indicate this will quickly become important.
Differential Revision: http://reviews.llvm.org/D9129
llvm-svn: 240828
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If pseudoToMCOpcode failed, we would return the original opcode, so operands
would be swapped, but the instruction would remain the same.
It resulted in LSHLREV a, b ---> LSHLREV b, a.
This fixes Glamor text rendering and
piglit/arb_sample_shading-builtin-gl-sample-mask on VI.
This is a candidate for stable branches.
v2: the test was simplified by Tom Stellard
llvm-svn: 240824
|
| |
|
|
|
|
| |
This uses the new SDNode::op_values() iterator range committed in r240805.
llvm-svn: 240822
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D10638
This is the back end portion of patch
http://reviews.llvm.org/D10637
It just adds the code gen and intrinsic functions necessary to support that patch to the back end.
llvm-svn: 240820
|
| |
|
|
|
|
|
| |
The body of the loops here only contained asserts. This triggered an unused variable
warning on release builds and -Werror on the bots.
llvm-svn: 240819
|
| |
|
|
|
|
| |
This uses the new SDNode::op_values() iterator range committed in r240805.
llvm-svn: 240817
|
| |
|
|
|
|
| |
This uses the new SDNode::op_values() iterator range committed in r240805.
llvm-svn: 240815
|
| |
|
|
| |
llvm-svn: 240813
|
| |
|
|
|
|
|
|
|
|
|
| |
X86WindowsTargetObjectFile::getSectionForConstant""
This reverts commit r240793 while fixing how we handle array constant
pool entries.
This fixes PR23966.
llvm-svn: 240811
|
| |
|
|
|
|
| |
This uses the new SDNode::op_values() iterator range committed in r240805.
llvm-svn: 240809
|
| |
|
|
|
|
|
|
|
|
| |
SDNode already had ops() which would iterate over the operands and return
SDUse*. This version instead gets the SDValue's out of the SDUse's so that
we can use foreach in more places.
Reviewed by David Blaikie.
llvm-svn: 240805
|
| |
|
|
| |
llvm-svn: 240804
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the error in ARM.td which stated that Cortex-R5
floating point unit can do only single precision, when it can do double as well.
Reviewers: rengolin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10769
llvm-svn: 240799
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Scalar evolution does not propagate the non-wrapping flags to values
that are derived from a non-wrapping induction variable because
the non-wrapping property could be flow-sensitive.
This change is a first attempt to establish the non-wrapping property in
some simple cases. The main idea is to look through the operations
defining the pointer. As long as we arrive to a non-wrapping AddRec via
a small chain of non-wrapping instruction, the pointer should not wrap
either.
I believe that this essentially is what Andy described in
http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way
forward.
Reviewers: aschwaighofer, nadav, sanjoy, atrick
Reviewed By: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10472
llvm-svn: 240798
|
| |
|
|
|
|
|
| |
The variable 'I' wasn't used when assertions were disabled.
This commit ensures that 'I' is used outside of an assert.
llvm-svn: 240797
|
| |
|
|
|
|
| |
Fixes a miscompilation of MultiSource/Benchmarks/MallocBench/gs
llvm-svn: 240796
|
| |
|
|
|
|
|
|
| |
Patch by Matthew Barney. Thanks!
Differential Revision: http://reviews.llvm.org/D9514
llvm-svn: 240795
|
| |
|
|
|
|
|
|
|
|
| |
VectorUtils.h non-static and defined out of line
Patch by Ashutosh Nema
Differential Revision: http://reviews.llvm.org/D10682
llvm-svn: 240794
|
| |
|
|
|
|
| |
It seems to have caused PR23966: "UNREACHABLE executed at ..\lib\Target\X86\X86TargetObjectFile.cpp:148"
llvm-svn: 240793
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit serializes machine basic block operands. The
machine basic block operands use the following syntax:
%bb.<id>[.<name>]
This commit also modifies the YAML representation for the
machine basic blocks - a new, required field 'id' is added
to the MBB YAML mapping.
The id is used to resolve the MBB references to the
actual MBBs. And while the name of the MBB can be
included in a MBB reference, this name isn't used to
resolve MBB references - as it's possible that multiple
MBBs will reference the same BB and thus they will have the
same name. If the name is specified, the parser will verify
that it is equal to the name of the MBB with the specified id.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10608
llvm-svn: 240792
|
| |
|
|
|
|
| |
Allows more aggressive folding of ashr/shl pairs.
llvm-svn: 240788
|
| |
|
|
|
|
|
| |
Instcombine also does this but many opportunities only become visible
after GEPs are lowered.
llvm-svn: 240787
|
| |
|
|
| |
llvm-svn: 240785
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This only adds support for ULW of an immediate address with/without a source register.
It does not include support for ULW of the address of a symbol.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9663
llvm-svn: 240782
|
| |
|
|
| |
llvm-svn: 240779
|
| |
|
|
| |
llvm-svn: 240778
|
| |
|
|
|
|
|
|
| |
This is still a really odd function. Most calls are in object format specific
contexts and should probably be replaced with a more direct query, but at least
now this is not too obnoxious to use.
llvm-svn: 240777
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Cortex-R4F TRM states that fpu supports both single and double precision.
This patch corrects the information in ARM.td file and corresponding test.
Reviewers: rengolin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10763
llvm-svn: 240776
|
| |
|
|
|
|
| |
No other format has this field.
llvm-svn: 240774
|
| |
|
|
|
|
| |
No need to create two symbols just to assign one to the other.
llvm-svn: 240773
|
| |
|
|
|
|
| |
No functionality changed, just keeping things clean.
llvm-svn: 240762
|
| |
|
|
|
|
| |
windows.
llvm-svn: 240760
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch also adds a function to calculate the cost of interleaved memory accesses.
E.g. Lower an interleaved load:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4
%v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
%v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
into:
%vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4)
%vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0
%vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1
E.g. Lower an interleaved store:
%i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4
into:
%sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
%sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
%sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4)
Differential Revision: http://reviews.llvm.org/D10533
llvm-svn: 240755
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
patch also adds a function to calculate the cost of interleaved memory accesses.
E.g. Lower an interleaved load:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
%v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
into:
%ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr)
%vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
%vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1
E.g. Lower an interleaved store:
%i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
store <12 x i32> %i.vec, <12 x i32>* %ptr
into:
%sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
%sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
%sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)
Differential Revision: http://reviews.llvm.org/D10533
llvm-svn: 240754
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
memory accesses and transform into target specific intrinsics.
E.g. An interleaved load (Factor = 2):
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%v0 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <0, 2, 4, 6>
%v1 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <1, 3, 5, 7>
It can be transformed into a ld2 intrinsic in AArch64 backend or a vld2 intrinsic in ARM backend.
E.g. An interleaved store (Factor = 3):
%i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
store <12 x i32> %i.vec, <12 x i32>* %ptr
It can be transformed into a st3 intrinsic in AArch64 backend or a vst3 intrinsic in ARM backend.
Differential Revision: http://reviews.llvm.org/D10533
llvm-svn: 240751
|
| |
|
|
|
|
|
|
| |
Revert until http://llvm.org/PR23955 is investigated.
This reverts commit r239309.
llvm-svn: 240746
|
| |
|
|
|
|
|
|
| |
It can be more robust than copying debug info from first non-alloca
instruction in the entry basic block. We use the same strategy in
coverage instrumentation.
llvm-svn: 240738
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace the `std::vector<>` for `DIE::Children` with an intrusively
linked list. This is a strict memory improvement: it requires no
auxiliary storage, and reduces `sizeof(DIE)` by one pointer. It also
factors out the DIE-related malloc traffic.
This drops llc memory usage from 735 MB down to 718 MB, or ~2.3%.
(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)
llvm-svn: 240736
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change `DIE::Values` to a singly linked list, where each node is
allocated on a `BumpPtrAllocator`. In order to support `push_back()`,
the list is circular, and points at the tail element instead of the
head. I abstracted the core list logic out to `IntrusiveBackList` so
that it can be reused for `DIE::Children`, which also cares about
`push_back()`.
This drops llc memory usage from 799 MB down to 735 MB, about 8%.
(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)
llvm-svn: 240733
|
| |
|
|
| |
llvm-svn: 240727
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not instrument globals that are placed in sections containing "__llvm"
in their name.
This fixes a bug in ASan / PGO interoperability. ASan interferes with LLVM's
PGO, which places its globals into a special section, which is memcpy-ed by
the linker as a whole. When those goals are instrumented, ASan's memcpy wrapper
reports an issue.
http://reviews.llvm.org/D10541
llvm-svn: 240723
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It makes LLVM run out of registers even on 64-bit platforms. For example, the
following test case fails on darwin.
clang -cc1 -O0 -triple x86_64-apple-macosx10.10.0 -emit-obj -fsanitize=address -mstackrealign -o ~/tmp/ex.o -x c ex.c
error: inline assembly requires more registers than available
void TestInlineAssembly(const unsigned char *S, unsigned int pS, unsigned char *D, unsigned int pD, unsigned int h) {
unsigned int sr = 4, pDiffD = pD - 5;
unsigned int pDiffS = (pS << 1) - 5;
char flagSA = ((pS & 15) == 0),
flagDA = ((pD & 15) == 0);
asm volatile (
"mov %0, %%"PTR_REG("si")"\n"
"mov %2, %%"PTR_REG("cx")"\n"
"mov %1, %%"PTR_REG("di")"\n"
"mov %8, %%"PTR_REG("ax")"\n"
:
: "m" (S), "m" (D), "m" (pS), "m" (pDiffS), "m" (pDiffD), "m" (sr), "m" (flagSA), "m" (flagDA), "m" (h)
: "%"PTR_REG("si"), "%"PTR_REG("di"), "%"PTR_REG("ax"), "%"PTR_REG("cx"), "%"PTR_REG("dx"), "memory"
);
}
http://reviews.llvm.org/D10719
llvm-svn: 240722
|
| |
|
|
| |
llvm-svn: 240709
|
| |
|
|
|
|
|
| |
This allows user code to say Sym.getSize() instead of having to manually fetch
the object.
llvm-svn: 240708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
r224810 fixed the handling of macro debug locations in AsmParser. This patch
fixes the logic to actually do what was intended: it uses the first macro of
the macro stack instead of the last one. The updated testcase shows that the
current scheme doesn't work when macro instanciations are nested and multiple
files are used.
Reviewers: compnerd
Differential Revision: http://reviews.llvm.org/D10463
llvm-svn: 240705
|