| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
| |
D35067/rL308322 attempted to support up to 4 load pairs for memcmp inlining which resulted in regressions for some optimized libc memcmp implementations (PR33914).
Until we can match these more optimal cases, this patch reduces the memcmp expansion to a maximum of 2 load pairs (which matches what we do for -Os).
This patch should be considered for the 5.0.0 release branch as well
Differential Revision: https://reviews.llvm.org/D35830
llvm-svn: 308986
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some SPARC TLS relocations were applying nontrivial adjustments
to zero value, leading to unexpected non-zero values in ELF and then
Solaris linker failures.
Getting rid of these adjustments.
Fixes PR33825.
Reviewers: rafael, asb, jyknight
Subscribers: joerg, jyknight, llvm-commits
Differential Revision: https://reviews.llvm.org/D35567
llvm-svn: 308978
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D35115
llvm-svn: 308972
|
| |
|
|
|
|
|
|
|
|
| |
style inline assembly statements.
Differential Revision:
https://reviews.llvm.org/D33277
https://reviews.llvm.org/D33278
llvm-svn: 308966
|
| |
|
|
|
|
|
|
|
|
| |
Enable runtime and partial loop unrolling of simple loops without
calls on M-class cores. The thresholds are calculated based on
whether the target is Thumb or Thumb-2.
Differential Revision: https://reviews.llvm.org/D34619
llvm-svn: 308956
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create a dummy 8 byte fixed object for the unused slot below the first
stored vararg.
Alternative ideas tested but skipped: One could try to align the whole
fixed object to 16, but I haven't found how to add an offset to the stack
frame used in LowerWin64_VASTART.
If only the size of the fixed stack object size is padded but not the offset, via
MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false),
PrologEpilogInserter crashes due to "Attempted to reset backwards range!".
This fixes misconceptions about where registers are spilled, since
AArch64FrameLowering.cpp assumes the offset from fixed objects is
aligned to 16 bytes (and the Win64 case there already manually aligns
the offset to 16 bytes).
This fixes cases where local stack allocations could overwrite callee
saved registers on the stack.
Differential Revision: https://reviews.llvm.org/D35720
llvm-svn: 308950
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
register when the two collides"
This reverts r308867 and r308866.
It broke the sanitizer-windows buildbot on C++ code similar to the
following:
namespace cl { }
void f() {
__asm {
mov al, cl
}
}
t.cpp(4,13): error: unexpected namespace name 'cl': expected expression
mov al, cl
^
In this case, MSVC parses 'cl' as a register, not a namespace.
llvm-svn: 308926
|
| |
|
|
| |
llvm-svn: 308914
|
| |
|
|
|
|
| |
Fine tune the resources in a couple of ASIMD loads.
llvm-svn: 308904
|
| |
|
|
|
|
|
|
| |
There's no need for these to be part of a class since
they are immediately replaced. New unreachable hit in
existing tests.'
llvm-svn: 308903
|
| |
|
|
|
|
|
|
|
|
|
| |
instructions that were missing.
patterns were missed by D33188. Adding for completion.
+Updating test.
Differential Revesion: https://reviews.llvm.org/D35179
llvm-svn: 308868
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the two collides
On MS-style, the following snippet:
int eax;
__asm mov eax, ebx
should yield loading of ebx, into the location pointed by the variable eax
This patch sees to it.
Currently, a reg-to-reg move would have been invoked.
clang: D34740
Differential Revision: https://reviews.llvm.org/D34739
llvm-svn: 308866
|
| |
|
|
|
|
|
|
| |
I have a much better way of running integration tests now.
https://github.com/dylanmckay/avr-test-suite
llvm-svn: 308857
|
| |
|
|
|
|
|
|
|
|
| |
optimization
Patch by Roland McGrath
Differential Revision: https://reviews.llvm.org/D35748
llvm-svn: 308854
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne
and we know the result of the compare instruction is zero. For example,
BB#0:
subs w0, w1, w2
str w0, [x1]
b.ne .LBB0_2
BB#1:
mov w0, wzr ; <-- redundant
str w0, [x2]
.LBB0_2
Differential Revision: https://reviews.llvm.org/D35075
llvm-svn: 308849
|
| |
|
|
| |
llvm-svn: 308835
|
| |
|
|
|
|
|
|
| |
complexity adjustment to keep shift by immediate using the legacy instructions.
These patterns were only missing to favor using the legacy instructions when the shift was a constant. With careful adjustment of the pattern complexity we can make sure the immediate instructions still have priority over these patterns.
llvm-svn: 308834
|
| |
|
|
|
|
| |
compatibility.
llvm-svn: 308818
|
| |
|
|
|
|
| |
Fixes PR32805.
llvm-svn: 308817
|
| |
|
|
|
|
|
| |
All of the instructions were moved out of this a while ago,
so it's just a useless comment now.
llvm-svn: 308815
|
| |
|
|
|
|
|
|
| |
Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned.
Differential Revision: https://reviews.llvm.org/D35707
llvm-svn: 308799
|
| |
|
|
|
|
|
|
| |
Reviewers: DavidKreitzer
Differential Revision: https://reviews.llvm.org/D35638
llvm-svn: 308784
|
| |
|
|
| |
llvm-svn: 308781
|
| |
|
|
|
|
|
|
|
|
| |
MIR SRADI uses instruction template XSForm_1rc which declares Defs = [CARRY]. But MIR SRADI_32 uses instruction template XSForm_1, and it doesn't declare such implicit definition. With patch D33720 it causes wrong code generation for perl.
This patch adds the implicit definition.
Differential Revision: https://reviews.llvm.org/D35699
llvm-svn: 308780
|
| |
|
|
|
|
| |
Testing is in the follow up change
llvm-svn: 308779
|
| |
|
|
|
|
|
|
|
|
| |
Fixes verifier errors in some call tests.
Not sure why we haven't run into this before.
Test split into separate patch for once
call support is committed.
llvm-svn: 308774
|
| |
|
|
|
|
|
| |
There are 2 more places doing this, but I'm not sure
what they are doing and don't make any sense to me
llvm-svn: 308770
|
| |
|
|
| |
llvm-svn: 308766
|
| |
|
|
| |
llvm-svn: 308762
|
| |
|
|
|
|
|
| |
For example
asm ("memw(%0++%1) = %2" : : "r"(addr),"a"(mod),"r"(val) : "memory")
llvm-svn: 308761
|
| |
|
|
|
|
|
|
|
|
|
|
| |
-membedded-data changes the location of constant data from the .sdata to
the .rodata section. Previously it was (incorrectly) always located in the
.rodata section.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D35686
llvm-svn: 308758
|
| |
|
|
|
|
| |
Omit atomics for now since they probably aren't useful.
llvm-svn: 308747
|
| |
|
|
|
|
|
|
|
| |
Follow up to r306280 in Clang.
Enable IAS by default for Android MIPS64 (uses N64 ABI).
Differential Revision: https://reviews.llvm.org/D35482
llvm-svn: 308742
|
| |
|
|
|
|
|
|
|
|
| |
See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591
Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm
Differential Revision: https://reviews.llvm.org/D35424
llvm-svn: 308740
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.
In order to achieve this, the following common code changes were made:
* New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
LSR should do instruction-based addressing evaluations by calling
isLegalAddressingMode() with the Instruction pointers.
* In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
not just loads or stores.
SystemZ changes:
* isLSRCostLess() implemented with Insns first, and without ImmCost.
* New function supportedAddressingMode() that is a helper for TTI methods
looking at Instructions passed via pointers.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049
llvm-svn: 308729
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Currently we only support (i32 bitcast(v32i1)) using the AVX2 VPMOVMSKB ymm instruction.
This patch adds support for splitting pre-AVX2 targets into 2 x (V)PMOVMSKB xmm instructions and merging the integer results.
In future we could probably generalize this to handle more cases.
Differential Revision: https://reviews.llvm.org/D35303
llvm-svn: 308723
|
| |
|
|
|
|
|
|
| |
movntdqa instruction.
The bitconverts here had an input type of 128-bits and an output type of 256 bits. The input type should also have been 256 bits.
llvm-svn: 308702
|
| |
|
|
|
|
|
| |
Add the cost for the EXT instructions and explicitly add the cost for a few
instructions that were implied by the coarse model.
llvm-svn: 308697
|
| |
|
|
|
|
|
|
| |
It revealed a bug in the Localizer pass which has now been fixed.
This includes the fix for SUBREG_TO_REG committed separately last time.
llvm-svn: 308688
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch adds support of i128 params lowering. The changes are quite trivial to
support i128 as a "special case" of integer type. With this patch, we lower i128
params the same way as aggregates of size 16 bytes: .param .b8 _ [16].
Currently, NVPTX can't deal with the 128 bit integers:
* in some cases because of failed assertions like
ValVTs.size() == OutVals.size() && "Bad return value decomposition"
* in other cases emitting PTX with .i128 or .u128 types (which are not valid [1])
[1] http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#fundamental-types
Differential Revision: https://reviews.llvm.org/D34555
Patch by: Denys Zariaiev (denys.zariaiev@gmail.com)
llvm-svn: 308675
|
| |
|
|
|
|
|
|
|
|
|
| |
Move the _RTN to the end of the name. It reads
better if the other addressing mode components
line up with the non-RTN version. It is also
more convenient to define saddr variants of
FLAT atomics to have the RTN last, and it is
good to have a consistent naming scheme.
llvm-svn: 308674
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.
This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.
This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.
Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.
llvm-svn: 308673
|
| |
|
|
| |
llvm-svn: 308671
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: compnerd, ruiu, rnk, zturner
Reviewed By: rnk
Subscribers: majnemer, aemerson, aprantl, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35518
llvm-svn: 308665
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Also enable no-fsmuld for sparcv7 (which doesn't have the
instruction).
The previous code which used a post-processing pass to do this was
unnecessary; disabling the instruction is entirely sufficient.
Reviewers: jacob_hansen, ekedaigle
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35576
llvm-svn: 308661
|
| |
|
|
|
|
|
| |
This should eliminate most uses of countPopulation and Log2_32 on
the lane mask values.
llvm-svn: 308658
|
| |
|
|
|
|
| |
optimization for the 64-bit memory shifts.
llvm-svn: 308657
|
| |
|
|
| |
llvm-svn: 308655
|
| |
|
|
| |
llvm-svn: 308639
|
| |
|
|
| |
llvm-svn: 308638
|