| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
This is a fix for https://llvm.org/bugs/show_bug.cgi?id=29010
The root cause of the bug is that the register class of a machine instruction operand does not fully reflect which registers can actually be allocated.
For both i386 and x86_64 the operand's register class is VR128RegClass and thus contains xmm0-xmm15, though on i386 we can only use xmm0-xmm7.
In order to get the actual allocable registers of the class we need to use RegisterClassInfo.
Differential Revision: https://reviews.llvm.org/D23613
llvm-svn: 278954
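
The distinction between a register class and its allocable subset can be sketched with a toy model. This is not LLVM's API: `ToyRegClass`, `makeToyVR128`, and `allocableRegs` are hypothetical names, and the sketch assumes that 32-bit mode exposes xmm0 to xmm7 while 64-bit mode exposes all sixteen registers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model, not LLVM's API: a register class lists every register that is
// valid for the operand, regardless of the current target mode.
struct ToyRegClass {
    std::vector<std::string> regs;
};

// Hypothetical helper: builds a VR128-like class holding xmm0..xmm15.
ToyRegClass makeToyVR128() {
    ToyRegClass rc;
    for (int i = 0; i < 16; ++i)
        rc.regs.push_back("xmm" + std::to_string(i));
    return rc;
}

// Hypothetical stand-in for the filtering that RegisterClassInfo provides:
// returns only the registers that may be allocated in the current mode.
// Assumption: 32-bit mode can use the first 8 registers, 64-bit mode all 16.
std::vector<std::string> allocableRegs(const ToyRegClass &rc, bool is64Bit) {
    const std::size_t limit = is64Bit ? 16 : 8;
    std::vector<std::string> out;
    for (std::size_t i = 0; i < rc.regs.size() && i < limit; ++i)
        out.push_back(rc.regs[i]);
    return out;
}
```

Code that sizes its decisions from `rc.regs` alone would overcount on the 32-bit target, which is the kind of mismatch the fix describes.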
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactored so that a LSRUse owns its fixups, as opposed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.
New target hook isFoldableMemAccessOffset(), which is used during formula
rating.
For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.
Updated tests:
test/CodeGen/SystemZ/loop-01.ll
Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152
llvm-svn: 278927
|
| |
|
|
|
|
|
| |
Replacing the usage of MVT with EVT in case the vector type is expanded.
Differential Revision: https://reviews.llvm.org/D23306
llvm-svn: 278913
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a quick workaround, because in some cases, e.g. caller's stack
size > callee's stack size, we are still able to apply sibling call
optimization even if the callee has a byval arg.
This patch fixes: https://llvm.org/bugs/show_bug.cgi?id=28328
Reviewers: hfinkel kbarton nemanjai amehsan
Subscribers: hans, tjablin
https://reviews.llvm.org/D23441
llvm-svn: 278900
|
| |
|
|
|
|
|
|
|
|
|
|
| |
If AnalyzeBranch can't analyze a block and it is possible to
fallthrough, then duplicating the block doesn't make sense, as only one
block can be the layout predecessor for the un-analyzable fallthrough.
Submitted with a test case, but NOTE: the test case doesn't currently
fail. However, the test case fails with D20505 and would have saved me
some time debugging.
llvm-svn: 278866
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
for code size savings, for 64-bit constants.
This patch handles 64-bit constants which can be encoded as 32-bit immediates.
It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants.
Patch by Sunita Marathe!
Differential Revision: https://reviews.llvm.org/D23391
llvm-svn: 278857
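
A minimal sketch of the kind of range check this implies, assuming the 32-bit immediate is sign-extended to 64 bits when used (the usual x86-64 convention). The function name is hypothetical and is not taken from the patch.

```cpp
#include <cstdint>
#include <limits>

// Returns true when a 64-bit constant can be represented as a 32-bit
// immediate that is sign-extended back to 64 bits.
bool fitsInSignExtendedImm32(std::int64_t value) {
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}
```

Under this assumption 0xFFFFFFFF does not qualify, because sign extension of its low 32 bits would produce -1 rather than the original value.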
|
| |
|
|
|
|
| |
Rename the operands to make the test less brittle.
llvm-svn: 278841
|
| |
|
|
|
|
|
|
|
|
| |
Do not reorder and move up a loop latch block before a loop header
when optimising for size because this will generate an extra
unconditional branch.
Differential Revision: https://reviews.llvm.org/D22521
llvm-svn: 278840
|
| |
|
|
|
|
|
|
|
|
|
| |
Check both operands for use of the $zero register which cannot be used with
a compact branch instruction.
Reviewers: dsanders, vkalintris
Differential Review: https://reviews.llvm.org/D23547
llvm-svn: 278824
|
| |
|
|
| |
llvm-svn: 278821
|
| |
|
|
| |
llvm-svn: 278810
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The pipeliner was generating an invalid Phi name for an operand
in the epilog block, which caused an assert in the live variable
analysis pass. The fix is to the code that generates new Phis
in the epilog block. In this case, there is an existing Phi that
needs to be reused rather than creating a new Phi instruction.
Differential Revision: https://reviews.llvm.org/D23513
llvm-svn: 278805
|
| |
|
|
|
|
| |
For now, no support for immediates.
llvm-svn: 278804
|
| |
|
|
|
|
| |
Using the same register means nothing was checking for operand order.
llvm-svn: 278803
|
| |
|
|
|
|
| |
And mark it as legal.
llvm-svn: 278802
|
| |
|
|
| |
llvm-svn: 278798
|
| |
|
|
| |
llvm-svn: 278796
|
| |
|
|
|
|
|
|
| |
byte rotations
The combine was only matching v2i64 as it assumed lowering to MOVQ - but we have v2f64 patterns that match in a similar fashion
llvm-svn: 278794
|
| |
|
|
|
|
| |
vectorize v64i8 shifts
llvm-svn: 278790
|
| |
|
|
| |
llvm-svn: 278787
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D21958
llvm-svn: 278782
|
| |
|
|
|
|
|
|
| |
Regression from r259791.
Differential Revision: https://reviews.llvm.org/D23374
llvm-svn: 278750
|
| |
|
|
|
|
|
|
|
| |
Use patterns instead of multiple instructions
Add buffer id to asm string
https://reviews.llvm.org/D22650
llvm-svn: 278749
|
| |
|
|
|
|
|
| |
Before we mischaracterized structs and i1 types as a scalar with size 0 in
various ways.
llvm-svn: 278744
|
| |
|
|
|
|
| |
Now the increment is done in a different location
llvm-svn: 278713
|
| |
|
|
| |
llvm-svn: 278676
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r278660.
It causes downstream assertion failure in InstCombine on shuffle
instructions. Comes up in __mm_swizzle_epi32.
llvm-svn: 278672
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new version has several advantages:
1) IMSHO it's more readable and neater
2) It handles loads and stores properly
3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.
With this change we can now finally sink load-modify-store idioms such as:
  if (a)
    return *b += 3;
  else
    return *b += 4;
=>
  %z = load i32, i32* %y
  %.sink = select i1 %a, i32 3, i32 4
  %b = add i32 %z, %.sink
  store i32 %b, i32* %y
  ret i32 %b
When this works for switches it'll be even more powerful.
llvm-svn: 278660
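
The effect of the sinking can also be written at the source level. The sketch below is an illustration only: `addBranchy` and `addSunk` are hypothetical names, and the second function mirrors what the transformed code does (one shared load, add, and store, with only the constant selected).

```cpp
// Before sinking: each branch performs its own load, add, and store.
int addBranchy(bool a, int *b) {
    if (a)
        return *b += 3;
    else
        return *b += 4;
}

// After sinking: only the differing constant depends on the condition;
// the load, add, and store are shared by both paths.
int addSunk(bool a, int *b) {
    const int sink = a ? 3 : 4;  // plays the role of the select
    const int z = *b;            // single load
    const int result = z + sink; // single add
    *b = result;                 // single store
    return result;
}
```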
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs!
Motivating testcase:
  void f(float *a, float *b, float *c, int n) {
    while (n-- > 0)
      *c++ = *a++ + *b++;
  }
It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage.
llvm-svn: 278658
|
| |
|
|
|
|
|
|
| |
be zero extended according to SPEC.
Differential Revision: http://reviews.llvm.org/D23489
llvm-svn: 278626
|
| |
|
|
| |
llvm-svn: 278624
|
| |
|
|
|
|
|
|
|
| |
1. Use shuffle to insert an i1 element into a vector. The previous implementation was incorrect (it computed dest_bit OR src_bit, which doesn't clear the bit if src_bit=0).
2. Improve shuffle of i1 vectors: use CVT2MASK if supported instead of TRUNCATE.
Differential Revision: http://reviews.llvm.org/D23347
llvm-svn: 278623
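
The bug in item 1 can be shown on a scalar bit mask. This sketch only illustrates why OR-ing is insufficient; it does not reproduce the shuffle-based lowering the patch uses, the function names are hypothetical, and it assumes `idx` is less than 16.

```cpp
#include <cstdint>

// OR-only insertion: can set a bit but can never clear one, so inserting
// a 0 into a position that currently holds 1 leaves the 1 in place.
std::uint16_t insertBitOrOnly(std::uint16_t mask, unsigned idx, bool bit) {
    const unsigned selector = 1u << idx;
    return static_cast<std::uint16_t>(mask | (bit ? selector : 0u));
}

// Correct insertion: clear the destination bit first, then set it from
// the source bit.
std::uint16_t insertBit(std::uint16_t mask, unsigned idx, bool bit) {
    const unsigned selector = 1u << idx;
    const unsigned cleared = mask & ~selector;
    return static_cast<std::uint16_t>(cleared | (bit ? selector : 0u));
}
```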
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r278287.
This commit broke the clang-cmake-thumbv7-a15-full-sh bot.
See https://llvm.org/bugs/show_bug.cgi?id=28949
llvm-svn: 278621
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r278288.
r278287 broke the clang-cmake-thumbv7-a15-full-sh bot.
Revert this so we can get to r278287.
llvm-svn: 278620
|
| |
|
|
|
|
|
|
|
|
|
| |
LowerTargetConstantPool was not properly setting the TargetFlag to indicate the
desired relocation. Due to a coding error, the offset parameter was omitted, so
the TargetFlag was used as the offset, and the TargetFlag defaulted to zero.
This only affects -fpic compilation, and only those items created in a
Constant Pool, for example a vector of constants. Halide ran into this issue.
llvm-svn: 278614
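
This class of mistake is easy to make with trailing default arguments. The sketch below uses a made-up builder, `makeToyPoolNode`, and a made-up flag value, `kToyPicFlag`; it is not the LLVM function involved, only an illustration of how omitting one argument shifts the next one into its slot without a compile error.

```cpp
// Toy result type recording which slot each argument landed in.
struct ToyPoolNode {
    int offset;
    unsigned targetFlags;
};

// Hypothetical builder with two trailing defaulted parameters.
ToyPoolNode makeToyPoolNode(unsigned align, int offset = 0,
                            unsigned targetFlags = 0) {
    (void)align; // unused in this toy model
    return ToyPoolNode{offset, targetFlags};
}

constexpr unsigned kToyPicFlag = 4; // made-up flag value for illustration

// Buggy call: the offset is omitted, so the flag is taken as the offset
// and targetFlags silently keeps its default of zero.
ToyPoolNode buggyCall(unsigned align) {
    return makeToyPoolNode(align, kToyPicFlag);
}

// Fixed call: the offset is passed explicitly, so the flag reaches the
// parameter it was meant for.
ToyPoolNode fixedCall(unsigned align) {
    return makeToyPoolNode(align, 0, kToyPicFlag);
}
```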
|
| |
|
|
|
|
| |
This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530.
llvm-svn: 278609
|
| |
|
|
|
|
| |
This reverts commit r276447.
llvm-svn: 278608
|
| |
|
|
|
|
|
| |
Tests are slightly modified versions of those written by
Sunita Marathe in D23391.
llvm-svn: 278599
|
| |
|
|
|
|
| |
patterns over more complex ones that produce better code.
llvm-svn: 278593
|
| |
|
|
|
|
|
|
| |
64-bit and 32-bit elements.
Fixes PR28961.
llvm-svn: 278592
|
| |
|
|
|
|
| |
Add test if the constant offset looks unaligned.
llvm-svn: 278589
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This test was resulting in asan/valgrind failures due to undefined
DWARF register mappings for WebAssembly, and was disabled in r278495.
These have been resolved.
Reviewers: sunfish, dschuff
Subscribers: bkramer, llvm-commits, jfb
Differential Revision: https://reviews.llvm.org/D23459
llvm-svn: 278576
|
| |
|
|
|
|
|
|
|
| |
Fixed a bug in the test case.
To fix PR28104, this patch restricts tail merging to blocks that belong to the
same loop after MBP.
llvm-svn: 278575
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This brings LLVM-generated PTX closer to what nvcc generates and avoids
triggering issues in ptxas.
For instance, ptxas does not accept .s16 (or .u16) registers as operands
for .fp16 instructions.
Differential Revision: https://reviews.llvm.org/D23460
llvm-svn: 278568
|
| |
|
|
|
|
|
|
|
|
| |
loads/stores.
The existing code accidentally skipped the aliasing check in edge cases.
Differential revision: https://reviews.llvm.org/D23372
llvm-svn: 278562
|
| |
|
|
|
|
|
|
| |
Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction.
Differential revision: https://reviews.llvm.org/D23368
llvm-svn: 278559
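
The reason "#512" is rejected can be sketched as a range check, assuming the paired load/store immediate is a signed 7-bit value scaled by the access size (so for 64-bit registers the byte offset must be a multiple of 8 between -512 and 504). The function name is hypothetical and not taken from the patch.

```cpp
#include <cstdint>

// Returns true when byteOffset can be encoded as a signed 7-bit immediate
// scaled by accessSize (the size in bytes of one register in the pair).
bool isEncodablePairOffset(std::int64_t byteOffset, std::int64_t accessSize) {
    if (accessSize <= 0 || byteOffset % accessSize != 0)
        return false; // offset must be a multiple of the access size
    const std::int64_t scaled = byteOffset / accessSize;
    return scaled >= -64 && scaled <= 63;
}
```

With an access size of 8, an offset of 512 scales to 64, one past the largest encodable value.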
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Currently X86ISelLowering has a similar transformation for sexts:
sext(add_nsw(x, C)) --> add(sext(x), C_sext)
In this change I extend this code to handle zexts as well.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D23359
llvm-svn: 278520
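
For the zext case the analogous rewrite is only sound when the narrow add cannot wrap. That condition is an assumption here (by analogy with the nsw requirement for sexts), and the 8-bit to 32-bit widths and function names are chosen purely for illustration.

```cpp
#include <cstdint>

// zext(add(x, C)): add in 8 bits (wrapping on overflow), then widen.
std::uint32_t zextOfNarrowAdd(std::uint8_t x, std::uint8_t c) {
    const std::uint8_t narrow = static_cast<std::uint8_t>(x + c);
    return narrow;
}

// add(zext(x), zext(C)): widen both operands first, then add in 32 bits.
std::uint32_t wideAddOfZext(std::uint8_t x, std::uint8_t c) {
    return static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(c);
}
```

The two forms agree for 100 + 27, where the narrow add does not wrap, and differ for 200 + 100, where it does.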
|
| |
|
|
|
|
|
|
|
|
|
| |
...and the two followup commits:
Revert "[Sparc][Leon] Missed resetting option flags from check-in 278489."
Revert "[Sparc][Leon] Errata fixes for various errata in different
versions of the Leon variants of the Sparc 32 bit processor."
This reverts commit r274856, r278489, and r278492.
llvm-svn: 278511
|
| |
|
|
|
|
| |
shifts
llvm-svn: 278502
|
| |
|
|
|
|
| |
It reads uninitialized memory and crashes randomly.
llvm-svn: 278495
|