| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fold is done in IR by instcombine, and we have a special
form of it already here in DAGCombiner, but we want the more
general transform too:
https://rise4fun.com/Alive/3jZm
Name: general
Pre: (C1 + zext(C2) < 64)
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%a = and i64 %s2, zext((1 << (16 - C2)) - 1)
%r = trunc %a to i16
Name: special
Pre: C1 == 48
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%r = trunc %s2 to i16
...because D58017 exposes a regression without this fold.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If the pointer was loaded/stored before the null check, the check
is redundant and can be removed. For now the optimizers do not
remove the nullptr check, see https://gcc.godbolt.org/z/H2r5GG.
The patch allows to use more nonnull constraints. Also, it found
one more optimization in some PowerPC test. This is my first llvm
review, I am free to any comments.
Differential Revision: https://reviews.llvm.org/D71177
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
PowerPC has instruction to do the semantics of this piece of code:
vector int foo(vector int m, vector int n) {
return (m + n + 1) >> 1;
}
This patch is adding the match rule to select it.
Differential Revision: https://reviews.llvm.org/D71002
|
|
|
|
|
|
|
|
|
|
|
|
| |
the last csect of a sections
SUMMARY:
Fixed a bug of XCOFFObjectFile.cpp when there is padding at the last csect of a sections.
when there is a tail padding of a section, but the value of CurrentAddressLocation do not be increased by the padding size. it will hit assert assert(CurrentAddressLocation == Section->Address && "We should have no padding between sections.");
Reviewers: daltenty,hubert.reinterpretcast,
Differential Revision: https://reviews.llvm.org/D70859
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is found during https://reviews.llvm.org/D70758
All the other record forms are having suffix o at the end.
ANDIo8 and ANDISo8 are the only two that put o before 8.
This patch rename them to be consistent with others.
Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg
Reviewed By: jhibbits
Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70928
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch fixes an issue where the PPC MI peephole optimization pass incorrectly remove a vector swap.
Specifically, the pass can combine a splat/swap to a splat/copy. It uses `TargetRegisterInfo::lookThruCopyLike` to determine that the operands to the splat are the same. However, the current logic only compares the operands based on register numbers. In the case where the splat operands are ultimately feed from the same physical register, the pass can incorrectly remove a swap if the feed register for one of the operands has been clobbered.
This patch adds a check to ensure that the registers feeding are both virtual registers or the operands to the splat or swap are both the same register.
Here is an example in pseudo-MIR of what happens in the test cased added in this patch:
Before PPC MI peephole optimization:
```
%arg = XVADDDP %0, %1
$f1 = COPY %arg.sub_64
call double rint(double)
%res.first = COPY $f1
%vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64
%arg.swapped = XXPERMDI %arg, %arg, 2
$f1 = COPY %arg.swapped.sub_64
call double rint(double)
%res.second = COPY $f1
%vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
%vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2
; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ]
```
After optimization:
```
; ...
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1
; so the pass replaces the swap with a copy:
%vec.res = COPY %vec.res.splat
; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ]
```
As best as I can tell, this has occurred since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.
Committed for vddvss (Colin Samples). Thanks Colin for the patch!
Differential Revision: https://reviews.llvm.org/D69497
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases that duplication doesn't increase fallthrough, or it brings too much size overhead.
To overcome these two issues in function canTailDuplicateUnplacedPreds I add two checks:
make sure there is at least one duplication in current work set.
the number of duplication should not exceed the number of successors.
The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain.
Differential Revision: https://reviews.llvm.org/D64376
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xcoffobject file
SUMMARY:
in the patch https://reviews.llvm.org/D66969 . we need a test case to verify the out text section of the xcoffobject file is correct or not.
but we do not have llvm disassembly tools to dump the xcoffobjectfile . since we commit the patch https://reviews.llvm.org/D70255, we have tools for it. we create this test case for it.
Reviewers: daltenty,hubert.reinterpretcast,
Differential Revision: https://reviews.llvm.org/D70719
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously we only handled the case where the csect hadn't been set up yet, so we'd hit an assert later on.
Reviewers: jasonliu, DiggerLin, stevewan
Reviewed By: jasonliu
Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71032
|
|
|
|
|
|
|
|
|
|
| |
propagation.
Fix assertion error
```
bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed.
```
by checking if the register is 0 before invoking `isRenamable`.
|
|
|
|
|
|
|
| |
propagation"
This reverts commit 75b3a1c318ccad0f96c38689279bc5db63e2ad05, since it
breaks bootstrap build.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch mainly do such transformation
```
$R0 = OP ...
... // No read/clobber of $R0 and $R1
$R1 = COPY $R0 // $R0 is killed
```
Replace $R0 with $R1 and remove the COPY, we have
```
$R1 = OP ...
```
This transformation can also expose more opportunities for existing
copy elimination in MCP.
Differential Revision: https://reviews.llvm.org/D67794
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implement emitTCEntry for PPCTargetXCOFFStreamer.
Add TC csects to TOCCsects for object file writing.
Note:
1. I did not include any raw data testing for this object file generation
because TC entries raw data will all be 0 without relocation implemented.
I will add raw data testing as part of relocation testing later.
2. I removed "Symbol->setFragment(F);" for common symbols because we
don't need it, and if we have it then we would hit assertions below:
Assertion `(SymbolContents == SymContentsUnset ||
SymbolContents == SymContentsOffset) &&
"Cannot get offset for a common/variable symbol"' failed.
3.Fixed incorrect TOC-base alignment.
Differential Revision: https://reviews.llvm.org/D70798
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example:
x3 = rlwinm x3, 27, 5, 31
x3 = rlwinm x3, 19, 0, 12
can be combined to
x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang, lkail
Differential Revision: https://reviews.llvm.org/D70374
|
|
|
|
|
|
|
|
| |
This is an alternative to D64662 that shares more code between
strict and non-strict nodes. It's modeled after the implementation
that I did for softening.
Differential Revision: https://reviews.llvm.org/D70867
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
loops."
Summary: b19ec1eb3d0c has been reverted because of the test failures
with PowerPC targets. This patch addresses the issues from the previous
commit.
Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll
and CodeGen/PowerPC/sms-cpy-1.ll pass
Subscribers: llvm-commits
|
|
|
|
|
|
|
|
|
|
|
|
| |
When converting reg+reg shifts to reg+imm rotates, we neglect to consider the
CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a
rotate with missing operands which causes a crash.
Committing this fix without review since it is non-controversial that the list
of mnemonics to consider should include the 64-bit aliases for the exact
mnemonics.
Fixes PR44183.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds LowerFormalArguments_AIX, support is added for lowering
int, float, and double formal arguments into general purpose and
floating point registers only.
The aix calling convention testcase have been redone to test for caller
and callee functionality in the same lit test.
Patch by Zarko Todorovski!
Differential Revision: https://reviews.llvm.org/D69578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Emit the correct .toc psuedo op when we change to the TOC and emit
TC entries. Make sure TOC psuedos get the right symbols via overriding
getMCSymbolForTOCPseudoMO on AIX. Add a test for TOC assembly writing
and update tests to include TOC entries.
Also make sure external globals have a csect set and handle external function descriptor (originally authored by Jason Liu) so we can emit TOC entries for them.
Reviewers: DiggerLin, sfertile, Xiangling_L, jasonliu, hubert.reinterpretcast
Reviewed By: jasonliu
Subscribers: arphaman, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70461
|
|
|
|
|
|
|
|
|
|
| |
This is a continuation of D70262
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.
Differential Revision: https://reviews.llvm.org/D70333
|
|
|
|
|
|
|
|
|
|
| |
Afer https://reviews.llvm.org/D67088, PPCLoopPreIncPrep pass can prepare more instruction forms except pre inc form, like DS/DQ forms.
This patch is a follow-up of https://reviews.llvm.org/D67088 to rename the pass name.
Reviewed by: jsji
Differential Revision: https://reviews.llvm.org/D70371
|
|
|
|
|
|
| |
This is a follow up commit to address post-commit comment in D70443
Differential revision: https://reviews.llvm.org/D70443
|
|
|
|
|
|
|
|
| |
If an inline asm statement clobbers a VSX register that overlaps with a
callee-saved Altivec register or FPR, we will not record the clobber and will
therefore violate the ABI. This is clearly a bug so this patch fixes it.
Differential revision: https://reviews.llvm.org/D68576
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
call
Summary:
This patch sets up the infrastructure for
1. Associate MCSymbolXCOFF with an MCSectionXCOFF when it could not
get implicitly associated.
2. Generate undefined symbols. The patch itself generates undefined symbol
for external function call only. Generate undefined symbol for external
global variable and external function descriptors will be handled in
separate patch(s) after this is land.
Differential Revision: https://reviews.llvm.org/D70443
|
| |
|
|
|
|
| |
This reverts commit 29f6f9b2b2bfecccf903738e2f5a0cd0a70fce31.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch aims to spill CR[0-7]LT bits on POWER9 using the setb instruction.
The sequence on P9 to spill these bits will be:
setb %reg, %CRREG
stw %reg, $FI
Instead of the typical sequence:
mfocrf %reg, %CRREG
rlwinm %reg1, %reg, $SH, 0, 0
stw %reg1, $FI
Differential Revision: https://reviews.llvm.org/D68443
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is a follow up on read-only assembly patch D70182.
It intends to enable object file generation for the read-only data section on AIX.
Reviewers: DiggerLin, daltenty
Differential Revision: https://reviews.llvm.org/D70455
|
|
|
|
|
|
|
| |
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.
Differential Revision: https://reviews.llvm.org/D69601
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
combine
x3 = rlwinm x3, 27, 5, 31
x3 = rlwinm x3, 19, 0, 12
to
x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang
Differential Revision: https://reviews.llvm.org/D70374
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
strings
This patch adds support for generating assembly code for one-byte mergeable strings.
Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later.
Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L
Reviewed by: daltenty
Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70310
|
|
|
|
|
|
|
|
|
|
| |
This patch lowering jump table, constant pool and block address in assembly.
1. On AIX, jump table index is always relative;
2. Put CPI and JTI into ReadOnlySection until we support unique data sections;
3. Create the temp symbol for block address symbol;
4. Update MIR testcases and add related assembly part;
Differential Revision: https://reviews.llvm.org/D70243
|
|
|
|
|
|
| |
This patch implements writing function descriptors and TOC base into
data section, and also add function descriptors(both csect and label)
and TOC base symbols to the symbol table.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
SUMMARY:
Adding a test case for read-only data assembly writing for aix
Reviewers: daltenty,Xiangling_Liao
Subscribers: rupprecht, seiyai,hiraditya
Differential Revision: https://reviews.llvm.org/D70182
|
|
|
|
|
|
|
|
|
|
| |
This patch aims to improve the code generation for float vector gather on POWER9.
Patterns have been implemented to utilize instructions that deliver improved
performance.
Patch by: Kamau Bridgeman
Differential Revision: https://reviews.llvm.org/D62908
|
|
|
|
|
|
|
|
|
|
| |
Test case to verify that the expected code is generated for a
vector float gather based on the patterns in tablegen for big
and little endian cases.
Patch by: Kamau Bridgeman
Differential Revision: https://reviews.llvm.org/D69443
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Now, PPCPreIncPrep pass changes a loop to update form and update all load/store
with same base accordingly. We can do more for load/store with same base, for
example, convert load/store with same base to ds/dq form.
Reviewed by: jsji
Differential Revision: https://reviews.llvm.org/D67088
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is found during review of https://reviews.llvm.org/D67088.
CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106.
-allow-deprecated-dag-overlap was introduced to temporary accept old
behavior.
But it actually hide some broken tests, eg: `test/CodeGen/PowerPC/swaps-le-1.ll`
The codegen has changed, but the CHECK-DAG still PASS due to allowing `overlap`.
This patch remove the deprecated options, and fix the broken tests.
Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz
Reviewed By: shchenz
Subscribers: shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69733
|
|
|
|
|
|
| |
The author of the patch forgot to add -verify-machineinstrs to the RUN
lines which would have made the issue appear on all bots. Added that
as well as a fix for the undefined register issue (after the hoisting).
|
|
|
|
|
|
|
|
| |
For XCOFF, globals mapped into the .bss section are linked as COMMON
definitions. This behaviour is incorrect for zero initialized data, so
emit those to the .data section instead.
Differential Revision: https://reviews.llvm.org/D69528
|
|
|
|
| |
by adding powerpc triple
|
|
|
|
|
|
|
|
|
| |
In current Hoist() function of machine licm pass, it will not check the source and destination basic block frequencies that a instruction is hoisted from/to.
There is a chance that instruction is hoisted from a cold to a hot basic block.
In this patch, we add options to disable machine instruction hoisting if destination block is hotter.
Differential Revision: https://reviews.llvm.org/D63676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example:
long long test(long long a, long long b) {
if (a << b > 0)
return b;
if (a << b < 0)
return a;
return a*b;
}
Produces:
sld. 5, 3, 4
ble 0, .LBB0_2
mr 3, 4
blr
.LBB0_2: # %if.end
cmpldi 5, 0
li 5, 1
isel 4, 4, 5, 2
mulld 3, 4, 3
blr
But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already
contains the result of that comparison).
The root cause of this is that LLVM converts signed comparisons into equality
comparison based on dominance. Equality comparisons are unsigned by default, so
we get either a record-form or cmp (without the l for logical) feeding a cmpl.
That is the situation we want to avoid here.
Differential Revision: https://reviews.llvm.org/D60506
|