| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
Summary:
Currently, we set legalization action of `ISD::ROTL` vectors as
`Expand` in `PPCISelLowering`. However, we can exploit `vrl(b|h|w|d)`
to lower `ISD::ROTL` directly.
Differential Revision: https://reviews.llvm.org/D71324
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
static void *ifunc(void) __attribute__((ifunc("resolver")));
void foo() { ifunc(); }
The relocation produced by the ifunc() call:
1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
2. gcc -msecure-plt -PIE => R_PPC_PLTREL24 r_addend=0x8000
3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
4. clang -msecure-plt -fPIE => R_PPC_REL24
4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this
relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a
function, both GNU ld and lld (after D71621 fix) may produce a wrong
result.
This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC.
Both GNU ld and lld (after D71621) will be happy.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D71649
|
|
|
|
|
|
|
|
|
| |
Summary:
When we use undefined symbol with its qualname, we are not able
to generate that symbol because of the logic of early "continue"
that skip the qualname symbol. This patch fixes it.
Differential revision: https://reviews.llvm.org/D71667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The default static (non-PIC, non-PIE) model for 32-bit powerpc does not
use @PLT annotations and relocations in GCC. LLVM shouldn't use @PLT
annotations either, because it breaks secure-PLT linking with (some
versions of?) GNU LD.
Update the available-externally.ll test to reflect that default mode should be
the same as the static relocation, by using the same check prefix.
Reviewed by: sfertile
Differential Revision: https://reviews.llvm.org/D70570
|
|
|
|
|
|
|
|
| |
Fix a FIXME in ppcloopinstrformprep pass.
Reviewed by: nemanjai
Differential Revision: https://reviews.llvm.org/D71346
|
|
|
|
|
|
|
|
| |
We somehow missed doing this when we were working on Power9 exploitation.
This just adds the missing legalization and cost for producing the vector
intrinsics.
Differential revision: https://reviews.llvm.org/D70436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If a function is defined after it appears in a TOC expression, we may
try to access an unset containing csect when returning a symbol for the
expression.
Reviewers: Xiangling_L, DiggerLin, jasonliu, hubert.reinterpretcast
Reviewed By: hubert.reinterpretcast
Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71125
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following intrinsics currently carry a rounding mode metadata argument:
llvm.experimental.constrained.minnum
llvm.experimental.constrained.maxnum
llvm.experimental.constrained.ceil
llvm.experimental.constrained.floor
llvm.experimental.constrained.round
llvm.experimental.constrained.trunc
This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D71218
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r372285 changed LLVM to use a `TargetConstant` for parameters of intrinsics that are required to be immediates.
Since that commit, use of `%llvm.ppc.altivec.vc{fsx,fux,tsxs,tuxs}` intrinsics has not worked, and resulted in a `LLVM ERROR: Cannot select: intrinsic %llvm.ppc.altivec.vc*` error. The intrinsics' TableGen definitions matched on `imm` instead of `timm`.
This commit updates those definitions to use `timm`.
Fixes: https://llvm.org/PR44239
Reviewers: hfinkel, nemanjai, #powerpc, Jim
Reviewed By: Jim
Subscribers: qiucf, wuzish, Jim, hiraditya, kbarton, jsji, shchenz, llvm-commits
Tags: #llvm
Patched by vddvss (Colin Samples).
Differential Revision: https://reviews.llvm.org/D71138
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extends the desciptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.
In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:
struct FunctionDescriptor {
void *EntryPoint;
void *TOCAnchor;
void *EnvironmentPointer;
};
An indirect call has several steps of loading the the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:
1) Save the caller's TOC pointer into the TOC save slot in the linkage
area, and then load the callee's TOC pointer into the TOC register
(GPR 2 on AIX).
2) Load the function descriptor's entry point into the count register.
3) Load the environment pointer into the environment pointer register
(GPR 11 on AIX).
4) Perform the call by branching on count register.
5) Restore the caller's TOC pointer after returning from the indirect call.
A couple important caveats to the above:
- There is no way to directly load a value from memory into the count register.
Instead we populate the count register by loading the entry point address into
a gpr and then moving the gpr to the count register.
- The TOC restore has to come immediately after the branch on count register
instruction (i.e., the 1st instruction executed after we return from the
call). This is an implementation limitation. We could, in theory, schedule
the restore elsewhere as long as no uses of the TOC pointer fall in between
the call and the restore; however, to keep it simple, we insert a pseudo
instruction that represents both the indirect branch instruction and the
load instruction that restores the caller's TOC from the linkage area. As
they flow through the compiler as a single pseudo instruction, nothing can be
inserted between them and the caller's TOC is then valid at any use.
Differtential Revision: https://reviews.llvm.org/D70724
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial attempt (rG89633320) botched the logic by reversing
the source/dest types. Added x86 tests for additional coverage.
The vector tests show a potential improvement (fold vector load
instead of broadcasting), but that's a known/existing problem.
This fold is done in IR by instcombine, and we have a special
form of it already here in DAGCombiner, but we want the more
general transform too:
https://rise4fun.com/Alive/3jZm
Name: general
Pre: (C1 + zext(C2) < 64)
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%a = and i64 %s2, zext((1 << (16 - C2)) - 1)
%r = trunc %a to i16
Name: special
Pre: C1 == 48
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%r = trunc %s2 to i16
...because D58017 exposes a regression without this fold.
|
|
|
|
|
| |
This reverts commit 8963332c3327daa652ba3e26d35f9109b6991985.
There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fold is done in IR by instcombine, and we have a special
form of it already here in DAGCombiner, but we want the more
general transform too:
https://rise4fun.com/Alive/3jZm
Name: general
Pre: (C1 + zext(C2) < 64)
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%a = and i64 %s2, zext((1 << (16 - C2)) - 1)
%r = trunc %a to i16
Name: special
Pre: C1 == 48
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%r = trunc %s2 to i16
...because D58017 exposes a regression without this fold.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If the pointer was loaded/stored before the null check, the check
is redundant and can be removed. For now the optimizers do not
remove the nullptr check, see https://gcc.godbolt.org/z/H2r5GG.
The patch allows to use more nonnull constraints. Also, it found
one more optimization in some PowerPC test. This is my first llvm
review, I am free to any comments.
Differential Revision: https://reviews.llvm.org/D71177
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
PowerPC has instruction to do the semantics of this piece of code:
vector int foo(vector int m, vector int n) {
return (m + n + 1) >> 1;
}
This patch is adding the match rule to select it.
Differential Revision: https://reviews.llvm.org/D71002
|
|
|
|
|
|
|
|
|
|
|
|
| |
the last csect of a sections
SUMMARY:
Fixed a bug of XCOFFObjectFile.cpp when there is padding at the last csect of a sections.
when there is a tail padding of a section, but the value of CurrentAddressLocation do not be increased by the padding size. it will hit assert assert(CurrentAddressLocation == Section->Address && "We should have no padding between sections.");
Reviewers: daltenty,hubert.reinterpretcast,
Differential Revision: https://reviews.llvm.org/D70859
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is found during https://reviews.llvm.org/D70758
All the other record forms are having suffix o at the end.
ANDIo8 and ANDISo8 are the only two that put o before 8.
This patch rename them to be consistent with others.
Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg
Reviewed By: jhibbits
Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70928
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch fixes an issue where the PPC MI peephole optimization pass incorrectly remove a vector swap.
Specifically, the pass can combine a splat/swap to a splat/copy. It uses `TargetRegisterInfo::lookThruCopyLike` to determine that the operands to the splat are the same. However, the current logic only compares the operands based on register numbers. In the case where the splat operands are ultimately feed from the same physical register, the pass can incorrectly remove a swap if the feed register for one of the operands has been clobbered.
This patch adds a check to ensure that the registers feeding are both virtual registers or the operands to the splat or swap are both the same register.
Here is an example in pseudo-MIR of what happens in the test cased added in this patch:
Before PPC MI peephole optimization:
```
%arg = XVADDDP %0, %1
$f1 = COPY %arg.sub_64
call double rint(double)
%res.first = COPY $f1
%vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64
%arg.swapped = XXPERMDI %arg, %arg, 2
$f1 = COPY %arg.swapped.sub_64
call double rint(double)
%res.second = COPY $f1
%vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
%vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2
; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ]
```
After optimization:
```
; ...
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1
; so the pass replaces the swap with a copy:
%vec.res = COPY %vec.res.splat
; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ]
```
As best as I can tell, this has occurred since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.
Committed for vddvss (Colin Samples). Thanks Colin for the patch!
Differential Revision: https://reviews.llvm.org/D69497
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases that duplication doesn't increase fallthrough, or it brings too much size overhead.
To overcome these two issues in function canTailDuplicateUnplacedPreds I add two checks:
make sure there is at least one duplication in current work set.
the number of duplication should not exceed the number of successors.
The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain.
Differential Revision: https://reviews.llvm.org/D64376
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xcoffobject file
SUMMARY:
in the patch https://reviews.llvm.org/D66969 . we need a test case to verify the out text section of the xcoffobject file is correct or not.
but we do not have llvm disassembly tools to dump the xcoffobjectfile . since we commit the patch https://reviews.llvm.org/D70255, we have tools for it. we create this test case for it.
Reviewers: daltenty,hubert.reinterpretcast,
Differential Revision: https://reviews.llvm.org/D70719
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously we only handled the case where the csect hadn't been set up yet, so we'd hit an assert later on.
Reviewers: jasonliu, DiggerLin, stevewan
Reviewed By: jasonliu
Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71032
|
|
|
|
|
|
|
|
|
|
| |
propagation.
Fix assertion error
```
bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed.
```
by checking if the register is 0 before invoking `isRenamable`.
|
|
|
|
|
|
|
| |
propagation"
This reverts commit 75b3a1c318ccad0f96c38689279bc5db63e2ad05, since it
breaks bootstrap build.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch mainly do such transformation
```
$R0 = OP ...
... // No read/clobber of $R0 and $R1
$R1 = COPY $R0 // $R0 is killed
```
Replace $R0 with $R1 and remove the COPY, we have
```
$R1 = OP ...
```
This transformation can also expose more opportunities for existing
copy elimination in MCP.
Differential Revision: https://reviews.llvm.org/D67794
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implement emitTCEntry for PPCTargetXCOFFStreamer.
Add TC csects to TOCCsects for object file writing.
Note:
1. I did not include any raw data testing for this object file generation
because TC entries raw data will all be 0 without relocation implemented.
I will add raw data testing as part of relocation testing later.
2. I removed "Symbol->setFragment(F);" for common symbols because we
don't need it, and if we have it then we would hit assertions below:
Assertion `(SymbolContents == SymContentsUnset ||
SymbolContents == SymContentsOffset) &&
"Cannot get offset for a common/variable symbol"' failed.
3.Fixed incorrect TOC-base alignment.
Differential Revision: https://reviews.llvm.org/D70798
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example:
x3 = rlwinm x3, 27, 5, 31
x3 = rlwinm x3, 19, 0, 12
can be combined to
x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang, lkail
Differential Revision: https://reviews.llvm.org/D70374
|
|
|
|
|
|
|
|
| |
This is an alternative to D64662 that shares more code between
strict and non-strict nodes. It's modeled after the implementation
that I did for softening.
Differential Revision: https://reviews.llvm.org/D70867
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
loops."
Summary: b19ec1eb3d0c has been reverted because of the test failures
with PowerPC targets. This patch addresses the issues from the previous
commit.
Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll
and CodeGen/PowerPC/sms-cpy-1.ll pass
Subscribers: llvm-commits
|
|
|
|
|
|
|
|
|
|
|
|
| |
When converting reg+reg shifts to reg+imm rotates, we neglect to consider the
CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a
rotate with missing operands which causes a crash.
Committing this fix without review since it is non-controversial that the list
of mnemonics to consider should include the 64-bit aliases for the exact
mnemonics.
Fixes PR44183.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds LowerFormalArguments_AIX, support is added for lowering
int, float, and double formal arguments into general purpose and
floating point registers only.
The aix calling convention testcase have been redone to test for caller
and callee functionality in the same lit test.
Patch by Zarko Todorovski!
Differential Revision: https://reviews.llvm.org/D69578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Emit the correct .toc psuedo op when we change to the TOC and emit
TC entries. Make sure TOC psuedos get the right symbols via overriding
getMCSymbolForTOCPseudoMO on AIX. Add a test for TOC assembly writing
and update tests to include TOC entries.
Also make sure external globals have a csect set and handle external function descriptor (originally authored by Jason Liu) so we can emit TOC entries for them.
Reviewers: DiggerLin, sfertile, Xiangling_L, jasonliu, hubert.reinterpretcast
Reviewed By: jasonliu
Subscribers: arphaman, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70461
|
|
|
|
|
|
|
|
|
|
| |
This is a continuation of D70262
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.
Differential Revision: https://reviews.llvm.org/D70333
|
|
|
|
|
|
|
|
|
|
| |
Afer https://reviews.llvm.org/D67088, PPCLoopPreIncPrep pass can prepare more instruction forms except pre inc form, like DS/DQ forms.
This patch is a follow-up of https://reviews.llvm.org/D67088 to rename the pass name.
Reviewed by: jsji
Differential Revision: https://reviews.llvm.org/D70371
|
|
|
|
|
|
| |
This is a follow up commit to address post-commit comment in D70443
Differential revision: https://reviews.llvm.org/D70443
|
|
|
|
|
|
|
|
| |
If an inline asm statement clobbers a VSX register that overlaps with a
callee-saved Altivec register or FPR, we will not record the clobber and will
therefore violate the ABI. This is clearly a bug so this patch fixes it.
Differential revision: https://reviews.llvm.org/D68576
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
call
Summary:
This patch sets up the infrastructure for
1. Associate MCSymbolXCOFF with an MCSectionXCOFF when it could not
get implicitly associated.
2. Generate undefined symbols. The patch itself generates undefined symbol
for external function call only. Generate undefined symbol for external
global variable and external function descriptors will be handled in
separate patch(s) after this is land.
Differential Revision: https://reviews.llvm.org/D70443
|
| |
|
|
|
|
| |
This reverts commit 29f6f9b2b2bfecccf903738e2f5a0cd0a70fce31.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch aims to spill CR[0-7]LT bits on POWER9 using the setb instruction.
The sequence on P9 to spill these bits will be:
setb %reg, %CRREG
stw %reg, $FI
Instead of the typical sequence:
mfocrf %reg, %CRREG
rlwinm %reg1, %reg, $SH, 0, 0
stw %reg1, $FI
Differential Revision: https://reviews.llvm.org/D68443
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is a follow up on read-only assembly patch D70182.
It intends to enable object file generation for the read-only data section on AIX.
Reviewers: DiggerLin, daltenty
Differential Revision: https://reviews.llvm.org/D70455
|
|
|
|
|
|
|
| |
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.
Differential Revision: https://reviews.llvm.org/D69601
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
combine
x3 = rlwinm x3, 27, 5, 31
x3 = rlwinm x3, 19, 0, 12
to
x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang
Differential Revision: https://reviews.llvm.org/D70374
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
strings
This patch adds support for generating assembly code for one-byte mergeable strings.
Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later.
Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L
Reviewed by: daltenty
Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70310
|
|
|
|
|
|
|
|
|
|
| |
This patch lowering jump table, constant pool and block address in assembly.
1. On AIX, jump table index is always relative;
2. Put CPI and JTI into ReadOnlySection until we support unique data sections;
3. Create the temp symbol for block address symbol;
4. Update MIR testcases and add related assembly part;
Differential Revision: https://reviews.llvm.org/D70243
|
|
|
|
|
|
| |
This patch implements writing function descriptors and TOC base into
data section, and also add function descriptors(both csect and label)
and TOC base symbols to the symbol table.
|
| |
|