| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic in this function assumes that the P8 supports fusion of addis/addi,
but it does not. As a result, there is no advantage to restricting our peephole
application, merging addi instructions into dependent memory accesses, even
when the addi has multiple users, regardless of whether or not we're optimizing
for size.
We might need something like this again for the P9; I suspect we'll revisit
this code when we work on P9 tuning.
llvm-svn: 280440
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible
__builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently
broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is
lowered using:
ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)
where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86,
FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not
work for PowerPC. Because of the way that the stack layout works, the canonical
frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC
(there is a lower save-area offset as well), so it is not just a matter of
implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its
semantics -- We can do that, since it is currently used only for
@llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct
itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips
currently does this, but by using a custom lowering for ADD that specifically
recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern.
This change introduces a ISD::EH_DWARF_CFA node, which by default expands using
the existing logic, but can be directly lowered by the target. Mips is updated
to use this method (which simplifies its implementation, and I suspect makes it
more robust), and updates PowerPC to do the same.
Fixes PR26761.
Differential Revision: https://reviews.llvm.org/D24038
llvm-svn: 280350
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a function contains something, such as inline asm, which explicitly
clobbers the register used as the frame pointer, don't spill it twice. If we
need a frame pointer, it will be saved/restored in the prologue/epilogue code.
Explicitly spilling it again will reuse the same spill slot used by the
prologue/epilogue code, thus clobbering the saved value. The same applies
to the base-pointer or PIC-base register.
Partially fixes PR26856. Thanks to Ulrich for his analysis and the small
inline-asm reproducer.
llvm-svn: 280188
|
|
|
|
|
|
|
|
|
| |
Implement Bill's suggested fix for 32-bit targets for PR22711 (for the
alignment of each entry). As pointed out in the bug report, we could just force
the section alignment, since we only add pointer-sized things currently, but
this fix is somewhat more future-proof.
llvm-svn: 280049
|
|
|
|
|
|
|
|
|
|
|
| |
The "long call" option forces the use of the indirect calling sequence for all
calls (even those that don't really need it). GCC provides this option; This is
helpful, under certain circumstances, for building very-large binaries, and
some other specialized use cases.
Fixes PR19098.
llvm-svn: 280040
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For little-Endian PowerPC, we generally target only P8 and later by default.
However, generic (older) 64-bit configurations are still an option, and in that
case, partword atomics are not available (e.g. stbcx.). To lower i8/i16 atomics
without true i8/i16 atomic operations, we emulate using i32 atomics in
combination with a bunch of shifting and masking, etc. The amount by which to
shift in little-Endian mode is different from the amount in big-Endian mode (it
is inverted -- meaning we can leave off the xor when computing the amount).
Fixes PR22923.
llvm-svn: 280022
|
|
|
|
|
|
| |
Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818.
llvm-svn: 279933
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
compute it
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.
Differential Revision: http://reviews.llvm.org/D23850
llvm-svn: 279698
|
|
|
|
|
|
| |
General cleanup before starting to work on the part I want to actually change.
llvm-svn: 279586
|
|
|
|
| |
llvm-svn: 279409
|
|
|
|
| |
llvm-svn: 279408
|
|
|
|
|
|
|
|
|
|
| |
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.
Differential Revision: https://reviews.llvm.org/D23597
llvm-svn: 279129
|
|
|
|
|
|
|
| |
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.
llvm-svn: 278902
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a quick work around, because in some cases, e.g. caller's stack
size > callee's stack size, we are still able to apply sibling call
optimization even callee has any byval arg.
This patch fix: https://llvm.org/bugs/show_bug.cgi?id=28328
Reviewers: hfinkel kbarton nemanjai amehsan
Subscribers: hans, tjablin
https://reviews.llvm.org/D23441
llvm-svn: 278900
|
|
|
|
|
|
|
|
| |
Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets.
Differential Revision: https://reviews.llvm.org/D23445
llvm-svn: 278799
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: It triggers exponential behavior when the DAG has many branches.
Reviewers: hfinkel, kbarton
Subscribers: iteratee, nemanjai, echristo
Differential Revision: https://reviews.llvm.org/D23428
llvm-svn: 278548
|
|
|
|
|
|
| |
No functionality change is intended.
llvm-svn: 278475
|
|
|
|
|
|
| |
No functionality change is intended.
llvm-svn: 278417
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two locations where fast-isel would generate a LFD instruction
with a target register class VSFRC instead of F8RC when VSX was enabled.
This can ccause invalid registers to be used in certain cases, like:
lfd 36, ...
instead of using a VSX load instruction. The wrong register number gets
silently truncated, causing invalid code to be generated.
The first place is PPCFastISel::PPCEmitLoad, which had multiple problems:
1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they
are computed from resultReg, which is still zero at this point in many cases.
Fixed by changing the helper routines to operate on a register class instead
of a register and passing in UseRC.
2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo:
bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS;
bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD;
The second line needs to use isVSFRC (like PPCEmitStore does).
3.) Once both the above are fixed, we're now generating a VSX instruction --
but an incorrect one, since generation of an indexed instruction with null
index is wrong. Fixed by copying the code handling the same issue in
PPCEmitStore.
The second place is PPCFastISel::PPCMaterializeFP, where we would emit an
LFD to load a constant from the literal pool, and use the wrong result
register class. Fixed by hardcoding a F8RC class even on systems
supporting VSX.
Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630
Differential Revision: https://reviews.llvm.org/D22632
llvm-svn: 277823
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes passing long double type arguments to function in
soft float mode. If there is less than 4 argument registers free
(long double type is mapped in 4 gpr registers in soft float mode)
long double type argument must be passed through stack.
Differential Revision: https://reviews.llvm.org/D20114.
llvm-svn: 277804
|
|
|
|
|
|
|
|
| |
This patch fixes pr25548.
Current implementation of PPCBoolRetToInt doesn't handle CallInst correctly, so it failed to do the intended optimization when there is a CallInst with parameters. This patch fixed that.
llvm-svn: 277655
|
|
|
|
|
|
|
|
|
| |
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.
Differential Revision: https://reviews.llvm.org/D22885
llvm-svn: 277126
|
|
|
|
|
|
|
| |
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.
llvm-svn: 277017
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D22925
llvm-svn: 276997
|
|
|
|
|
|
| |
Fixes PR28731.
llvm-svn: 276865
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr* in the PowerPC backend, mainly by preferring MachineInstr&
over MachineInstr* when a pointer isn't nullable and using range-based
for loops.
There was one piece of questionable code in PPCInstrInfo::AnalyzeBranch,
where a condition checked a pointer converted from an iterator for
nullptr. Since this case is impossible (moreover, the code above
guarantees that the iterator is valid), I removed the check when I
changed the pointer to a reference.
Despite that case, there should be no functionality change here.
llvm-svn: 276864
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.
The corresponding change to clang is in: http://reviews.llvm.org/D16538
Patch by: Joel Jones
Differential Revision: https://reviews.llvm.org/D16213
llvm-svn: 276654
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
converting to FP
This patch corresponds to review:
https://reviews.llvm.org/D21354
We use direct moves for extracting integer elements from vectors. We also use
direct moves when converting integers to FP. When these operations are chained,
we get a direct move out of a VSR followed by a direct move back into a VSR.
These are redundant - all we need to do is line up the element and convert.
llvm-svn: 275796
|
|
|
|
|
|
| |
This fixes PR 28526.
llvm-svn: 275603
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).
Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.
This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted. It also greatly simplifies the process of adding another flag
to getLoad.
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits
Differential Revision: http://reviews.llvm.org/D22249
llvm-svn: 275592
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.
Reviewers: tstellarAMD, mcrosier
Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai
Differential Revision: https://reviews.llvm.org/D22409
llvm-svn: 275564
|
|
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D20239
It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that
are useful in some cases for inserting and extracting vector elements of
v4[if]32 vectors.
llvm-svn: 275215
|
|
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D21358
Vector shifts that have the same semantics as a vector swap are cannonicalized
as such to provide additional opportunities for swap removal optimization to
remove unnecessary swaps.
llvm-svn: 275168
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation
to generate jumps with 16-bit sized immediates in 16-bit mode.
This fixes PR22097.
Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight
Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D20830
llvm-svn: 275068
|
|
|
|
|
|
| |
ourselves via a call through the DAG.
llvm-svn: 274721
|
|
|
|
| |
llvm-svn: 274720
|
|
|
|
| |
llvm-svn: 274717
|
|
|
|
| |
llvm-svn: 274716
|
|
|
|
| |
llvm-svn: 274715
|
|
|
|
| |
llvm-svn: 274714
|
|
|
|
|
|
| |
called by it.
llvm-svn: 274711
|
|
|
|
|
|
| |
and remove the argument.
llvm-svn: 274710
|
|
|
|
| |
llvm-svn: 274709
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions.
In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed.
The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions.
This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735).
Test case based on the original problem reported.
Phabricator Review: http://reviews.llvm.org/D21802
llvm-svn: 274645
|
|
|
|
| |
llvm-svn: 274636
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D20443
It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 3, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch causes between minor and major improvements in performance on most
benchmarks. There are very few benchmarks whose performance regresses. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).
llvm-svn: 274535
|
|
|
|
|
|
|
|
| |
where possible.
No functionality change intended.
llvm-svn: 274431
|
|
|
|
|
|
|
|
|
|
| |
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator. One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.
llvm-svn: 274304
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr. In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
llvm-svn: 274287
|
|
|
|
|
|
|
| |
MC doesn't really care about CodeGen stuff, so this was just
complicating target initialization.
llvm-svn: 274258
|