| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 301101
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.
In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.
The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
return OptionalComplexRendererFn(
[=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.
As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.
Depends on D31418
Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar
Reviewed By: ab
Subscribers: dberris, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D31761
llvm-svn: 301079
|
| |
|
|
|
|
|
|
|
|
|
| |
The code assumed that when saving an additional CSR register
(ExtraCSSpill==true) we would have a free register throughout the
function. This was not true if this CSR register is also used to pass
values as in the swiftself case.
rdar://31451816
llvm-svn: 301057
|
| |
|
|
|
|
|
|
|
| |
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows. This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.
llvm-svn: 301047
|
| |
|
|
|
|
| |
This broke almost all bots. Reverting while fixing.
llvm-svn: 301041
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Empty functions can lead to duplicate entries in the Guard CF Function
Table of a binary due to multiple functions sharing the same RVA,
causing the kernel to refuse to load that binary.
We had a terrific bug due to this in Chromium.
It turns out we were already doing this for Mach-O in certain
situations. This patch expands the code for that in
AsmPrinter::EmitFunctionBody() and renames
TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it
seems it was used for not just Mach-O anyway.
Differential Revision: https://reviews.llvm.org/D32330
llvm-svn: 301040
|
| |
|
|
|
|
|
|
|
| |
Otherwise there's some mismatch, and we'll either form an illegal type or an
illegal node.
Thanks to Eli Friedman for pointing out the problem with my original solution.
llvm-svn: 301036
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D32363
llvm-svn: 301029
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D32361
llvm-svn: 301028
|
| |
|
|
|
|
|
|
| |
- We really ought to zero out lower 16 bits
Differential Revision: https://reviews.llvm.org/D32356
llvm-svn: 301026
|
| |
|
|
|
|
|
|
|
|
|
|
| |
SI_MASKED_UNREACHABLE does not have machine instruction encoding.
It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some
other pseudo instructions.
This patch fixes compilation failure of RadeonRays.
Differential Revision: https://reviews.llvm.org/D32364
llvm-svn: 301025
|
| |
|
|
|
|
|
|
|
|
|
| |
It seems we have on situation in a sanitizer enable bootstrap build
where the return instruction has a frame index operand that does not
point to a fixed object and fails the assert added here.
This reverts commit r300923.
This reverts commit r300922.
llvm-svn: 301024
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D32085
llvm-svn: 301023
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediate operands.
This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.
This recommits r300932 and r300930, which was causing dag-combine to
loop forever. The problem was that optimizeLogicalImm was returning
true even when there was no change to the immediate node (which happened
when the immediate was all zeros or ones), which caused dag-combine to
push and pop the same node to the work list over and over again without
making any progress.
This commit fixes the bug by returning false early in optimizeLogicalImm
if the immediate is all zeros or ones. Also, it changes the code to
compare the immediate with 0 or Mask rather than calling
countPopulation.
rdar://problem/18231627
Differential Revision: https://reviews.llvm.org/D5591
llvm-svn: 301019
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Factor out the common code used for generating addresses into common
templated functions that call overloaded versions of a new function,
getTargetNode.
Tested with make check-llvm with targets AArch64.
Differential Revision: https://reviews.llvm.org/D32169
llvm-svn: 301005
|
| |
|
|
|
|
|
|
| |
DAG combine was mistakenly assuming that the step-up it was looking at was
always a doubling, but it can sometimes be a larger extension in which case
we'd crash.
llvm-svn: 301002
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300993
|
| |
|
|
| |
llvm-svn: 300987
|
| |
|
|
|
|
|
|
|
|
|
| |
This revision documents the combination of C++ and table-gen code that
handles relocations and addresses.
Thanks for Simon Dardis for the careful reviews.
Differential Revision: https://reviews.llvm.org/D31628
llvm-svn: 300986
|
| |
|
|
| |
llvm-svn: 300984
|
| |
|
|
|
|
|
|
|
| |
predicates and support the equivalent in GIRule.
It's causing llvm-clang-x86_64-expensive-checks-win to fail to compile and I
haven't worked out why. Reverting to make it green while I figure it out.
llvm-svn: 300978
|
| |
|
|
| |
llvm-svn: 300976
|
| |
|
|
| |
llvm-svn: 300975
|
| |
|
|
| |
llvm-svn: 300974
|
| |
|
|
|
|
|
|
| |
Select them as copies. We only select if both the source and the
destination are on the same register bank, so this shouldn't cause any
trouble.
llvm-svn: 300971
|
| |
|
|
|
|
|
|
|
|
|
| |
The condition in isSupportedType didn't handle struct/array arguments
properly. Fix the check and add a test to make sure we use the fallback
path in this kind of situation. The test deals with some common cases
where the call lowering should error out. There are still some issues
here that need to be addressed (tail calls come to mind), but they can
be addressed in other patches.
llvm-svn: 300967
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).
Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.
Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab
Reviewed By: rovka
Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D31418
llvm-svn: 300964
|
| |
|
|
| |
llvm-svn: 300963
|
| |
|
|
| |
llvm-svn: 300960
|
| |
|
|
| |
llvm-svn: 300959
|
| |
|
|
|
|
|
|
|
|
|
|
| |
when the subtarget has fast strings.
This has two advantages:
- Speed is improved. For example, on Haswell thoughput improvements increase
linearly with size from 256 to 512 bytes, after which they plateau:
(e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes).
- Code is much smaller (no need to handle boundaries).
llvm-svn: 300957
|
| |
|
|
| |
llvm-svn: 300952
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
`Uses = [CPSR]`
Summary: Thanks to Oliver Stannard for helping catch this.
Reviewers: olista01, efriedma
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D31815
llvm-svn: 300951
|
| |
|
|
|
|
|
|
|
| |
It seems that r300930 was creating an infinite loop in dag-combine when
compling the following file:
MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c
llvm-svn: 300940
|
| |
|
|
| |
llvm-svn: 300932
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediate operands.
This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.
This recommits r300913, which broke bots because I didn't fix a call to
ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of
TargetLoweringOpt and TargetLowering.
rdar://problem/18231627
Differential Revision: https://reviews.llvm.org/D5591
llvm-svn: 300930
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
X86RegisterInfo::eliminateFrameIndex() and
X86FrameLowering::getFrameIndexReference() both had logic to compute the
base register. This consolidates the code.
Also use MachineInstr::isReturn instead of manually enumerating tail
call instructions (return instructions were not included in the previous
list because they never reference frame indexes).
Differential Revision: https://reviews.llvm.org/D32206
llvm-svn: 300923
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
AfterFPPop is used for tailcall/tailjump instructions. We shouldn't ever
have frame-pointer/base-pointer relative addressing for those. After all
the frame/base pointer should already be restored to their previous
values at the return.
Make this fact explicit in preparation for an upcoming refactoring.
Differential Revision: https://reviews.llvm.org/D32205
llvm-svn: 300922
|
| |
|
|
|
|
|
|
| |
This reverts r300913.
This broke bots.
llvm-svn: 300916
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediate operands.
This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.
rdar://problem/18231627
Differential Revision: https://reviews.llvm.org/D5591
llvm-svn: 300913
|
| |
|
|
|
|
|
|
| |
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.
llvm-svn: 300905
|
| |
|
|
|
|
|
|
| |
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.
llvm-svn: 300904
|
| |
|
|
| |
llvm-svn: 300893
|
| |
|
|
| |
llvm-svn: 300892
|
| |
|
|
|
|
|
|
|
|
|
| |
Before, we assumed that any ConstantInt offset was precisely the access width,
so we could use the "[rN]!" form. ISelLowering only ever created that kind, but
further simplification during combining could lead to unexpected constants and
incorrect codegen.
Should fix PR32658.
llvm-svn: 300878
|
| |
|
|
| |
llvm-svn: 300871
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When synthesized TBB/TBH is expanded, we need to avoid the case of:
BaseReg is redefined after the load of branching target. E.g.:
%R2 = tLEApcrelJT <jt#1>
%R1 = tLDRr %R1, %R2 ==> %R2 = tLEApcrelJT <jt#1>
%R2 = tLDRspi %SP, 12 %R2 = tLDRspi %SP, 12
tBR_JTr %R1 tTBB_JT %R2, %R1
`
Reviewers: jmolloy
Reviewed By: jmolloy
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D32250
llvm-svn: 300870
|
| |
|
|
|
|
|
|
| |
This will become asan errors once the patch lands that poisons the
memory after free. The x86 change is a hack, but I don't see how to
solve this properly at the moment.
llvm-svn: 300867
|
| |
|
|
|
|
|
|
| |
Subscribers: jfb, dschuff
Differential Revision: https://reviews.llvm.org/D32300
llvm-svn: 300859
|
| |
|
|
|
|
|
|
| |
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.
Differential Revision: https://reviews.llvm.org/D32108
llvm-svn: 300856
|