| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
legalization.
If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values.
Patch by Jatin Bhateja and Eli Friedman.
Differential Revision: https://reviews.llvm.org/D34240
llvm-svn: 307207
|
|
|
|
| |
llvm-svn: 307184
|
|
|
|
|
|
|
|
|
|
| |
into one with combined shift operand.
For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize.
Differential revision: https://reviews.llvm.org/D12833
llvm-svn: 307179
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
matcher and emitter state machines.
Summary:
Also, made a few minor tweaks to shave off a little more cumulative memory consumption:
* All rules share a single NewMIs instead of constructing their own. Only one
will end up using it.
* Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent
GIM_RecordInsn from changing MIs[0].
Depends on D33764
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33766
llvm-svn: 307159
|
|
|
|
|
|
|
|
|
|
| |
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.
llvm-svn: 307149
|
|
|
|
| |
llvm-svn: 307144
|
|
|
|
|
|
| |
This isn't used anywhere yet, but I need it for a future commit.
llvm-svn: 307141
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Otherwise the fallback path fails with an assertion on x86_64 targets,
when "x86_fp80" is encountered.
Reviewers: t.p.northover, zvi, guyblank
Reviewed By: zvi
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34975
llvm-svn: 307140
|
|
|
|
|
|
|
| |
Add a helper for building simple binary ops like add, mul, sub, and.
This can be used in the future for quickly adding support for or, xor.
llvm-svn: 307139
|
|
|
|
|
|
| |
after r307133
llvm-svn: 307138
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
matcher.
Summary:
This further improves the compile-time regressions that will be caused by a
re-commit of r303259.
Also added included preliminary work in preparation for the multi-insn emitter
since I needed to change the relevant part of the API for this patch anyway.
Depends on D33758
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33764
llvm-svn: 307133
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relanding after rewriting undef.ll test to avoid host-dependant
endianness.
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an
additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
case we already do not handle well. Here, the DAG is improved, but
scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 307114
|
|
|
|
| |
llvm-svn: 307094
|
|
|
|
|
|
| |
function's begin. NFC precommit for D12833.
llvm-svn: 307091
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We are crashing in LLC at O0 when gc intrinsics are present in the block.
The reason being FastISel performs basic block ISel by modifying GC.relocates
to be the first instruction in the block. This can cause us to visit the GC
relocate before it's corresponding GC.statepoint is visited, which is incorrect.
When we lower the statepoint, we record the base and derived pointers, along
with the gc.relocates. After this we can visit the gc.relocate.
This patch avoids fastISel from incorrectly creating the block with gc.relocate
as the first instruction.
Reviewers: qcolombet, skatkov, qikon, reames
Reviewed by: skatkov
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34421
llvm-svn: 307084
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
|
|
|
|
|
|
| |
addresses are comparable. NFCI.
llvm-svn: 307055
|
|
|
|
|
|
|
|
| |
The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265.
Differential Revision: https://reviews.llvm.org/D31946
llvm-svn: 307053
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add a combine for creating a truncate to replace a build_vector composed of extracts with
indices that form a stride-2^N series.
Example:
v8i32 V = ...
v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6))
-->
v4i32 truncate (bitcast V to v4i64)
Related discussion in llvm-dev about canonicalizing shuffles to
truncates in LLVM IR:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html.
Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena
Reviewed By: delena
Subscribers: guyblank, delena, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D34077
llvm-svn: 307036
|
|
|
|
| |
llvm-svn: 307004
|
|
|
|
|
|
| |
prevent a crash if the type isn't a simple VT.
llvm-svn: 306950
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
removePartialredunduncy and in WorkList
Summary:
removePartialRedundency optimization introduces a state in the
RegisterCoalescer where an instruction pointed to in the WorkList
is deleted from the MBB and then removed from the ErasedList.
This patch updates the ErasedList to be used globally by not erasing
erased Instructions from it to solve the problem.
The patch also accounts for the case where an Instruction was previously
deleted and the same memory was reused by BuildMI to create a new instruction.
Reviewers: kparzysz, qcolombet
Reviewed By: qcolombet
Subscribers: MatzeB, qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D34902
llvm-svn: 306915
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add an option to prevent diagnostics that do not meet a minimum hotness
threshold from being output. When generating optimization remarks for
large codebases with a ton of cold code paths, this option can be used
to limit the optimization remark output at a reasonable size. Discussion of
this change can be read here:
http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html
Reviewers: anemet, davidxl, hfinkel
Reviewed By: anemet
Subscribers: qcolombet, javed.absar, fhahn, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D34867
llvm-svn: 306912
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the instructions at the beginning of the block have no location,
we're better off using the location of the first instruction in the
current basic block. At the very least, that instruction post-dominates
this one, whereas if we don't emit a .cv_loc directive, we end up using
the potentially invalid location that falls through from the previous
block.
We could probably do better here by emitting some kind of ".cv_loc end"
directive that stops the line table entry of the previous .cv_loc
directive from bleeding out of its basic block. This would improve the
line table when an entire MBB has no valid location info.
llvm-svn: 306889
|
|
|
|
|
|
|
|
|
| |
It looks like there are two target-independent but not GISel instructions that
need legalization, IMPLICIT_DEF and PHI. These are already anomalies since
their operands have important LLTs attached, so to make things more uniform it
seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF.
llvm-svn: 306875
|
|
|
|
|
|
|
| |
I'm tired of seeing this:
.globl "?Test@@YAXXZ" # -- Begin function ^A?Test@@YAXXZ
llvm-svn: 306855
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
To enable profile hotness information in diagnostics output, Clang takes
the option `-fdiagnostics-show-hotness` -- that's "diagnostics", with an
"s" at the end. Clang also defines `CodeGenOptions::DiagnosticsWithHotness`.
LLVM, on the other hand, defines
`LLVMContext::getDiagnosticHotnessRequested` -- that's "diagnostic", not
"diagnostics". It's a small difference, but it's confusing, typo-inducing, and
frustrating.
Add a new method with the spelling "diagnostics", and "deprecate" the
old spelling.
Reviewers: anemet, davidxl
Reviewed By: anemet
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D34864
llvm-svn: 306848
|
|
|
|
|
|
|
| |
This reverts commit r306819 which appears be exposing underlying
issues in a stage1 ppc64be build
llvm-svn: 306820
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an
additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
case we already do not handle well. Here, the DAG is improved, but
scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 306819
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.
This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.
Differential Revision: https://reviews.llvm.org/D32529
llvm-svn: 306806
|
|
|
|
|
|
|
|
| |
Reviewer: dblaikie
Differential revision: https://reviews.llvm.org/D34765
llvm-svn: 306771
|
|
|
|
|
|
| |
https://reviews.llvm.org/D34837
llvm-svn: 306766
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If there is a chain of instructions formulating a recurrence, commuting operands can help removing a redundant copy. In the following example code,
```
BB#1: ; Loop Header
%vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
...
BB#6: ; Loop Latch
%vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
%vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0
%vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10
CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
%vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
JL_1 <BB#1>, %EFLAGS<imp-use,kill>
```
Existing two-address generation pass generates following code:
```
BB#1:
%vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
...
BB#6:
Predecessors according to CFG: BB#5 BB#4
%vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
%vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1
%vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0
%vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
%vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
%vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
JL_1 <BB#1>, %EFLAGS<imp-use,kill>
JMP_1 <BB#7>
```
This is suboptimal because the assembly code generated has a redundant copy at the end of #BB6 to feed %vreg13 to BB#1:
```
.LBB0_6:
addl %esi, %edi
addl %ebx, %edi
cmpl $10, %edi
movl %edi, %esi
jl .LBB0_1
```
This redundant copy can be elimiated by making instructions in the recurrence chain to compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10. With that change, code after two-address generation becomes
```
BB#1:
%vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
...
BB#6: derived from LLVM BB %bb7
Predecessors according to CFG: BB#5 BB#4
%vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
%vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0
%vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1
%vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
%vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
%vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
JL_1 <BB#1>, %EFLAGS<imp-use,kill>
JMP_1 <BB#7>
```
and the final assembly does not have redundant copy:
```
.LBB0_6:
addl %edi, %eax
addl %ebx, %eax
cmpl $10, %eax
jl .LBB0_1
```
Reviewers: qcolombet, MatzeB, wmi
Reviewed By: wmi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31821
llvm-svn: 306758
|
|
|
|
|
|
|
| |
This reverts commit r305389. This broke chromium builds, so reverting
while I investigate further.
llvm-svn: 306741
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Arguably non-integral pointers probably shouldn't show up here at all,
but since the backend doesn't complain and this takes valid (according
to the Verifier) IR and makes it invalid, make sure not to introduce
any inttoptr instructions if we're dealing with non-integral pointers.
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D33110
llvm-svn: 306737
|
|
|
|
| |
llvm-svn: 306716
|
|
|
|
|
|
|
|
|
|
|
| |
Relanding after restricting equalBaseIndex to not erroneuosly consider
a FrameIndices stemming from alloca from being comparable as its
offset is set post-selectionDAG.
Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.
llvm-svn: 306688
|
|
|
|
|
|
|
|
|
|
| |
I am 99% sure that this breaks the PPC ASAN build bot:
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio
If it doesn't go back to green, we can recommit (and fix the original
commit message at the same time :) ).
llvm-svn: 306676
|
|
|
|
|
|
|
|
|
|
|
| |
Given no NaNs and no signed zeroes it folds:
(fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X))
(fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X)
Differential Revision: https://reviews.llvm.org/D34579
llvm-svn: 306592
|
|
|
|
| |
llvm-svn: 306557
|
|
|
|
| |
llvm-svn: 306553
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.
Majority of the changes in this patch:
1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.
These changes are target independent and described below.
Changed CFI instructions so that they:
1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal
Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.
Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D18046
llvm-svn: 306529
|
|
|
|
|
|
| |
This reverts commit r306498 which appears to cause a compilrt-rt test failures
llvm-svn: 306501
|
|
|
|
|
|
|
|
|
|
|
| |
That is pretty common for clang to produce code like
(shl %x, (and %amt, 31)). In this situation we can still perform
trunc (shl) into shl (trunc) conversion given the known value
range of shift amount.
Differential Revision: https://reviews.llvm.org/D34723
llvm-svn: 306499
|
|
|
|
|
|
|
| |
Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.
llvm-svn: 306498
|
|
|
|
| |
llvm-svn: 306485
|
|
|
|
| |
llvm-svn: 306481
|
|
|
|
|
|
|
| |
Also add IRTranslator support.
https://reviews.llvm.org/D34710
llvm-svn: 306475
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted in D34071, there are some IR optimization opportunities that could be
handled by normal IR passes if this expansion wasn't happening so late in CGP.
Regardless of that, it seems wasteful to knowingly produce suboptimal IR here,
so I'm proposing this change:
%s = sub i32 %x, %y
%r = icmp ne %s, 0
=>
%r = icmp ne %x, %y
Changing the predicate to 'eq' mimics what InstCombine would do, so that's just
an efficiency improvement if we decide this expansion should happen sooner.
The fact that the PowerPC backend doesn't eliminate the 'subf.' might be
something for PPC folks to investigate separately.
Differential Revision: https://reviews.llvm.org/D34416
llvm-svn: 306471
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.
At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.
llvm-svn: 306470
|