| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 307583
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.
Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.
Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.
The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.
Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand
Reviewed By: rnk
Subscribers: sdardis, nemanjai, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33345
llvm-svn: 307546
|
| |
|
|
| |
llvm-svn: 307533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
WidenVSELECTAndMask can fold (and it folds in this case) so we
get a BUILD_VECTOR of constants as mask. convertMask() seems to
work fine when the input is a vector of constants, and we still
need to call it to extend/add elements at the end. but the current
code just asserts on anything but a SETCC or AND/OR/XOR of 2xSETCC.
This change was discussed briefly with Simon Pilgrim, who also
suggests we might consider dropping this assertion in the future.
Fixes PR33715.
llvm-svn: 307508
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue.
Fixes PR#33094.
Submitted on behalf of @Praetonus (Benoit Vey)
Differential Revision: https://reviews.llvm.org/D34538
llvm-svn: 307502
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: FastISel was marked as failed in case instruction selection succeeded.
Reviewers: qcolombet, zvi, rovka, ab
Reviewed By: zvi
Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits
Differential Revision: https://reviews.llvm.org/D34438
llvm-svn: 307489
|
| |
|
|
|
|
| |
sucessor -> successor
llvm-svn: 307488
|
| |
|
|
| |
llvm-svn: 307429
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When reusing a register for a new definition, the fast register allocator
used to insert a kill flag at the previous last use of that register to
inform later passes that this register is free between the redef and the
last use. However, this may be wrong when subregisters are involved.
Indeed, a partially redef would have trigger a kill of the full super
register, potentially wrongly marking all the other subregisters as
free. Given we don't track which lanes are still live, we cannot set the
kill flag in such case.
Note: This bug has been latent for about 7 years (r104056).
llvmg.org/PR33677
llvm-svn: 307428
|
| |
|
|
|
|
| |
NFC
llvm-svn: 307427
|
| |
|
|
|
|
|
|
|
| |
When scavenging for a use in instruction MI, we will reload after
that instruction and hence cannot spill uses/defs of this instruction.
This fixes http://llvm.org/PR33687
llvm-svn: 307352
|
| |
|
|
|
|
|
|
|
| |
Contrary to the stepForward()/stepBackward() method accumulate() doesn't
have a direction as defs, uses and clobbers all have the same effect.
Also improve the documentation comment.
llvm-svn: 307351
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Added MachineVerifier code to check register ties more thoroughly, especially so that physical registers that are tied are the same. This may help e.g. when creating MIR files.
Original patch by Jesper Antonsson
Reviewers: stoklund, sanjoy, qcolombet
Reviewed By: qcolombet
Subscribers: qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D34394
llvm-svn: 307259
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
During remat, some subranges might end up having invalid segments which caused problems for later
coalescing.
Added in a check to remove segments that are invalidated as part of the remat.
See http://llvm.org/PR33524
Subscribers: MatzeB, qcolombet
Differential Revision: https://reviews.llvm.org/D34391
llvm-svn: 307247
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This covers both hard and soft float.
Hard float is easy, since it's just Legal.
Soft float is more involved, because there are several different ways to
handle it based on the predicate: one and ueq need not only one, but two
libcalls to get a result. Furthermore, we have large differences between
the values returned by the AEABI and GNU functions.
AEABI functions return a nice 1 or 0 representing true and respectively
false. GNU functions generally return a value that needs to be compared
against 0 (e.g. for ogt, the value returned by the libcall is > 0 for
true). We could introduce redundant comparisons for AEABI as well, but
they don't seem easy to remove afterwards, so we do different processing
based on whether or not the result really needs to be compared against
something (and just truncate if it doesn't).
llvm-svn: 307243
|
| |
|
|
|
|
|
|
|
|
|
|
| |
legalization.
If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values.
Patch by Jatin Bhateja and Eli Friedman.
Differential Revision: https://reviews.llvm.org/D34240
llvm-svn: 307207
|
| |
|
|
| |
llvm-svn: 307184
|
| |
|
|
|
|
|
|
|
|
| |
into one with combined shift operand.
For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize.
Differential revision: https://reviews.llvm.org/D12833
llvm-svn: 307179
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
matcher and emitter state machines.
Summary:
Also, made a few minor tweaks to shave off a little more cumulative memory consumption:
* All rules share a single NewMIs instead of constructing their own. Only one
will end up using it.
* Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent
GIM_RecordInsn from changing MIs[0].
Depends on D33764
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33766
llvm-svn: 307159
|
| |
|
|
|
|
|
|
|
|
| |
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.
llvm-svn: 307149
|
| |
|
|
| |
llvm-svn: 307144
|
| |
|
|
|
|
| |
This isn't used anywhere yet, but I need it for a future commit.
llvm-svn: 307141
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Otherwise the fallback path fails with an assertion on x86_64 targets,
when "x86_fp80" is encountered.
Reviewers: t.p.northover, zvi, guyblank
Reviewed By: zvi
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34975
llvm-svn: 307140
|
| |
|
|
|
|
|
| |
Add a helper for building simple binary ops like add, mul, sub, and.
This can be used in the future for quickly adding support for or, xor.
llvm-svn: 307139
|
| |
|
|
|
|
| |
after r307133
llvm-svn: 307138
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
matcher.
Summary:
This further improves the compile-time regressions that will be caused by a
re-commit of r303259.
Also added included preliminary work in preparation for the multi-insn emitter
since I needed to change the relevant part of the API for this patch anyway.
Depends on D33758
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33764
llvm-svn: 307133
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relanding after rewriting undef.ll test to avoid host-dependant
endianness.
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an
additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
case we already do not handle well. Here, the DAG is improved, but
scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 307114
|
| |
|
|
| |
llvm-svn: 307094
|
| |
|
|
|
|
| |
function's begin. NFC precommit for D12833.
llvm-svn: 307091
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We are crashing in LLC at O0 when gc intrinsics are present in the block.
The reason being FastISel performs basic block ISel by modifying GC.relocates
to be the first instruction in the block. This can cause us to visit the GC
relocate before it's corresponding GC.statepoint is visited, which is incorrect.
When we lower the statepoint, we record the base and derived pointers, along
with the gc.relocates. After this we can visit the gc.relocate.
This patch avoids fastISel from incorrectly creating the block with gc.relocate
as the first instruction.
Reviewers: qcolombet, skatkov, qikon, reames
Reviewed by: skatkov
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34421
llvm-svn: 307084
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
llvm-svn: 307079
|
| |
|
|
|
|
| |
addresses are comparable. NFCI.
llvm-svn: 307055
|
| |
|
|
|
|
|
|
| |
The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265.
Differential Revision: https://reviews.llvm.org/D31946
llvm-svn: 307053
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add a combine for creating a truncate to replace a build_vector composed of extracts with
indices that form a stride-2^N series.
Example:
v8i32 V = ...
v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6))
-->
v4i32 truncate (bitcast V to v4i64)
Related discussion in llvm-dev about canonicalizing shuffles to
truncates in LLVM IR:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html.
Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena
Reviewed By: delena
Subscribers: guyblank, delena, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D34077
llvm-svn: 307036
|
| |
|
|
| |
llvm-svn: 307004
|
| |
|
|
|
|
| |
prevent a crash if the type isn't a simple VT.
llvm-svn: 306950
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
removePartialredunduncy and in WorkList
Summary:
removePartialRedundency optimization introduces a state in the
RegisterCoalescer where an instruction pointed to in the WorkList
is deleted from the MBB and then removed from the ErasedList.
This patch updates the ErasedList to be used globally by not erasing
erased Instructions from it to solve the problem.
The patch also accounts for the case where an Instruction was previously
deleted and the same memory was reused by BuildMI to create a new instruction.
Reviewers: kparzysz, qcolombet
Reviewed By: qcolombet
Subscribers: MatzeB, qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D34902
llvm-svn: 306915
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add an option to prevent diagnostics that do not meet a minimum hotness
threshold from being output. When generating optimization remarks for
large codebases with a ton of cold code paths, this option can be used
to limit the optimization remark output at a reasonable size. Discussion of
this change can be read here:
http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html
Reviewers: anemet, davidxl, hfinkel
Reviewed By: anemet
Subscribers: qcolombet, javed.absar, fhahn, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D34867
llvm-svn: 306912
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the instructions at the beginning of the block have no location,
we're better off using the location of the first instruction in the
current basic block. At the very least, that instruction post-dominates
this one, whereas if we don't emit a .cv_loc directive, we end up using
the potentially invalid location that falls through from the previous
block.
We could probably do better here by emitting some kind of ".cv_loc end"
directive that stops the line table entry of the previous .cv_loc
directive from bleeding out of its basic block. This would improve the
line table when an entire MBB has no valid location info.
llvm-svn: 306889
|
| |
|
|
|
|
|
|
|
| |
It looks like there are two target-independent but not GISel instructions that
need legalization, IMPLICIT_DEF and PHI. These are already anomalies since
their operands have important LLTs attached, so to make things more uniform it
seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF.
llvm-svn: 306875
|
| |
|
|
|
|
|
| |
I'm tired of seeing this:
.globl "?Test@@YAXXZ" # -- Begin function ^A?Test@@YAXXZ
llvm-svn: 306855
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
To enable profile hotness information in diagnostics output, Clang takes
the option `-fdiagnostics-show-hotness` -- that's "diagnostics", with an
"s" at the end. Clang also defines `CodeGenOptions::DiagnosticsWithHotness`.
LLVM, on the other hand, defines
`LLVMContext::getDiagnosticHotnessRequested` -- that's "diagnostic", not
"diagnostics". It's a small difference, but it's confusing, typo-inducing, and
frustrating.
Add a new method with the spelling "diagnostics", and "deprecate" the
old spelling.
Reviewers: anemet, davidxl
Reviewed By: anemet
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D34864
llvm-svn: 306848
|
| |
|
|
|
|
|
| |
This reverts commit r306819 which appears be exposing underlying
issues in a stage1 ppc64be build
llvm-svn: 306820
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an
additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a
case we already do not handle well. Here, the DAG is improved, but
scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 306819
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.
This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.
Differential Revision: https://reviews.llvm.org/D32529
llvm-svn: 306806
|
| |
|
|
|
|
|
|
| |
Reviewer: dblaikie
Differential revision: https://reviews.llvm.org/D34765
llvm-svn: 306771
|
| |
|
|
|
|
| |
https://reviews.llvm.org/D34837
llvm-svn: 306766
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If there is a chain of instructions formulating a recurrence, commuting operands can help removing a redundant copy. In the following example code,
```
BB#1: ; Loop Header
%vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
...
BB#6: ; Loop Latch
%vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
%vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0
%vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10
CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
%vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
JL_1 <BB#1>, %EFLAGS<imp-use,kill>
```
Existing two-address generation pass generates following code:
```
BB#1:
%vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
...
BB#6:
Predecessors according to CFG: BB#5 BB#4
%vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
%vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1
%vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0
%vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
%vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
%vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
JL_1 <BB#1>, %EFLAGS<imp-use,kill>
JMP_1 <BB#7>
```
This is suboptimal because the assembly code generated has a redundant copy at the end of #BB6 to feed %vreg13 to BB#1:
```
.LBB0_6:
addl %esi, %edi
addl %ebx, %edi
cmpl $10, %edi
movl %edi, %esi
jl .LBB0_1
```
This redundant copy can be elimiated by making instructions in the recurrence chain to compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10. With that change, code after two-address generation becomes
```
BB#1:
%vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
...
BB#6: derived from LLVM BB %bb7
Predecessors according to CFG: BB#5 BB#4
%vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
%vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0
%vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1
%vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
%vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
%vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
JL_1 <BB#1>, %EFLAGS<imp-use,kill>
JMP_1 <BB#7>
```
and the final assembly does not have redundant copy:
```
.LBB0_6:
addl %edi, %eax
addl %ebx, %eax
cmpl $10, %eax
jl .LBB0_1
```
Reviewers: qcolombet, MatzeB, wmi
Reviewed By: wmi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31821
llvm-svn: 306758
|
| |
|
|
|
|
|
| |
This reverts commit r305389. This broke chromium builds, so reverting
while I investigate further.
llvm-svn: 306741
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Arguably non-integral pointers probably shouldn't show up here at all,
but since the backend doesn't complain and this takes valid (according
to the Verifier) IR and makes it invalid, make sure not to introduce
any inttoptr instructions if we're dealing with non-integral pointers.
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D33110
llvm-svn: 306737
|