| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a zero register.
Previously I tried this and saw LLVM unable to transform this to fold
with memory operands such as spill slot rematerialization. However, it
clearly works as shown in this patch. We turn these into `cmpb $0,
<mem>` when useful for folding a memory operand without issue. This form
has no disadvantage compared to `testb $-1, <mem>`. So overall, this is
likely no worse and may be slightly smaller in some cases due to the
`testb %reg, %reg` form.
Differential Revision: https://reviews.llvm.org/D45475
llvm-svn: 330269
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
across basic blocks in the limited cases where it is very straight
forward to do so.
This will also be useful for other places where we do some limited
EFLAGS propagation across CFG edges and need to handle copy rewrites
afterward. I think this is rapidly approaching the maximum we can and
should be doing here. Everything else begins to require either heroic
analysis to prove how to do PHI insertion manually, or somehow managing
arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these
seem at all promising so if those cases come up, we'll almost certainly
need to rewrite the parts of LLVM that produce those patterns.
We do now require dominator trees in order to reliably diagnose patterns
that would require PHI nodes. This is a bit unfortunate but it seems
better than the completely mysterious crash we would get otherwise.
Differential Revision: https://reviews.llvm.org/D45673
llvm-svn: 330264
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A change to use divergence analysis in the AMDGPU backend was getting formal
arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or
VGPR2
For graphics shaders it is possible to have more than these passed in as VGPR
Modified the checking code to check for any VGPR registers passed in as formal
arguments.
Also, some intrinsics that are sources of divergence may have been lowered
during instruction selection and are missed on subsequent calls to
isSDNodeSourceOfDivergence - added the relevant AMDGPUISD checks as well.
Finally, the FunctionLoweringInfo tracks virtual registers that are live across
basic block boundaries. This is used to check for divergence of CopyFromRegister
registers using the DivergenceAnalysis analysis. For multiple blocks the lazily
evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45372
Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3
llvm-svn: 330257
|
| |
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D45727
llvm-svn: 330253
|
| |
|
|
| |
llvm-svn: 330239
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously if a modifer was placed on a non-GPR register class we would hit an assert or crash.
Reviewers: echristo
Reviewed By: echristo
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D45751
llvm-svn: 330238
|
| |
|
|
|
|
|
|
| |
Literal encoding needs op_sel_hi to select low 16 bit in this case.
Differential Revision: https://reviews.llvm.org/D45745
llvm-svn: 330230
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implementation follows the MIPS backend and expands the
pseudo instruction directly during asm parsing. As the result, only
real MC instructions are emitted to the MCStreamer. Additionally,
PseudoLI instructions are emitted during codegen. The actual
expansion to real instructions is performed during MI to MC lowering
and is similar to the expansion performed by the GNU Assembler.
Differential Revision: https://reviews.llvm.org/D41949
Patch by Mario Werner.
llvm-svn: 330224
|
| |
|
|
|
|
|
|
|
| |
When we skip bitcasts while looking for GEP in LoadSoreVectorizer
we should also verify that the type is sized otherwise we assert
Differential Revision: https://reviews.llvm.org/D45709
llvm-svn: 330221
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add an LLVM intrinsic for type discriminated event logging with XRay.
Similar to the existing intrinsic for custom events, but also accepts
a type tag argument to allow plugins to be aware of different types
and semantically interpret logged events they know about without
choking on those they don't.
Relies on a symbol defined in compiler-rt patch D43668. I may wait
to submit before I can see demo everything working together including
a still to come clang patch.
Reviewers: dberris, pelikan, eizan, rSerge, timshen
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45633
llvm-svn: 330219
|
| |
|
|
|
|
| |
Fixes PR36564.
llvm-svn: 330215
|
| |
|
|
| |
llvm-svn: 330204
|
| |
|
|
|
|
| |
mcpu exposes other tuning flags. These tests are only trying to test instruction set features so it is better to use mattr.
llvm-svn: 330196
|
| |
|
|
|
|
|
|
|
| |
(take 2)"
Revert in order to fix the test to not run when required targets aren't
configured.
llvm-svn: 330193
|
| |
|
|
|
|
|
|
| |
Add the accidentally omitted testcase.
Differential revision: https://reviews.llvm.org/D45524
llvm-svn: 330192
|
| |
|
|
|
|
|
|
|
| |
Stack addressing needs addressing modes that provide an offset field
immediately following the frame index. An initializer from a non-stack
addressing could force the stack address to use a form that does not
provide an offset field.
llvm-svn: 330191
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Duplicating this intrinsic is not generally valid because it has the side-effect
of decrementing the CTR. Any passes that duplicate it would need to be taught to
keep the regions formed completely disjoint.
This patch should be NFC for typical uses as CTRLoops runs after the remaining
loop passes. It only affects situations where the loop passes are scheduled on
the IR after the codegen passes (as is the case with some JIT pipelines).
Fixes https://bugs.llvm.org/show_bug.cgi?id=37050
llvm-svn: 330186
|
| |
|
|
|
|
|
|
| |
Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd
Differential Revision: https://reviews.llvm.org/D45656
llvm-svn: 330179
|
| |
|
|
| |
llvm-svn: 330178
|
| |
|
|
| |
llvm-svn: 330115
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Transforms the following:
%vreg1234:gpr32 = COPY %42
%vreg1235:gpr32 = COPY %vreg1234
%vreg1236:gpr32 = COPY %vreg1235
$w0 = COPY %vreg1236
into:
$w0 = COPY %42
Assuming %42 is also a gpr32
llvm-svn: 330113
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using Goldmont's cost tables for these two upcoming
atom archs.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45612
llvm-svn: 330109
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is a transform that I limited in instcombine in rL329821 because it was
creating more instructions in IR when the cast has multiple uses.
But if the cast is free, then we can do the transform regardless of other
uses because it improves the potential throughput of the calculation by
removing a dependency on the fneg.
Differential Revision: https://reviews.llvm.org/D45598
llvm-svn: 330098
|
| |
|
|
|
|
| |
One more test for smax(smin(x, C2), C1) pattern.
llvm-svn: 330090
|
| |
|
|
| |
llvm-svn: 330088
|
| |
|
|
|
|
| |
classes as SSE/AVX
llvm-svn: 330085
|
| |
|
|
|
|
|
| |
Debugability is more important than saving 4 bytes to let us to fall
through to nonense.
llvm-svn: 330073
|
| |
|
|
|
|
|
|
|
|
| |
addresses."
Caused a hang and eventually an assertion failure in LTO builds
of 7zip-benchmark on aarch64 iOS targets.
http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/
llvm-svn: 330063
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the MIPS backend would alwyas break down constant multiplications
into a series of shifts, adds, and subs. This patch changes that so the cost of
doing so is estimated.
The cost is estimated against worst case constant materialization and retrieving
the results from the HI/LO registers.
For cases where the value type of the multiplication is not legal, the cost of
legalization is estimated and is accounted for before performing the
optimization of breaking down the constant
This resolves PR36884.
Thanks to npl for reporting the issue!
Reviewers: abeserminji, smaksimovic
Differential Revision: https://reviews.llvm.org/D45316
llvm-svn: 330037
|
| |
|
|
|
|
|
|
|
| |
This adds code generation support for the FP16 vmaxnm/vminnm scalar
instructions.
Differential Revision: https://reviews.llvm.org/D44675
llvm-svn: 330034
|
| |
|
|
|
|
| |
WriteFAdd
llvm-svn: 330023
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change extend the register dependency check for implicit operands in Copy instructions.
Fixes PR36902.
Reviewers: thegameg, sebpop, uweigand, jnspaulsson, gberry, mcrosier, qcolombet, MatzeB
Reviewed By: thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44958
llvm-svn: 330018
|
| |
|
|
|
|
|
|
| |
instruction
Differential Revision: https://reviews.llvm.org/D45514
llvm-svn: 330011
|
| |
|
|
|
|
| |
"the the" -> "the", "we we" -> "we", etc
llvm-svn: 330006
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hint to hardware to move the cache line containing the
address to a more distant level of the cache without
writing back to memory.
Reviewers: craig.topper, zvi
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D45256
llvm-svn: 329992
|
| |
|
|
|
|
|
|
| |
This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics.
We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well.
llvm-svn: 329990
|
| |
|
|
|
|
|
|
|
| |
This is a test for a transform that was suggested in the post-commit
mailing list thread for rL329821. The target in question is not in
trunk, so PPC gets to stand in for it because it's the only in-tree
target that sets 'isFPExtFree()' to 'true'.
llvm-svn: 329963
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a code size win in code that takes offseted addresses
frequently, such as C++ constructors that typically need to compute
an offseted address of a vtable. This reduces the size of Chromium
for Android's .text section by 108KB.
Differential Revision: https://reviews.llvm.org/D45199
llvm-svn: 329956
|
| |
|
|
|
|
|
|
|
|
|
|
| |
A previously missing intrinsic for an old instruction.
Reviewers: craig.topper, echristo
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D45312
llvm-svn: 329936
|
| |
|
|
|
|
|
|
|
|
|
| |
Legalize and emit code for:
* xscvsdqp
* xscvudqp
Differential Revision: https://reviews.llvm.org/D45230
llvm-svn: 329931
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
AFI->setRedZone(false) was put in the wrong place before, and so it only fired
on functions that didn't have stack frames. This moves that to the top of
emitPrologue to make sure that every function without a redzone has it set
correctly.
This also adds a function representing one of the early exit cases (GHC calling
convention) to the MachineOutliner noredzone test to ensure that we can outline
from functions like these, where we never use a redzone.
llvm-svn: 329922
|
| |
|
|
|
|
|
| |
This change is exposing UB in source code - as was warned/predicted. :)
See D44909 for discussion. Reverting while we figure out how to fix things.
llvm-svn: 329920
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
According RISC-V ELF psABI specification, base RV32 and RV64 ISAs only
allow 32-bit instruction alignment, but instruction allow to be aligned
to 16-bit boundaries for C-extension.
So we just align to 4 bytes and 2 bytes for C-extension is enough.
Reviewers: asb, apazos
Differential Revision: https://reviews.llvm.org/D45560
Patch by Kito Cheng.
llvm-svn: 329899
|
| |
|
|
|
|
|
|
| |
Remove 'registers' section, as suggested (D. Sanders) at code review
https://reviews.llvm.org/D44304
llvm-svn: 329888
|
| |
|
|
|
|
| |
"is is" -> "is", "if if" -> "if", "or or" -> "or"
llvm-svn: 329878
|
| |
|
|
|
|
|
|
| |
Also add double-prevoius-failure.ll which captures a test case that at one
point triggered a compiler crash, while developing calling convention support
for f64 on RV32D with soft-float ABI.
llvm-svn: 329877
|
| |
|
|
|
|
|
| |
This also includes support and a test for truncating stores, which are now
possible thanks to the fpround pattern.
llvm-svn: 329876
|
| |
|
|
| |
llvm-svn: 329874
|
| |
|
|
| |
llvm-svn: 329872
|
| |
|
|
|
|
|
|
|
|
| |
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039
The condition only covers one of the two 64-bit rotate instructions. This just
adds the second (RLDICLo).
Patch by Josh Stone.
llvm-svn: 329852
|