| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nothing prevents entries from being bigger than the 16 bit size field in
Dwarf < 5. For entries that are too big, just emit an empty entry
instead of crashing.
This fixes PR41038.
Reviewers: probinson, aprantl, davide
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D59518
llvm-svn: 356514
|
| |
|
|
|
|
|
|
|
|
| |
LTO provides additional opportunities for tailcall elimination due to
link-time inlining and visibility of nocapture attribute. Testing showed
negligible impact on compilation times.
Differential Revision: https://reviews.llvm.org/D58391
llvm-svn: 356511
|
| |
|
|
|
|
|
|
| |
Teach instcombine to propagate demanded elements through a masked load or masked gather instruction. This is in the broader context of improving vector pointer instcombine under https://reviews.llvm.org/D57140.
Differential Revision: https://reviews.llvm.org/D57372
llvm-svn: 356510
|
| |
|
|
|
|
|
|
|
|
| |
This will allow targets more flexibility to replace the
register allocator core passes. In a future commit,
AMDGPU will run the core register assignment passes
twice, and will also want to disallow using the
standard -regalloc option.
llvm-svn: 356506
|
| |
|
|
|
|
|
|
|
|
| |
Do not actually allocate a register for an undef use. Previously we we
would create unnecessary reload instruction for undef uses where the
register wasn't live.
Patch by Matthias Braun
llvm-svn: 356501
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cost 0 anyway for free regs
The 2nd loop calculates spill costs but reports free registers as cost
0 anyway, so there is little benefit from having a separate early
loop.
Surprisingly this is not NFC, as many register are marked regDisabled
so the first loop often picks up later registers unnecessarily instead
of the first one available in the allocation order...
Patch by Matthias Braun
llvm-svn: 356499
|
| |
|
|
| |
llvm-svn: 356498
|
| |
|
|
|
|
|
|
|
|
| |
The actual code change is fairly straight forward, but exercising it isn't. First, it turned out we weren't adding the appropriate flags in SelectionDAG. Second, it turned out that we've got some optimization gaps, so obvious test cases don't work.
My first attempt (in atomic-unordered.ll) points out a deficiency in our peephole-opt folding logic which I plan to fix separately. Instead, I'm exercising this through MachineLICM.
Differential Revision: https://reviews.llvm.org/D59375
llvm-svn: 356494
|
| |
|
|
|
|
|
|
|
| |
This reverts commit 51dc6a8c84cd6a58562e320e1828a0158dbbf750.
Breaks
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20034/steps/build%20stage%201/logs/stdio.
llvm-svn: 356492
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a Remark class that allows us to share code when working with
remarks.
The C API has been updated to reflect this. Instead of the parser
generating C structs, it's now using a C++ object that is used through
opaque pointers in C. This gives us much more flexibility on what
changes we can make to the internal state of the object and interacts
much better with scenarios where the library is used through dlopen.
* C API updates:
* move from C structs to opaque pointers and functions
* the remark type is now an enum instead of a string
* unit tests updates:
* use mostly the C++ API
* keep one test for the C API
* rename to YAMLRemarksParsingTest
* a typo was fixed: AnalysisFPCompute -> AnalysisFPCommute.
* a new error message was added: "expected a remark tag."
* llvm-opt-report has been updated to use the C++ parser instead of the
C API
Differential Revision: https://reviews.llvm.org/D59049
llvm-svn: 356491
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by
intersecting the computeConstantRange() result into the ConstantRange
created from computeKnownBits(). This allows us to detect some
additional never/always overflows conditions that can't be determined
from known bits.
This revision also adds basic handling for constants to
computeConstantRange(). Non-splat vectors will be handled in a followup.
The signed case will also be handled in a followup, as it needs some
more groundwork.
Differential Revision: https://reviews.llvm.org/D59386
llvm-svn: 356489
|
| |
|
|
|
|
|
|
| |
amounts
If a value with multiple uses is only ever used for SSE shift amounts then we know that only the bottom 64-bits are needed.
llvm-svn: 356483
|
| |
|
|
|
|
| |
Add tests for wider atomic loads and stores. In the process, fix a crasher where we appearently handled unorder stores, but not loads, when lowering to cmpxchg idioms.
llvm-svn: 356482
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Dynamic stack realignment was disabled on micromips by checking if
target has standard encoding. We simply change the condition to skip
Mips16 only.
Patch by Mirko Brkusanin.
Differential Revision: http://reviews.llvm.org/D59499
llvm-svn: 356478
|
| |
|
|
|
|
| |
This was introduced in rL356468.
llvm-svn: 356477
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r311255 we added a case where we split vectors whose elements are
all derived from the same input vector so that we could shuffle it
more efficiently. In doing so, createBuildVecShuffle was taught to
adjust for the fact that all indices would be based off of the first
vector when this happens, but it's possible for the code that checked
that to fire incorrectly if we happen to have a BUILD_VECTOR of
extracts from subvectors and don't hit this new optimization.
Instead of trying to detect if we've split the vector by checking if
we have extracts from the same base vector, we can just pass that
information into createBuildVecShuffle, avoiding the miscompile.
Differential Revision: https://reviews.llvm.org/D59507
llvm-svn: 356476
|
| |
|
|
| |
llvm-svn: 356474
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Combine 2 fcmps that are checking for nan-ness:
and (fcmp ord X, 0), (and (fcmp ord Y, 0), Z) --> and (fcmp ord X, Y), Z
or (fcmp uno X, 0), (or (fcmp uno Y, 0), Z) --> or (fcmp uno X, Y), Z
This is an exact match for a minimal reassociation pattern.
If we want to handle this more generally that should go in
the reassociate pass and allow removing this code.
This should fix:
https://bugs.llvm.org/show_bug.cgi?id=41069
llvm-svn: 356471
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:
SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.
Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.
Add promoting the integer ABS node in the LegalizeIntegerType.
Expand-based legalization of integer result for the ABS nodes.
Expand-based legalization of ABS vector operations.
Add some integer abs testcases for different typesizes for Thumb arch
Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
tmp = (SRA, Hi, 31)
Lo = (UADDO tmp, Lo)
Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
Lo = (XOR tmp, Lo)
The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
(ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).
Change integer abs testcases for codegen with the ABS node support for AArch64.
Indicate that the ABS is legal for the i64 type when the NEON is supported.
Change the integer abs testcases to show changing of codegen.
Add combine and legalization of ABS nodes for Thumb arch.
Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.
For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743
Patch by: @ikulagin (Ivan Kulagin)
Differential Revision: https://reviews.llvm.org/D49837
llvm-svn: 356468
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add buffer store/load 8/16 overloaded intrinsics for buffer, raw_buffer and struct_buffer
Change-Id: I166a29f071b2ff4e4683fb0392564b1f223ac61d
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59265
llvm-svn: 356465
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I found this really weird WWM-related case whereby through the WWM
transformations our isel lowering was trying to promote 2 min's into a
min3 for the i8 type, which our hardware doesn't support.
The new min3_i8.ll test case would previously spew the error:
PromoteIntegerResult #0: t69: i8 = SMIN3 t70, Constant:i8<0>, t68
Before the simple fix to our isel lowering to not do it for i8 MVT's.
Differential Revision: https://reviews.llvm.org/D59543
llvm-svn: 356464
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Switch to the `MCParserUtils::parseAssignmentExpression` for parsing
assignment expressions in the `.set` directive reduces code and allows
to print an error message instead of crashing in case of incorrect
recursive using of the `.set`.
Fix for the bug https://bugs.llvm.org/show_bug.cgi?id=41053.
Differential Revision: http://reviews.llvm.org/D59452
llvm-svn: 356461
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand:
When the other operand is constant we fold to undef (handled in ConstantFoldCompareInstruction)
When the other operand is non-constant we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst).
Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef.
The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968.
Differential Revision: https://reviews.llvm.org/D59541
llvm-svn: 356456
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Moving subprogram specific flags into DISPFlags makes IR code more readable.
In addition, we provide free space in DIFlags for other
'non-subprogram-specific' debug info flags.
Patch by Djordje Todorovic.
Differential Revision: https://reviews.llvm.org/D59288
llvm-svn: 356454
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.
The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.
For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.
This is a recommit of r356442 with trivial fixes for the failing tests.
Differential Revision: https://reviews.llvm.org/D56587
llvm-svn: 356451
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 1cf4b593a7ebd666fc6775f3bd38196e8e65fafe.
Build bots found failing tests not detected locally.
Failing Tests (3):
LLVM :: DebugInfo/Generic/convert-debugloc.ll
LLVM :: DebugInfo/Generic/convert-inlined.ll
LLVM :: DebugInfo/Generic/convert-linked.ll
llvm-svn: 356444
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.
The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.
For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.
Differential Revision: https://reviews.llvm.org/D56587
llvm-svn: 356442
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Make some class member methods const
- Delete unnecessary includes
- Use a simpler form of `BuildMI`
Reviewers: kripken
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59454
llvm-svn: 356440
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: tlively, sbc100
Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59469
llvm-svn: 356438
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds patterns to lower all the remaining setcc modes: lt, gt,
le, and ge. Fixes PR40912.
Reviewers: aheejin, sbc100, dschuff
Reviewed By: dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, jdoerfert, llvm-commits, srj
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59519
llvm-svn: 356431
|
| |
|
|
|
|
|
|
|
|
| |
computeConstantRange()"
This reverts commit 106f0cdefb02afc3064268dc7a71419b409ed2f3.
This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests.
llvm-svn: 356424
|
| |
|
|
|
|
| |
64-bit mode.
llvm-svn: 356419
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This was suggested by spatel as an alternative
approach to D59378. I've also added the infinite looping test from
that revision here.
Differential Revision: https://reviews.llvm.org/D59506
llvm-svn: 356415
|
| |
|
|
|
|
|
|
| |
extended 8-bit immediates.
We need to allow [128,255] in addition to [-128, 127] to match gas.
llvm-svn: 356413
|
| |
|
|
|
|
| |
Forgot to add a change to relax some asserts in r356396.
llvm-svn: 356411
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFC.
The default implementation does we want and is going to more compatible
with dynamic linking (-fPIC) support that is planned.
This is NFC because currently we only build wasm with
`-relocation-model=static` which in turn means that the default
`isOffsetFoldingLegal` always returns true today.
Differential Revision: https://reviews.llvm.org/D54661
llvm-svn: 356410
|
| |
|
|
|
|
|
|
|
|
|
| |
This is preparation for D59506. The InstructionSimplify abs handling
is moved into computeConstantRange(), which is the general place for
such calculations. This is NFC and doesn't affect the existing tests
in test/Transforms/InstSimplify/icmp-abs-nabs.ll.
Differential Revision: https://reviews.llvm.org/D59511
llvm-svn: 356409
|
| |
|
|
|
|
| |
consistent with the ROR patterns.
llvm-svn: 356407
|
| |
|
|
|
|
| |
For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well.
llvm-svn: 356406
|
| |
|
|
|
|
|
|
|
|
| |
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59501
llvm-svn: 356405
|
| |
|
|
|
|
|
|
| |
This change makes linking into .build-id atomic and safe to use.
Some users under particular workflows are reporting that this races
more than half the time under particular conditions.
llvm-svn: 356404
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow the clamp modifier on vop3 int arithmetic instructions in assembly
and disassembly.
This involved adding a clamp operand to the affected instructions in MIR
and MC, and thus having to fix up several places in codegen and MIR
tests.
Differential Revision: https://reviews.llvm.org/D59267
Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e
llvm-svn: 356399
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.
This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.
To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.
Differential Revision: https://reviews.llvm.org/D59191
Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea
llvm-svn: 356398
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
After review comments, it was preferred to not teach MachineIRBuilder about
non-generic instructions beyond using buildInstr().
For AArch64 I've changed the buildCopy() calls to buildInstr() + a
separate addReg() call.
This also relaxes the MachineIRBuilder's COPY checking more because it may
not always have a SrcOp given to it.
llvm-svn: 356396
|
| |
|
|
|
|
|
|
|
|
|
| |
Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count).
With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention.
Also fix the * Linker * contrib section which wasn't correctly emitted previously.
Differential Revision: https://reviews.llvm.org/D59502
llvm-svn: 356395
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
checks bot
This fixes a couple of unflushed raw_string_ostream bugs in recent
commits that only show up on a bot building on windows with expensive
checks.
Differential Revision: https://reviews.llvm.org/D59396
Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf
llvm-svn: 356394
|
| |
|
|
|
|
| |
relocImm8_su/relocImm16_su/relocImm32_su/ to accurately reflect what they are.
llvm-svn: 356393
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reinstates r347934, along with a tweak to address a problem with
PHI node ordering that that commit created (or exposed). (That commit
was reverted at r348426, due to the PHI node issue.)
Original commit message:
r320789 suppressed moving the insertion point of SCEV expressions with
dev/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.
This fixes PR30806.
Differential Revision: https://reviews.llvm.org/D57428
llvm-svn: 356392
|
| |
|
|
|
|
|
|
|
|
|
| |
It uses the generic AArch64_IMM::expandMOVImm to get the correct
number of instruction used in immediate materialization.
Reviewers: efriedma
Differential Revision: https://reviews.llvm.org/D58461
llvm-svn: 356391
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch follows some ideas from r352866 to optimize the floating
point materialization even further. It changes isFPImmLegal to
considere up to 2 mov instruction or up to 5 in case subtarget has
fused literals.
The rationale is the cost is the same for mov+fmov vs. adrp+ldr; but
the mov+fmov sequence is always better because of the reduced d-cache
pressure. The timings are still the same if you consider movw+movk+fmov
vs. adrp+ldr will be fused (although one instruction longer).
Reviewers: efriedma
Differential Revision: https://reviews.llvm.org/D58460
llvm-svn: 356390
|