| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
LowerLoad. Remove special case code in LegalizeVectorOps that allowed us to only return one result.
Previously we replaced the chain use ourself and return the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled.
This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine.
llvm-svn: 343817
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
outer while loop to revisit.
Summary:
The llvm::SimplifyCFG function creates a SimplifyCFGOpt object and calls run on it. There were numerous places reached from this run function that called back out llvm::SimplifyCFG which would create another SimplifyCFGOpt object. This is an inefficient use of stack space at minimum. We are also not passing along the LoopHeaders pointer passed into the outer llvm::SimplifyCFG call. So if its not null we lose it on the first recursion and get nullptr from there on.
This patch adds an outer loop around the main BasicBlock simplifying code and adds a flag to the SimplifyCFGOpt class that can be set by to request another iteration. I don't think we can iterate based just on the change flag alone since some of the simplifications delete a basic block entirely leaving nothing to iterate on.
Reviewers: bogner, eli.friedman, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52760
llvm-svn: 343816
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This caused out-of-bound bugs. Found by
`-DLLVM_ENABLE_EXPENSIVE_CHECKS=ON`.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D52902
llvm-svn: 343814
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The isAmdCodeObjectV2 is a misleading name which actually checks whether the os
is amdhsa or mesa.
Also add a test to make sure we do not generate old kernel header for code
object v3.
Differential Revision: https://reviews.llvm.org/D52897
llvm-svn: 343813
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can happen if assembling a reference to _GLOBAL_OFFSET_TABLE_.
While it doesn't make sense to try to assemble that for COFF,
the fact that we previously used llvm_unreachable meant that the code
had undefined behaviour if something tried to assemble that.
The configure script of libgmp would try to assemble such a snippet
(which should signal a failure). If llvm is built without assertions,
the undefined behaviour meant a (near) infinite loop.
Differential Revision: https://reviews.llvm.org/D52903
llvm-svn: 343811
|
| |
|
|
| |
llvm-svn: 343806
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This change ensures that the (membername,timestamp) tuple uniquely
identifies an entry in an archive for format=darwin, in deterministic
mode (which is the default).
That, then, enables lldb and dsymutil to locate the appropriate object
within the archive.
Differential Revision: https://reviews.llvm.org/D47659
llvm-svn: 343805
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
combine
This brings the extending loads patch back to the original intent but minus the
PHI bug and with another small improvement to de-dupe truncates that are
inserted into the same block.
The truncates are sunk to their uses unless this would require inserting before a
phi in which case it sinks to the _beginning_ of the predecessor block for that
path (but no earlier than the def).
The reason for choosing the beginning of the predecessor is that it makes de-duping
multiple truncates in the same block simple, and optimized code is going to run a
scheduler at some point which will likely change the position anyway.
llvm-svn: 343804
|
| |
|
|
|
|
|
|
|
|
| |
- Fix spill/reloads of XSeqPairs failing with vregs (only physregs
worked correctly)
- Add missing spill/reload code for WSeqPairs class
Differential Revision: https://reviews.llvm.org/D52761
llvm-svn: 343799
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch matches signed dot4 and dot8 pattern.
Author: FarhanaAleen
Reviewed By: msearles
Differential Revision: https://reviews.llvm.org/D52520
llvm-svn: 343798
|
| |
|
|
|
|
|
|
| |
This is a follow-up to rL343482 / D52439.
This was a pattern that initially caused the commit to be reverted because
the transform requires a bitcast as shown here.
llvm-svn: 343794
|
| |
|
|
|
|
|
|
| |
lowerGlobalAddress, lowerBlockAddress, and insertIndirectBranch contain
overzealous checks for is64Bit. These functions are all safe as-implemented
for RV64.
llvm-svn: 343781
|
| |
|
|
|
|
|
| |
When scalarizing a load, be sure to update the offset in the
MachineMemOperand for each scalar load.
llvm-svn: 343776
|
| |
|
|
|
|
|
|
| |
Replacing Timer* with unique_ptr<Timer> in a pass-to-timer map.
That allows to get rid of unpretty raw deletes in PassTimingInfo destructor.
Strictly cleanup, not intended to change any visible behavior.
llvm-svn: 343772
|
| |
|
|
| |
llvm-svn: 343765
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
f32 values passed on the stack would previously cause an assertion in
unpackFromMemLoc.. This would only trigger in the presence of the F extension
making f32 a legal type. Otherwise the f32 would be legalized.
This patch fixes that by keeping LocVT=f32 when a float is passed on the
stack. It also adds test coverage for this case, and tests that also
demonstrate lw/sw/flw/fsw will be selected when most profitable. i.e. there is
no unnecessary i32<->f32 conversion in registers.
llvm-svn: 343756
|
| |
|
|
|
|
|
|
| |
combineANDXORWithAllOnesIntoANDNP. NFCI
It's the only caller and the logic pretty easy to combine.
llvm-svn: 343754
|
| |
|
|
|
|
|
| |
Rename to lowerRETURNADDR, lowerFRAMEADDR in order to be consistent with the
LLVM coding style and the other functions in this file.
llvm-svn: 343752
|
| |
|
|
| |
llvm-svn: 343750
|
| |
|
|
|
|
|
|
| |
r343712 performed this optimisation during instruction selection. As Eli
Friedman pointed out in post-commit review, implementing this as a DAGCombine
might allow opportunities for further optimisations.
llvm-svn: 343741
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Depends on D52755.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D52805
llvm-svn: 343739
|
| |
|
|
|
|
|
|
|
|
| |
There was some duplicated logic for using the LocInfo of a CCValAssign in
order to convert from the ValVT to LocVT or vice versa. Resolve this by
factoring out convertLocVTFromValVT from unpackFromRegLoc. Also rename
packIntoRegLoc to the more appropriate convertValVTToLocVT and call these
helper functions consistently.
llvm-svn: 343737
|
| |
|
|
|
|
|
|
|
|
|
|
| |
MCContext does not destroy MCSymbols on shutdown. So, rather than putting
SmallVectors (which may heap-allocate) inside MCSymbolWasm, use unowned pointer
to a WasmSignature instead. The signatures are now owned by the AsmPrinter.
Also uses WasmSignature instead of param and result vectors in TargetStreamer,
and leaves some TODOs for further simplification.
Differential Revision: https://reviews.llvm.org/D52580
llvm-svn: 343733
|
| |
|
|
|
|
|
|
| |
If present, PHI nodes must appear before non-PHI nodes in a basic block. The
register allocator relies on this and will fail to eliminate PHI's that do not
meet this requirement.
llvm-svn: 343731
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're a long way from D50992 and D51553, but this is where we have to start.
We weren't back-propagating undefs into binop constant values for anything but
add/sub/mul/and/or/xor.
This is likely because we have to be careful about not introducing UB/poison
with div/rem/shift. But I suspect we already are getting the poison part wrong
for add/sub/mul (although it may not be possible to expose the bug currently
because we use SimplifyDemandedVectorElts from a limited set of opcodes).
See the discussion/implementation from D48987 and D49047.
This patch just enables functionality for FP ops because those do not have
UB/poison potential.
llvm-svn: 343727
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: kristina, zhmu, dschuff, rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52680
llvm-svn: 343724
|
| |
|
|
|
|
| |
The additional patterns needed for this aren't overwhelming and introducing extra bitcasts during lowering limits our ability to do computeNumSignBits. Not that I have a good example of that for select. I'm just becoming increasingly grumpy about promotion of AND/OR/XOR. SELECT was just a lot easier to fix.
llvm-svn: 343723
|
| |
|
|
|
|
| |
v2i1/v4i1 SELECT into v8i1.
llvm-svn: 343713
|
| |
|
|
|
|
|
|
|
|
|
|
| |
instruction selection
Although we can't write a tablegen pattern to remove redundant
splitf64+buildf64 pairs due to the multiple return values, we can handle it
with some C++ selection code. This is simpler than removing them after
instruction selection through RISCVDAGToDAGISel::PostprocessISelDAG, as was
done previously.
llvm-svn: 343712
|
| |
|
|
|
|
|
|
| |
AVX512VL is enabled.
This allows the phi nodes to be generated with the correct register class when expanded.
llvm-svn: 343710
|
| |
|
|
|
|
| |
The register class is all that's important for the pseudo instructions. We can use patterns to handle the different types.
llvm-svn: 343709
|
| |
|
|
|
|
|
|
|
|
| |
addresses
This patch adds a 'WriteCopy' [WriteLoad, WriteStore] schedule sequence instead to better model the behaviour
Found by @andreadb during llvm-mca testing on btver2 which was crashing on "zero uop" WriteRMW only instructions
llvm-svn: 343708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Ensure the TemplateParam attribute of the DIGlobalVariable node is translated into the proper DIEs.
Resolves https://bugs.llvm.org/show_bug.cgi?id=22119
Reviewers: dblaikie, probinson, aprantl, JDevlieghere, clayborg, whitequark, deadalnix
Reviewed By: dblaikie
Subscribers: llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D52057
llvm-svn: 343706
|
| |
|
|
|
|
| |
These were being tagged as <WriteALULd, WriteRMW> instead of properly using the RMW sequence
llvm-svn: 343705
|
| |
|
|
|
|
| |
Match AMD Fam16h SOG + llvm-exegesis tests
llvm-svn: 343701
|
| |
|
|
| |
llvm-svn: 343700
|
| |
|
|
| |
llvm-svn: 343697
|
| |
|
|
|
|
|
|
|
|
|
| |
overridable for EXPENSIVE_CHECKS
-verify-machineinstrs was implemented as a simple bool. As a result, the
'VerifyMachineCode == cl::BOU_UNSET' used by EXPENSIVE_CHECKS to make it on by
default but possible to disable didn't work as intended. Changed
-verify-machineinstrs to a boolOrDefault to correct this.
llvm-svn: 343696
|
| |
|
|
|
|
|
|
| |
1. Fix include ordering.
2. Improve variable name (width is bitwidth not number-of-elements).
3. Add local Opcode variable to reduce code duplication.
llvm-svn: 343694
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This fixes a problem where the register allocator fails to eliminate a PHI
because there's a non-PHI in the middle of the PHI instructions at the start
of a BB.
This G_TRUNC can be better placed but this at least fixes the correctness issue
quickly. I'll follow up with a patch to the verifier to catch this kind of bug
in future.
llvm-svn: 343693
|
| |
|
|
|
|
|
| |
This function will deal with more than shuffles with D50992, and I
have another potential per-element fold that could live here.
llvm-svn: 343692
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix use of SSE1 registers for f32 ops in no-x87 mode.
Notably, allow use of SSE instructions for f32 operations in 64-bit
mode (but not 32-bit which is disallowed by callign convention).
Also avoid translating memset/memcopy/memmove into SSE registers
without X87 for 32-bit mode.
This fixes PR38738.
Reviewers: nickdesaulniers, craig.topper
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D52555
llvm-svn: 343689
|
| |
|
|
|
|
| |
Introduce and use a switch on the opcode.
llvm-svn: 343688
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch makes sure that a register is only hinted once to RA. In extreme
cases the same register can otherwise be hinted numerous times and cause a
compile time slowdown.
Review: Simon Pilgrim
https://reviews.llvm.org/D52826
llvm-svn: 343686
|
| |
|
|
|
|
|
|
|
| |
The patterns as defined are correct only when XLen==32.
This is another preparatory patch for a set of patches that flesh out RV64
codegen.
llvm-svn: 343679
|
| |
|
|
|
|
|
|
|
| |
1. brcond operates on an condition.
2. atomic_fence and the pseudo AMO instructions should all take xlen immediates
This allows the same definitions and patterns to work for RV64 (XLenVT==i64).
llvm-svn: 343678
|
| |
|
|
|
|
| |
These patterns are not correct for RV64.
llvm-svn: 343677
|
| |
|
|
| |
llvm-svn: 343676
|
| |
|
|
| |
llvm-svn: 343674
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The new buffer/tbuffer intrinsics handle an out-of-range immediate
offset by moving/adding offset&-4096 to a vgpr, leaving an in-range
immediate offset, with a chance of the move/add being CSEd for similar
loads/stores.
However it turns out that a negative offset in a vgpr is illegal, even
if adding the immediate offset makes it legal again.
Therefore, this commit disables the offset&-4096 thing if the offset is
negative.
Differential Revision: https://reviews.llvm.org/D52683
Change-Id: Ie02f0a74f240a138dc2a29d17cfbd9e350e4ed13
llvm-svn: 343672
|