| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D24546
llvm-svn: 281903
|
| |
|
|
|
|
|
|
|
|
|
| |
Whenever an add/sub immediate needs a fixup, we set that immediate field to zero,
which is correct, but we also set the shift bits to zero, which is not true for
instructions that use lsl #12. This patch makes sure that if lsl #12 was used,
it will appear in the encoding of the instruction.
Differential Revision: https://reviews.llvm.org/D23930
llvm-svn: 281898
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In case s_branch instruction target is itself backend should emit offset -1 but instead it emit 0.
'''
label:
s_branch label // should emit [0xff,0xff,0x82,0xbf]
'''
Tom, Matt: why are we adjusting fixup values in applyFixup() method instead of processFixup()? processFixup() is calling adjustFixupValue() but does nothing with its result.
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl
Differential Revision: https://reviews.llvm.org/D24671
llvm-svn: 281896
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When phi nodes are created in the -mem2reg phase, the @llvm.dbg.declare
entries are converted to @llvm.dbg.value entries at the place where the
store instructions existed. However no entry is created to describe
the resulting value of the phi node.
The effect of this is especially noticeable in for loops which have a
constant for the intial value; the loop control variable's location
would be described as the intial constant value in the loop body once
the -mem2reg optimization phase was run.
This change adds the creation of the @llvm.dbg.value entries to describe
variables whose location is the result of a phi node created in -mem2reg.
Also when the phi node is finally lowered to a machine instruction it
is important that the lowered "load" instruction is placed before the
associated DEBUG_VALUE entry describing the value loaded.
Differential Revision: https://reviews.llvm.org/D23715
llvm-svn: 281895
|
| |
|
|
|
|
|
|
|
|
| |
The initial mapping symbol state is set from the triple, but we only checked
for the little-endian thumb triple, so could end up with an ARM mapping symbol
for big-endian thumb.
Differential Revision: https://reviews.llvm.org/D24553
llvm-svn: 281894
|
| |
|
|
|
|
|
|
|
| |
ldm and stm instructions always require 4-byte alignment on the pointer, but we
weren't checking this before trying to reduce code-size by replacing a
post-indexed load/store with them. Unfortunately, we were also dropping this
incormation in DAG ISel too, but that's easy enough to fix.
llvm-svn: 281893
|
| |
|
|
|
|
|
|
| |
We were updating metadata but not IR flags. Because we pick an arbitrary instruction to be the CSE candidate, it comes down to luck (50% or less chance) if this results in broken codegen or not, which is why PR30373 which is actually not the fault of the commit it was bisected down to.
Fixes PR30373.
llvm-svn: 281889
|
| |
|
|
|
|
|
|
| |
not the output of an instruction.
SUBREG_TO_REG is supposed to indicate that the super register has been zeroed, but we can't prove that if we don't know where it came from.
llvm-svn: 281885
|
| |
|
|
|
|
|
|
| |
supported regardless of whether F16C is also supported.
Still need to add support for lowering using AVX512F when neither VLX or F16C is supported.
llvm-svn: 281884
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter.
This is one of 3 commits to different repositories of XRay ARM port. The other 2 are:
https://reviews.llvm.org/D23932 (Clang test)
https://reviews.llvm.org/D23933 (compiler-rt)
Differential Revision: https://reviews.llvm.org/D23931
llvm-svn: 281878
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously we reline on inst-combine to remove inlinable invoke instructions. This causes trouble because a few extra optimizations are schedule early that could introduce too much CFG change (e.g. simplifycfg removes too much control flow). This patch handles invoke instruction in-place during sample profile annotation, so that we do not rely on instcombine to remove those invoke instructions.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24409
llvm-svn: 281870
|
| |
|
|
|
|
|
|
| |
rounding mode encoding in the second operand. This immediate should only be 0 or 1 and indicates if the truncation loses precision.
Also enhance an assert in SelectionDAG::getNode to flag this sort of problem in the future.
llvm-svn: 281868
|
| |
|
|
|
|
| |
second operand containing an X86 specific rounding mode encoding that doesn't belong.
llvm-svn: 281867
|
| |
|
|
|
|
| |
libFuzzer
llvm-svn: 281866
|
| |
|
|
| |
llvm-svn: 281865
|
| |
|
|
| |
llvm-svn: 281862
|
| |
|
|
|
|
| |
conversion intrinsics to be consistent across all intruction sets.
llvm-svn: 281861
|
| |
|
|
|
|
| |
correct register class.
llvm-svn: 281860
|
| |
|
|
| |
llvm-svn: 281859
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D23727
llvm-svn: 281858
|
| |
|
|
|
|
|
|
|
| |
Amended consecutive memory access detection in Loop Vectorizer.
Load/Store were not handled properly without preceding GEP instruction.
Differential Revision: https://reviews.llvm.org/D20789
llvm-svn: 281853
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
as sitofp
With D24253 we can now use SelectionDAG::SignBitIsZero with vector operations.
This patch uses SelectionDAG::SignBitIsZero to recognise that a zero sign bit means that we can use a sitofp instead of a uitofp (which is not directly support on pre-AVX512 hardware).
While AVX512 does provide support for uitofp, the conversion to sitofp should not cause any regressions.
Differential Revision: https://reviews.llvm.org/D24343
llvm-svn: 281852
|
| |
|
|
|
|
|
|
|
| |
Simplified GEP cloning in vectorizeMemoryInstruction().
Added an assertion that checks consecutive GEP, which should have only one loop-variant operand.
Differential Revision: https://reviews.llvm.org/D24557
llvm-svn: 281851
|
| |
|
|
|
|
|
| |
It is a trivial change which could make the testcase easier to be reused
for the store splitting in CodeGenPrepare.
llvm-svn: 281846
|
| |
|
|
|
|
| |
the guard type to intptr_t; use separate array for 8-bit counters
llvm-svn: 281845
|
| |
|
|
| |
llvm-svn: 281843
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes an issue when files are compiled with -flto=thin
at default -O0. We need to rename anonymous globals before attempting
to write the module summary because all values need names for
the summary. This was happening at -O1 and above, but not before
the early exit when constructing the pipeline for -O0.
Also add an internal -prepare-for-thinlto option to enable this
to be tested via opt.
Fixes PR30419.
Reviewers: mehdi_amini
Subscribers: probinson, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D24701
llvm-svn: 281840
|
| |
|
|
|
|
| |
Add ability to extract vXi64 'vzext_movl' masks on 32-bit targets
llvm-svn: 281834
|
| |
|
|
|
|
| |
Added BoundaryNode check to isBestZeroLatency function.
llvm-svn: 281825
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were trying to avoid using a FrameIndex operand in non-pointer
operands in a convoluted way, and would break because of
using TargetFrameIndex. The TargetFrameIndex should only be used
in the case where it makes sense to fold it as part of the addressing
mode, otherwise it requires materialization like a normal constant.
This wasn't working reliably and failed in the added testcase, hitting
the assert when processing the frame index.
The TargetFrameIndex was coming from trying to produce an AssertZext
limiting the maximum stack size. I'm not sure this was correct to begin
with, because it is apparently possible to have a single workitem
dispatch that requires all 4G of private memory.
llvm-svn: 281824
|
| |
|
|
| |
llvm-svn: 281823
|
| |
|
|
|
|
|
|
| |
This reduces the number of copies and reg_sequences
when using fp constant vectors. This significantly
reduces the code size in local-stack-alloc-bug.ll
llvm-svn: 281822
|
| |
|
|
|
|
| |
to fix check-fuzzer on the bot
llvm-svn: 281814
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
names (NFC)
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.
This is a recommit of r281806 after fixing the accessor to return
a pointer instead of a reference and updating all the call-sites.
llvm-svn: 281813
|
| |
|
|
|
|
| |
Last-second refactoring before push was bad idea...
llvm-svn: 281812
|
| |
|
|
|
|
| |
This is in line with the LLParser behavior
llvm-svn: 281811
|
| |
|
|
| |
llvm-svn: 281810
|
| |
|
|
| |
llvm-svn: 281809
|
| |
|
|
|
|
|
|
|
| |
value names (NFC)"
This reverts commit r281806. It introduces undefined behavior as an
API is returning a reference to the Symtab
llvm-svn: 281808
|
| |
|
|
|
|
|
|
|
|
|
| |
names (NFC)
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.
llvm-svn: 281806
|
| |
|
|
|
|
| |
VI added eq/ne for i64, so use them.
llvm-svn: 281800
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: i8, i16, and f16 values are not extended to 32-bit in the HSA kernel ABI.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl
Differential Revision: https://reviews.llvm.org/D24621
llvm-svn: 281789
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shuffle
As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ):
turn a vector select with constant condition operand into a shuffle as a canonicalization step.
Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract.
Possible known (minor?) regressions from this change are filed as:
https://llvm.org/bugs/show_bug.cgi?id=28530
https://llvm.org/bugs/show_bug.cgi?id=28531
https://llvm.org/bugs/show_bug.cgi?id=30371
If something terrible happens to perf after this commit, feel free to revert until a backend
fix is in place.
Differential Revision: https://reviews.llvm.org/D24279
llvm-svn: 281787
|
| |
|
|
|
|
|
|
|
|
|
| |
These clean up some unnecessary or instructions in
cases with complex loops.
In the original testcase I noticed this, the same
or with exec was repeated 5 or 6 times in a row. With
this only one is emitted or sometimes a copy.
llvm-svn: 281786
|
| |
|
|
|
|
|
|
|
|
| |
This is a fix for PR30318.
Clang may generate IR where an alloca is already live when entering a
BB with lifetime.start. In this case, conservatively extend the
alloca lifetime all the way back to the block entry.
llvm-svn: 281784
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When trying to recolor a register we may split live-ranges in the
process. When we create new live-ranges we will have to process them,
but when we move a register from Assign to Split, the allocation is not
changed until the whole recoloring session is successful.
Therefore, only push the live-ranges that changed from Assign to
Split when the recoloring is successful.
Same as the previous commit, I was not able to produce a test case that
reproduce the problem with in-tree targets.
Note: The bug has been here since the recoloring scheme has been added
back in r200883 (Feb 2014).
llvm-svn: 281783
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is used.
When last chance recoloring is used, the list of NewVRegs may not be
empty when calling selectOrSplitImpl. Indeed, another coloring may have
taken place with splitting/spilling in the same recoloring session.
Relax an assertion to take this into account and adapt a condition to
act as if the NewVRegs were local to this selectOrSplitImpl instance.
Unfortunately I am unable to produce a test case for this, I was only
able to reproduce the conditions on an out-of-tree target.
llvm-svn: 281782
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The main challenge in lowering kernel arguments for AMDGPU is determing the
memory type of the argument. The generic calling convention code assumes
that only legal register types can be stored in memory, but this is not the
case for AMDGPU.
This consolidates all the logic AMDGPU uses for deducing memory types into a single
function. This will make it much easier to support different ABIs in the future.
Reviewers: arsenm
Subscribers: arsenm, wdng, nhaehnle, llvm-commits, yaxunl
Differential Revision: https://reviews.llvm.org/D24614
llvm-svn: 281781
|
| |
|
|
| |
llvm-svn: 281780
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
mesa3d will use the same kernel calling convention as amdhsa, but it will
handle everything else like the default 'unknown' OS type.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22783
llvm-svn: 281779
|