| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
| |
This was failing on out of bounds access to the extra operands
on the s_swappc_b64 beyond those in the instruction definition.
This was working, but somehow regressed within the past few weeks,
although I don't see any obvious commit.
llvm-svn: 309782
|
| |
|
|
| |
llvm-svn: 309781
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In ThinLTO backend compile, OPTOptions are not set so that the ICP in ThinLTO backend does not know if it is a SamplePGO build, in which profile count needs to be annotated directly on call instructions. This patch cleaned up the PGOOptions handling logic and passes down PGOOptions to ThinLTO backend.
Reviewers: chandlerc, tejohnson, davidxl
Reviewed By: chandlerc
Subscribers: sanjoy, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D36052
llvm-svn: 309780
|
| |
|
|
|
|
|
|
|
|
| |
Previous change "Turn s_and_saveexec_b64 into s_and_b64 if
result is unused" introduced asan use-after-poison error.
Instruction was analyzed after eraseFromParent() calls.
Move analysys higher than erase.
llvm-svn: 309779
|
| |
|
|
|
|
| |
Distribute various expressions across ifs.
llvm-svn: 309777
|
| |
|
|
|
|
|
|
| |
When finding the fixed offsets for function arguments,
this needs to skip over the 4 bytes reserved for the
emergency stack slot.
llvm-svn: 309776
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pattern shows up when lowering byval copies on AMDGPU.
The byval object access is split into 4-byte chunks, adding a
constant offset to the FixedStack base. When some of the offsets
turn into ors, this prevents combining the constant offsets.
This makes it not apparent that the object is there when matching
addressing modes, so it ends up using a scratch wave offset
relative access and the lengthy frame index expansion for that.
llvm-svn: 309775
|
| |
|
|
|
|
|
|
|
|
|
| |
instead of using the deprecated offset field of DBG_VALUE.
This has no observable effect on the generated DWARF, but the
assembler comments will look different.
rdar://problem/33580047
llvm-svn: 309773
|
| |
|
|
|
|
|
|
|
|
| |
With SI_END_CF elimination for some nested control flow we can now
eliminate saved exec register completely by turning a saveexec version
of instruction into just a logical instruction.
Differential Revision: https://reviews.llvm.org/D36007
llvm-svn: 309766
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a pass to remove redundant S_OR_B64 instructions enabling lanes in
the exec. If two SI_END_CF (lowered as S_OR_B64) come together without any
vector instructions between them we can only keep outer SI_END_CF, given
that CFG is structured and exec bits of the outer end statement are always
not less than exec bit of the inner one.
This needs to be done before the RA to eliminate saved exec bits registers
but after register coalescer to have no vector registers copies in between
of different end cf statements.
Differential Revision: https://reviews.llvm.org/D35967
llvm-svn: 309762
|
| |
|
|
|
|
|
|
|
| |
If SCEV can prove that the backedge taken count for a loop is zero, it does not
need to "understand" a recursive PHI to compute its exiting value.
This should fix PR33885.
llvm-svn: 309758
|
| |
|
|
|
|
| |
rdar://problem/33580047
llvm-svn: 309757
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the last half-dozen commits to LLVM I removed code that became dead
after removing the offset parameter from llvm.dbg.value gradually
proceeding from IR towards the backend. Before I can move on to
DwarfDebug and friends there is one last side-called offset I need to
remove: This patch modifies PrologEpilogInserter's use of the
DBG_VALUE's offset argument to use a DIExpression instead. Because the
PrologEpilogInserter runs at the Machine level I had to play a little
trick with a named llvm.dbg.mir node to get the DIExpressions to print
in MIR dumps (which print the llvm::Module followed by the
MachineFunction dump).
I also had to add rudimentary DwarfExpression support to CodeView and
as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo
was supposed to give up on variables with complex DIExpressions, but
would fail to do so for fragments, which are also modeled as
DIExpressions).
With this last holdover removed we will have only one canonical way of
representing offsets to debug locations which will simplify the code
in DwarfDebug (and future versions of CodeViewDebug once it starts
handling more complex expressions) and make it easier to reason about.
This patch is NFC-ish: All test case changes are for assembler
comments and the binary output does not change.
rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D36125
llvm-svn: 309751
|
| |
|
|
|
|
|
|
| |
next => not
Differential Revision: https://reviews.llvm.org/D36104
llvm-svn: 309748
|
| |
|
|
|
|
|
|
|
|
|
| |
The coverage tool needs to know which slice to look at when it's handed
a universal binary. Some projects need to look at aggregate coverage
reports for a variety of slices in different binaries: this patch adds
support for these kinds of projects to llvm-cov.
rdar://problem/33579007
llvm-svn: 309747
|
| |
|
|
|
|
| |
warnings; other minor fixes (NFC).
llvm-svn: 309746
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous attempt, which made do with a single offset in
computeCalleeSaveRegisterPairs, wasn't quite enough. The previous
attempt only worked as long as CombineSPBump == true (since the
offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).
Instead include the size for the fixed stack area used for win64
varargs in calculations in emitPrologue/emitEpilogue. The stack
consists of mainly three parts;
- AFI->getLocalStackSize()
- AFI->getCalleeSavedStackSize()
- FixedObject
Most of the places in the code which previously used the CSStackSize
now use PrologueSaveSize instead, which is the sum of the latter
two, while some cases which need exactly the middle one use
AFI->getCalleeSavedStackSize() explicitly instead of a local variable.
In addition to moving the offsetting into emitPrologue/emitEpilogue
(which fixes functions with CombineSPBump == false), also set the
frame pointer to point to the right location, where the frame pointer
and link register actually are stored. In addition to the prologue/epilogue,
this also requires changes to resolveFrameIndexReference.
Add tests for a function that keeps a frame pointer and another one
that uses a VLA.
Differential Revision: https://reviews.llvm.org/D35919
llvm-svn: 309744
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The src0 register must match src1 or src2, but if these
were undefined they could end up using different implicit_defed
virtual registers. Force these to use one undef vreg or pick the
defined other register.
Also fixes producing invalid nodes without the right number of
inputs when src2 is undef.
llvm-svn: 309743
|
| |
|
|
| |
llvm-svn: 309740
|
| |
|
|
|
|
| |
IMHO this is a bit more readable.
llvm-svn: 309739
|
| |
|
|
|
|
|
|
|
| |
Includes a hack to fix the type selected for
the GlobalAddress of the function, which will be
fixed by changing the default datalayout to use
generic pointers for 0.
llvm-svn: 309732
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We already have information about static alloca stack locations in our
side table. Emitting instructions for them is inefficient, and it only
happens when the address of the alloca has been materialized within the
current block, which isn't often.
Reviewers: aprantl, probinson, dblaikie
Subscribers: jfb, dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits, aheejin
Differential Revision: https://reviews.llvm.org/D36117
llvm-svn: 309729
|
| |
|
|
| |
llvm-svn: 309726
|
| |
|
|
| |
llvm-svn: 309723
|
| |
|
|
|
|
| |
Add simple int immediate cost function.
llvm-svn: 309721
|
| |
|
|
| |
llvm-svn: 309719
|
| |
|
|
|
|
|
| |
This reverts commit r309680 which appears to be raising an assertion
in the test-suite.
llvm-svn: 309717
|
| |
|
|
|
|
| |
test/fuzzer-printcovpcs.test until this can be fixed on Windows
llvm-svn: 309716
|
| |
|
|
|
|
|
|
| |
Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431).
Merged SSE_VEC_BIT_ITINS_P + SSE_BIT_ITINS_P as we were interchanging between them.
llvm-svn: 309715
|
| |
|
|
|
|
|
|
| |
This only affects very small memcmp on x86 for now, but it
will become more important if we allow vector-sized load and
compares.
llvm-svn: 309711
|
| |
|
|
|
|
| |
Replace check with check that consuming store has the same type.
llvm-svn: 309708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The 64-bit 'and' with immediate instruction only supports a 32-bit immediate. So for larger constants we have to load the constant into a register first. If the immediate happens to be a mask we can use the BEXTRI instruction to perform the masking. We already do something similar using the BZHI instruction from the BMI2 instruction set.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36129
llvm-svn: 309706
|
| |
|
|
|
|
|
|
| |
Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431).
Checked on Agner that these actually match the UNPACK schedules, but better to include a separate class
llvm-svn: 309701
|
| |
|
|
| |
llvm-svn: 309699
|
| |
|
|
|
|
| |
Move candidate check from later check to initial candidate check.
llvm-svn: 309698
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
FEntryInserter pass unconditionally derefs the first Instruction
in the first Basic Block. The pass crashes when the first
BasicBlock is empty. Fix the crash by not dereferencing the basic
Block iterator. This fixes an issue observed when building Linux kernel
4.4 with clang.
Fixes PR33971.
Reviewers: hfinkel, niravd, dblaikie
Reviewed By: niravd
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D35979
llvm-svn: 309694
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
even when BWI instructions are supported. Always use VMOVDQA32/VMOVDQU32.
We were already using the 32 bit element opcode if BWI isn't enabled, but there's no reason to change opcode if we have BWI. We will still use the 8/16 opcodes for masked stores though.
This allows us to use the aligned opcode when we can which makes our test output more consistent between different modes. It also reduces the number of isel patterns we need.
This is a slight inconsistency with loads which default to 64 bit element opcodes. I'll probably rectify that in a future patch.
Differential Revision: https://reviews.llvm.org/D35978
llvm-svn: 309693
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
assert
Summary:
As far as I can tell the earlier call getLimitedValue will guaranteed ShiftAmt is saturated to BitWidth-1 preventing it from ever being equal or greater than BitWidth.
At one point in the past the getLimitedValue call was only passed BitWidth not BitWidth - 1. This would have allowed the equality case to get here. And in fact this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include > which should have never been possible.
Reviewers: spatel, majnemer, davide
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36123
llvm-svn: 309690
|
| |
|
|
|
|
| |
Post-commit review feedback from Paul Robinson on r309630. Thanks Paul!
llvm-svn: 309685
|
| |
|
|
| |
llvm-svn: 309683
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.
Reviewers: efriedma, RKSimon, spatel
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35566
llvm-svn: 309680
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch enables control flow optimization for
variations of BBIT instruction. In this case
optimization removes unnecessary branch after
BBIT instruction.
Differential Revision: https://reviews.llvm.org/D35359
llvm-svn: 309679
|
| |
|
|
|
|
|
|
|
| |
Certain operations require vector of i1 values. However, for Hexagon
architecture compatibility, they need to be represented as vector of i8.
Patch by Suyog Sarda.
llvm-svn: 309677
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D35916
llvm-svn: 309675
|
| |
|
|
|
|
|
|
| |
given instr does not have sched model then we try to calculate the latecy/throughput with help of itineraries.
Differential Revision https://reviews.llvm.org/D35997
llvm-svn: 309666
|
| |
|
|
|
|
|
| |
One more assertion of this kind. It is a preparation step for generalizing
to the case of stride not equal to +1/-1.
llvm-svn: 309663
|
| |
|
|
|
|
|
| |
This came up as a point of confusion while working on a fundamental
problem with the combination of CGSCC iteration and the inliner.
llvm-svn: 309662
|
| |
|
|
|
|
|
| |
We should never return zero steps, ensure this fact by adding
a sanity check when we are analyzing the induction variable.
llvm-svn: 309661
|
| |
|
|
|
|
|
|
|
| |
condition value through case edges"
This causes assertion failures in (a somewhat old version of) SpiderMonkey.
I have already forwarded reproduction instructions to the patch author.
llvm-svn: 309659
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
To the best of my knowledge -metarenamer is used in two cases:
1) obfuscate names, when e.g. they contain informations that
can't be shared.
2) Improve clarity of the textual IR for testcases.
One of the usecases if getting the output of `opt` and passing it
to the lli interpreter to run the test. If metarenamer renames
@main, lli can't find an entry point.
llvm-svn: 309657
|