| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.
This affects the way that types and type sets are printed, and the
tests relying on that have been updated.
There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)
For more information, please refer to the review page.
Differential Revision: https://reviews.llvm.org/D31951
llvm-svn: 313271
|
| |
|
|
|
|
|
|
| |
Patch by Jesper Antonsson.
Differential Revision: https://reviews.llvm.org/D37611
llvm-svn: 313268
|
| |
|
|
|
|
|
|
|
|
| |
We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.
Original patch by @tstellar
Differential Revision: https://reviews.llvm.org/D19325
llvm-svn: 313251
|
| |
|
|
|
|
|
|
|
|
| |
DAGCombine etc.
Use RotAmt.urem(VTBits) instead of AND(RotAmt, VTBits - 1)
TBH I don't expect non-power-of-2 types to be created, but it makes the logic clearer and matches what we do in other rotation combines.
llvm-svn: 313245
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the instrumentation map
Summary:
XRay had been assuming that the previous section is the "text" section
of the function when lowering the instrumentation map. Unfortunately
this is not a safe assumption, because we may be coming from lowering
debug type information for the function being lowered.
This fixes an issue with combining -gsplit-dwarf, -generate-type-units,
-debug-compile and -fxray-instrument for sole member functions. When the
split dwarf section is stripped, we're left with references from the
xray_instr_map to the debug section. The change now uses the function's
symbol instead of the previous section's start symbol.
We found the bug while attempting to strip the split debug sections off
an XRay-instrumented object file, which had a peculiar edge-case for
single-function classes where the single function is being lowered.
Because XRay had assocaited the instrumentation map for a function to
the debug types section instead of the function's section, the objcopy
call will fail due to the misplaced reference from the xray_instr_map
section.
Reviewers: pcc, dblaikie, echristo
Subscribers: llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D37791
llvm-svn: 313233
|
| |
|
|
| |
llvm-svn: 313214
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for large BBs."
This caused PR34596.
> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619
llvm-svn: 313213
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineScheduler when clustering loads or stores checks if base
pointers point to the same memory. This check is done through
comparison of base registers of two memory instructions. This
works fine when instructions have separate offset operand. If
they require a full calculated pointer such instructions can
never be clustered according to such logic.
Changed shouldClusterMemOps to accept base registers as well and
let it decide what to do about it.
Differential Revision: https://reviews.llvm.org/D37698
llvm-svn: 313208
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we used a size of '1' for VLAs because we weren't sure what
MSVC did. However, MSVC does support declaring an array without a size,
for which it emits an array type with a size of zero. Clang emits the
same DI metadata for VLAs and arrays without bound, so we would describe
arrays without bound as having one element. This lead to Microsoft
debuggers only printing a single element.
Emitting a size of zero appears to cause these debuggers to search the
symbol information to find a definition of the variable with accurate
array bounds.
Fixes http://crbug.com/763580
llvm-svn: 313203
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
HoistSpillHelper.
This is to fix PR34502. After rL311401, the live range of spilled vreg will be
cleared. HoistSpill need to use the live range of the original vreg before splitting
to know the moving range of the spills. The patch saves a copy of live interval for
the spilled vreg inside of HoistSpillHelper.
Differential Revision: https://reviews.llvm.org/D37578
llvm-svn: 313197
|
| |
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 313194
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
To improve CodeView quality for static member functions, we need to make the
static explicit. In addition to a small change in LLVM's CodeViewDebug to
return the appropriate MethodKind, this requires a small change in Clang to
note the staticness in the debug info metadata.
Subscribers: aprantl, hiraditya
Differential Revision: https://reviews.llvm.org/D37715
llvm-svn: 313192
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: It pollutes the global namespace otherwise.
Patch by: Bevin Hansson
Reviewers: jonpa
Reviewed By: jonpa
Subscribers: MatzeB, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D37555
llvm-svn: 313148
|
| |
|
|
|
|
|
|
|
| |
This flag is unnecessary for testing because we can get the coverage
we need by adjusting CU attributes.
Differential Revision: https://reviews.llvm.org/D37725
llvm-svn: 313079
|
| |
|
|
|
|
|
|
| |
This allows the flag to be persisted through to LTO.
Differential Revision: https://reviews.llvm.org/D37655
llvm-svn: 313078
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes
the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets.
Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce.
Differential Revision : https: // reviews.llvm.org/D32776
llvm-svn: 313061
|
| |
|
|
|
|
|
|
|
|
|
| |
Looks like these were copied from the ELF sections but
don't apply to Wasm and were not used anywhere.
Also remove unused Wasm methods in MCContext.
Differential Revision: https://reviews.llvm.org/D37633
llvm-svn: 313058
|
| |
|
|
|
|
|
| |
This reverts commit r313047 as it is causing buildbot failure (lldb inline
stepping tests).
llvm-svn: 313057
|
| |
|
|
|
|
|
|
|
|
|
|
| |
A prologue-end line record is emitted with an incorrect associated address,
which causes a debugger to show the beginning of function body to be inside
the prologue.
Patch written by Carlos Alberto Enciso.
Differential Revision: https://reviews.llvm.org/D37625
llvm-svn: 313047
|
| |
|
|
|
|
| |
warnings; other minor fixes (NFC).
llvm-svn: 312971
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
GEP merging can sometimes increase the number of live values and register
pressure across control edges and cause performance problems particularly if the
increased register pressure results in spills.
This change implements GEP unmerging around an IndirectBr in certain cases to
mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP
merging has happened.)
With this patch, the Python interpreter loop runs faster by ~5%.
Reviewers: sanjoy, hfinkel
Reviewed By: hfinkel
Subscribers: eastig, junbuml, llvm-commits
Differential Revision: https://reviews.llvm.org/D36772
llvm-svn: 312930
|
| |
|
|
|
|
|
|
| |
getShiftAmountTy. NFCI
getShiftAmountTy already returns the vector type when called for vectors.
llvm-svn: 312924
|
| |
|
|
| |
llvm-svn: 312919
|
| |
|
|
|
|
|
|
|
| |
After the split of the Scatter operation, the order of the new instructions is well defined - Lo goes before Hi. Otherwise the semantic of Scatter (from LSB to MSB) is broken.
I'm chaining 2 nodes to prevent reordering.
Differential Revision https://reviews.llvm.org/D37670
llvm-svn: 312894
|
| |
|
|
| |
llvm-svn: 312852
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- Use range based for
- Variable names should start with upper case
- Add `const`
- Change class name to match filename
- Fix doxygen comments
- Use MCPhysReg instead of unsigned
- Use references instead of pointers where things cannot be nullptr
- Misc coding style improvements
llvm-svn: 312846
|
| |
|
|
| |
llvm-svn: 312845
|
| |
|
|
| |
llvm-svn: 312844
|
| |
|
|
|
|
|
|
|
|
|
| |
rL312641 Allowed llvm.memcpy/memset/memmove to be tail calls when parent
function return the intrinsics's first argument. However on arm-none-eabi
platform, llvm.memcpy will be expanded to __aeabi_memcpy which doesn't
have return value. The fix is to check the libcall name after expansion
to match "memcpy/memset/memmove" before allowing those intrinsic to be
tail calls.
llvm-svn: 312799
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37600
llvm-svn: 312797
|
| |
|
|
|
|
|
|
|
| |
by reusing more of the existing machinery
This is a follow-up to r312169.
Thanks to Björn Pettersson for the testcase!
llvm-svn: 312773
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes code-gen for XRay in PPC. The regression wasn't caught by
codegen tests which we add in this change.
What happened was the following:
- For tail exits, we used to unconditionally prepend the returns/exits
with a pseudo-instruction that gets lowered to the instrumentation
sled (and leave the actual return/exit instruction as-is).
- Changes to the XRay instrumentation pass caused the tail exits to
suddenly also emit the tail exit pseudo-instruction, since the check
for whether a return instruction was also a call instruction meant it
was a tail exit instruction.
- None of the tests caught the regression either due to non-existent
tests, or the tests being disabled/removed for continuous breakage.
This change re-introduces some of the basic tests and verifies that
we're back to a state that allows the back-end to generate appropriate
XRay instrumented binaries for PPC in the presence of tail exits.
Reviewers: echristo, timshen
Subscribers: nemanjai, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D37570
llvm-svn: 312772
|
| |
|
|
|
|
|
| |
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.
llvm-svn: 312759
|
| |
|
|
|
|
|
|
|
|
| |
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318
Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490
llvm-svn: 312758
|
| |
|
|
|
|
|
|
| |
It's meaningless and takes up extra space in the line table.
Differential Revision: https://reviews.llvm.org/D37364
llvm-svn: 312751
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.
This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements, but also adds a few
regressions.
llvm-svn: 312730
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.
In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.
Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
Reviewed By: fhahn
Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D36619
llvm-svn: 312719
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This function is used in D36619 to update the instruction depths
incrementally.
Reviewers: efriedma, Gerolf, MatzeB, fhahn
Reviewed By: fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36696
llvm-svn: 312714
|
| |
|
|
|
|
|
|
|
|
| |
removing them"
This temporarily reverts commit 463fa38 (r311401).
See https://bugs.llvm.org/show_bug.cgi?id=34502
llvm-svn: 312708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.
To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:
- computeLiveIns() just computes the live-ins in a LivePhysRegs set.
- addLiveIns() applies the live-ins to a block live-in list.
- computeAndAddLiveIns() is a convenience function combining the other
two functions and behaving like computeLiveIns() before this patch.
Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>
Differential Revision: https://reviews.llvm.org/D37034
llvm-svn: 312668
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When if-converting a diamond, two separate blocks will be placed back
to back to form a straight line code. To ensure correctness of the
liveness information, any registers that are live in the second block
should not be killed in the first block, even if they were in the
original code.
Additionally, when the two blocks share common instructions at the
beginning, these instructions will not be duplicated, but only placed
once, before both of the blocks. Since the function "isIdenticalTo"
(as used here) ignores kill flags, the common initial code in one
block may have a kill flag for a register that is live in the other
block.
Because the code that removes kill flags only runs for the non-common
parts of the predicated blocks, a kill flag mismatch in the common
code could still lead to a live register being killed prematurely.
llvm-svn: 312654
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
function return the intrinsics's first argument.
llvm.memcpy/memset/memmove return void but they will return the first
argument after they are expanded as libcalls. Now if the parent function
has any return value, llvm.memcpy cannot be turned into tail call after
expansion.
The patch is to handle that case in SelectionDAGBuilder so when caller
function return the same value as the first argument of llvm.memcpy,
tail call is allowed.
Differential Revision: https://reviews.llvm.org/D37406
llvm-svn: 312641
|
| |
|
|
|
|
| |
we don't create a BUILD_VECTOR with an illegal type after type legalization.
llvm-svn: 312621
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
S_UDT records are basically the "bridge" between the debugger's
expression evaluator and the type information. If you type
(Foo*)nullptr into the watch window, the debugger looks for an
S_UDT record named Foo. If it can find one, it displays your type.
Otherwise you get an error.
We have always understood this to mean that if you have code like
this:
struct A {
int X;
};
struct B {
typedef A AT;
AT Member;
};
that you will get 3 S_UDT records. "A", "B", and "B::AT". Because
if you were to type (B::AT*)nullptr into the debugger, it would
need to find an S_UDT record named "B::AT".
But "B::AT" is actually the S_UDT record that would be generated
if B were a namespace, not a struct. So the debugger needs to be
able to distinguish this case. So what it does is:
1. Look for an S_UDT named "B::AT". If it finds one, it knows
that AT is in a namespace.
2. If it doesn't find one, split at the scope resolution operator,
and look for an S_UDT named B. If it finds one, look up the type
for B, and then look for AT as one of its members.
With this algorithm, S_UDT records for nested typedefs are not just
unnecessary, but actually wrong!
The results of implementing this in clang are dramatic. It cuts
our /DEBUG:FASTLINK PDB sizes by more than 50%, and we go from
being ~20% larger than MSVC PDBs on average, to ~40% smaller.
It also slightly speeds up link time. We get about 10% faster
links than without this patch.
Differential Revision: https://reviews.llvm.org/D37410
llvm-svn: 312583
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.
Reviewers: majnemer
Subscribers: eraman, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D36904
llvm-svn: 312569
|
| |
|
|
|
|
|
|
|
|
| |
forwarding""
This crashes on boringSSL on PPC (will send reduced testcase)
This reverts commit r312328.
llvm-svn: 312490
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
references in .text
Summary:
This is a re-roll of D36615 which uses PLT relocations in the back-end
to the call to __xray_CustomEvent() when building in -fPIC and
-fxray-instrument mode.
Reviewers: pcc, djasper, bkramer
Subscribers: sdardis, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D37373
llvm-svn: 312466
|
| |
|
|
|
|
|
|
|
|
| |
The function combineShuffleToVectorExtend in DAGCombine might generate an illegal typed node after "legalize types" phase, causing assertion on non-simple type to fail afterwards.
Adding a type check in case the combine is running after the type legalize pass.
Differential Revision: https://reviews.llvm.org/D37330
llvm-svn: 312438
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If getHexUint reads in a hex 0, it will create an APInt with a value of 0.
The number of active bits on this APInt is used to calculate the bitwidth of
Result. The number of active bits is defined as an APInt's bitwidth - its
number of leading 0s. Since this APInt is 0, its bitwidth and number of leading
0s are equal.
Thus, Result is constructed with a bitwidth of 0, triggering an APInt assert.
This commit fixes that by checking if the APInt is equal to 0, and setting the
bitwidth to 32 if it is. Otherwise, it sets the bitwidth using getActiveBits.
This caused issues when compiling MIR files with successor probabilities. In
the case that a successor is tagged with a probability of 0, this assert would
fire on debug builds.
https://reviews.llvm.org/D37401
llvm-svn: 312387
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A register in CodeGen can be marked as reserved: In that case we
consider the register always live and do not use (or rather ignore)
kill/dead/undef operand flags.
LiveIntervalAnalysis however tracks liveness per register unit (not per
register). We already needed adjustments for this in r292871 to deal
with super/sub registers. However I did not look at aliased register
there. Looking at ARM:
FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV
(regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit
(FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers.
This shared register unit was previously considered non-reserved,
however given that we uses of the reserved FPSCR potentially violate
some rules (like uses without defs) we should make FPSCR~FPSCR_NZCV
reserved too and stop tracking liveness for it.
This patch:
- Defines a register unit as reserved when: At least for one root
register, the root register and all its super registers are reserved.
- Adjust LiveIntervals::computeRegUnitRange() for new reserved
definition.
- Add MachineRegisterInfo::isReservedRegUnit() to have a canonical way
of testing.
- Stop computing LiveRanges for reserved register units in HMEditor even
with UpdateFlags enabled.
- Skip verification of uses of reserved reg units in the machine
verifier (this usually didn't happen because there would be no cached
liverange but there is no guarantee for that and I would run into this
case before the HMEditor tweak, so may as well fix the verifier too).
Note that this should only affect ARMs FPSCR/FPSCR_NZCV registers today;
aliased registers are rarely used, the only other cases are hexagons
P0-P3/P3_0 and C8/USR pairs which are not mixing reserved/non-reserved
registers in an alias.
Differential Revision: https://reviews.llvm.org/D37356
llvm-svn: 312348
|