| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
| |
getOrCompHotCountThreshold/getOrCompColdCountThreshold introduced in
https://reviews.llvm.org/D45377 contain a bad mistake and will only return 1 or 0
instead of the true hot/cold cutoff value. The patch fixes the mistake. But the
mistake seems not causing big performance difference according to internal server
benchmarks testing.
Differential Revision: https://reviews.llvm.org/D50370
llvm-svn: 339162
|
| |
|
|
|
|
|
|
|
|
| |
serial chain dependency.
Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code.
Differential Revision: https://reviews.llvm.org/D50374
llvm-svn: 339157
|
| |
|
|
|
|
|
|
| |
nonvolatile_store/nonvolatile_load pattern fragment in TargetSelectionDAG.td
Differential Revision: https://reviews.llvm.org/D50358
llvm-svn: 339156
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In combineMetadata, we should be able to preserve K's nonnull metadata,
if K does not move. This condition should hold for all replacements by
NewGVN/GVN, but I added a bunch of assertions to verify that.
Fixes PR35038.
There probably are additional kinds of metadata that could be preserved
using similar reasoning. This is follow-up work.
Reviewers: dberlin, davide, efriedma, nlopes
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D47339
llvm-svn: 339149
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50236
llvm-svn: 339148
|
| |
|
|
|
|
|
|
| |
This was missed in D50185.
NFC until we add actual non-uniform support to BuildSDIV (similar BuildUDIV support in D49248) - for now it just early outs.
llvm-svn: 339147
|
| |
|
|
|
|
| |
This was missed in D49248
llvm-svn: 339146
|
| |
|
|
| |
llvm-svn: 339144
|
| |
|
|
|
|
|
| |
Update the comment in nextGroup since the ProcResourceCounters are not anymore
always decremented with '1'.
llvm-svn: 339140
|
| |
|
|
|
|
|
|
| |
Remove the redundant check against zero when updating ProcResourceCounters in
nextGroup(), as pointed out in https://reviews.llvm.org/D50187.
Review: Ulrich Weigand.
llvm-svn: 339139
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This function is shared between both implementations. I am not sure if
Utils/Local.h is the best place though.
Reviewers: davide, dberlin, efriedma, xbolva00
Reviewed By: efriedma, xbolva00
Differential Revision: https://reviews.llvm.org/D47337
llvm-svn: 339138
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes an inconsistency in code generation when compiling with or
without debug information (-g). When debug information is available in
an empty block, the original test would fail, resulting in possibly
different code.
Patch by: Jeroen Dobbelaere
Differential revision: https://reviews.llvm.org/D49467
llvm-svn: 339129
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When potential jump instruction and target are in the same segment, use
jump instruction with immediate field.
In cases where offset does not fit immediate value of a bc/j instructions,
offset is stored into register, and then jump register instruction is used.
Differential Revision: https://reviews.llvm.org/D48019
llvm-svn: 339126
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The accelerator tables use the debug_str section to store their strings.
However, they do not support the indirect method of access that is
available for the debug_info section (DW_FORM_strx et al.).
Currently our code is assuming that all strings can/will be referenced
indirectly, and puts all of them into the debug_str_offsets section.
This is generally true for regular (unsplit) dwarf, but in the DWO case,
most of the strings in the debug_str section will only be used from the
accelerator tables. Therefore the contents of the debug_str_offsets
section will be largely unused and bloating the main executable.
This patch rectifies this by teaching the DwarfStringPool to
differentiate between strings accessed directly and indirectly. When a
user inserts a string into the pool it has to declare whether that
string will be referenced directly or not. If at least one user requsts
indirect access, that string will be assigned an index ID and put into
debug_str_offsets table. Otherwise, the offset table is skipped.
This approach reduces the overall binary size (when compiled with
-gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is
shrunk by 99%).
Reviewers: probinson, dblaikie, JDevlieghere
Subscribers: aprantl, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D49493
llvm-svn: 339122
|
| |
|
|
|
|
|
|
|
|
| |
This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators.
It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV.
Differential Revision: https://reviews.llvm.org/D49248
llvm-svn: 339121
|
| |
|
|
|
|
|
|
|
|
| |
I was trying to add a test case for LLD and found that it
is impossible to set sh_entsize via yaml.
The patch implements the missing part.
Differential revision: https://reviews.llvm.org/D50235
llvm-svn: 339113
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is necessary to add a VI specific builtin,
__builtin_amdgcn_s_dcache_wb. We already have an
overly specific feature for one of these builtins,
for s_memrealtime. I'm not sure whether it's better
to add more of those, or to get rid of that and merge
it with vi-insts.
Alternatively, maybe this logically goes with scalar-stores?
llvm-svn: 339104
|
| |
|
|
|
|
| |
Src0 doesn't really convey any meaning to what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from.
llvm-svn: 339101
|
| |
|
|
|
|
| |
getValue is more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather.
llvm-svn: 339096
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change uses a single offset pointer used throughout the
implementation of the individual record parsers. This allows us to
report where in a trace file parsing failed.
We're still in an intermediate step here as we prepare to refactor this
further into a set of types and use object-oriented design principles
for a cleaner implementation. The next steps will be to allow us to
parse/dump files in a streaming fashion and incrementally build up the
structures in memory instead of the current all-or-nothing approach.
Reviewers: kpw, eizan
Reviewed By: kpw
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D50169
llvm-svn: 339092
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Logic for tracking implicit control flow instructions was added to GVN to
perform PRE optimizations correctly. It appears that GVN is not the only
optimization that sometimes does PRE, so this logic is required in other
places (such as Jump Threading).
This is an NFC patch that encapsulates all ICF-related logic in a dedicated
utility class separated from GVN.
Differential Revision: https://reviews.llvm.org/D40293
llvm-svn: 339086
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Wasm does not have direct counterparts to some of LLVM IR's atomicrmw
instructions (min, max, umin, umax, and nand). This enables atomic
expansion using cmpxchg instruction within a loop for those atomicrmw
instructions.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D49440
llvm-svn: 339084
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The spec only defines a SIMD expression type of V128 and
leaves interpretation of different vector types to the instructions.
Differential Revision: https://reviews.llvm.org/D50367
Patch by Thomas Lively
llvm-svn: 339082
|
| |
|
|
| |
llvm-svn: 339078
|
| |
|
|
| |
llvm-svn: 339077
|
| |
|
|
|
|
|
| |
This usually avoids some re-packing code, and may
help find canonical sources.
llvm-svn: 339072
|
| |
|
|
|
|
| |
This will make more complex combines easier.
llvm-svn: 339070
|
| |
|
|
| |
llvm-svn: 339069
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
section symbol.
This matches our behaviour for regular (i.e. relocated) references to
private symbols and therefore avoids needing to unnecessarily write
address-significant .L symbols to the object file's symbol table,
which can interfere with stack traces.
Fixes check-cfi after r339050.
llvm-svn: 339066
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Everything should quiet, and I think everything should
flush.
I assume the min3/med3/max3 follow the same rules
as regular min/max for flushing, which should at
least be conservatively correct.
There are still more operations that need to
be handled.
llvm-svn: 339065
|
| |
|
|
|
|
|
|
|
| |
Not sure why this was checking for denormals for f16.
My interpretation of the IEEE standard is conversions
should produce a canonical result, and the ISA manual
says denormals are created when appropriate.
llvm-svn: 339064
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If denormals are enabled, denormals are canonical.
Also fix a few other issues. minnum/maxnum are supposed
to canonicalize. Temporarily improve workaround for the
instruction behavior change in gfx9.
Handle selects and fcopysign.
The tests were also largely broken, since they were
checking for a flush used on some targets after the
store of the result.
llvm-svn: 339061
|
| |
|
|
| |
llvm-svn: 339059
|
| |
|
|
|
|
|
|
|
|
|
| |
This assert fires when attempting to extract a subregister from the
global PIC base register. This virtual register SD node is not in the
VRBaseMap, so we shouldn't call getVR to look it up there. If this is a
RegisterSDNode, we should be able to use the virtual register directly.
Fixes PR38385
llvm-svn: 339056
|
| |
|
|
|
|
|
|
|
| |
Properly shrink `pow()` to `powf()` as a binary function and, when no other
simplification applies, do not discard it.
Differential revision: https://reviews.llvm.org/D50113
llvm-svn: 339046
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50258
llvm-svn: 339045
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expand isFNEG so that we generate the appropriate F(N)M(ADD|SUB)
instructions in more cases. For example, the following sequence
a = _mm256_broadcast_ss(f)
d = _mm256_fnmadd_ps(a, b, c)
generates an fsub and fma without this patch and an fnma with this
change.
Reviewers: craig.topper
Subscribers: llvm-commits, davidxl, wmi
Differential Revision: https://reviews.llvm.org/D48467
llvm-svn: 339043
|
| |
|
|
|
|
|
|
|
|
| |
sure the store isn't volatile
If the store is volatile this might be a memory mapped IO access. In that case we shouldn't generate a load that didn't exist in the source
Differential Revision: https://reviews.llvm.org/D50270
llvm-svn: 339041
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for all the uses from the same def is done.
We run into a compile time problem with flex generated code combined with
`-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants
outside of a big loop, and drastically increases the compile time in global
register splitting and copy coalescing. https://reviews.llvm.org/D49353
relieves the problem in global splitting. This patch is to handle the problem
in copy coalescing.
About the situation where the problem in copy coalescing happens. After
machineLICM, we have several defs outside of a big loop with hundreds or
thousands of uses inside the loop. Rematerialization in copy coalescing
happens for each use and everytime rematerialization is done, shrinkToUses
will be called to update the huge live interval. Because we have 'n' uses
for a def, and each live interval update will have at least 'n' complexity,
the total update work is n^2.
To fix the problem, we try to do the live interval update work in a collective
way. If a def has many copylike uses larger than a threshold, each time
rematerialization is done for one of those uses, we won't do the live interval
update in time but delay that work until rematerialization for all those uses
are completed, so we only have to do the live interval update work once.
Delaying the live interval update could potentially change the copy coalescing
result, so we hope to limit that change to those defs with many
(like above a hundred) copylike uses, and the cutoff can be adjusted by the
option -mllvm -late-remat-update-threshold=xxx.
Differential Revision: https://reviews.llvm.org/D49519
llvm-svn: 339035
|
| |
|
|
|
|
|
|
|
|
| |
On windows when raw_fd_ostream::write_impl calls write, a 32 bit input is required for character count. As a variable with size_t is used for this argument, on x64 integral demotion occurs. In the case of large files an infinite loop follows.
See: https://bugs.llvm.org/show_bug.cgi?id=37926
This fix allows the output of files larger than the previous int32 limit.
Differential Revision: https://reviews.llvm.org/D48948
llvm-svn: 339027
|
| |
|
|
|
|
| |
Appears from expansion of some packed cases.
llvm-svn: 339025
|
| |
|
|
|
|
|
| |
Also fix apparently missing test coverage for any of the
handling here.
llvm-svn: 339023
|
| |
|
|
| |
llvm-svn: 339021
|
| |
|
|
| |
llvm-svn: 339020
|
| |
|
|
| |
llvm-svn: 339019
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the
same type. This fixes an assertion failure in VerifySDNode.
Reviewers: SjoerdMeijer, t.p.northover, javed.absar
Reviewed By: SjoerdMeijer
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D50202
llvm-svn: 339013
|
| |
|
|
|
|
|
|
|
|
| |
Currently we use #pragma push_macro(LLVM_DEBUG) to fiddle with the LLVM_DEBUG
macro so that we can silence debugging the Knuth division algorithm unless it's
actually desired. Unfortunately this is incompatible with enabling modules
while building LLVM (via LLVM_ENABLE_MODULES=ON), probably due to a bug being
fixed by D33004.
llvm-svn: 339009
|
| |
|
|
|
|
|
|
|
| |
ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes
out that part of any addend in an object file. But it only does that for
symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting
that extra bit in other cases.
llvm-svn: 339007
|
| |
|
|
|
|
|
|
|
|
|
| |
AND"
The patch was reverted because of bug detected by sanitizer. The bug is fixed,
respective tests added.
Differential Revision: https://reviews.llvm.org/D50172
llvm-svn: 339005
|
| |
|
|
|
|
|
|
|
| |
Multiple failues reported by sanitizer-x86_64-linux, seem to be caused by this
patch. Reverting to see if they sustain without it.
Differential Revision: https://reviews.llvm.org/D50172
llvm-svn: 338994
|