| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
| |
A test case that crached is added to avx512-trunc.ll.
(PR31589)
llvm-svn: 292479
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.
Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals
Reviewers: t.p.northover, ab, rovka, qcolombet
Reviewed By: qcolombet
Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27338
llvm-svn: 292478
|
| |
|
|
| |
llvm-svn: 292476
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND.
We already do this for scalar i1 operations so I just extended it to vectors of i1.
Reviewers: zvi, delena
Reviewed By: delena
Subscribers: guyblank, llvm-commits
Differential Revision: https://reviews.llvm.org/D28888
llvm-svn: 292474
|
| |
|
|
|
|
|
|
|
|
|
|
| |
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
llvm-svn: 292473
|
| |
|
|
|
|
|
|
| |
They seem to produce nonsense results when used.
This should be applied to the release branch.
llvm-svn: 292472
|
| |
|
|
|
|
| |
identical.
llvm-svn: 292469
|
| |
|
|
|
|
| |
subvector broadcasts that can't fold the load.
llvm-svn: 292466
|
| |
|
|
|
|
|
|
| |
LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.
llvm-svn: 292464
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Type identifiers are exported by:
- Adding coarse-grained information about how to test the type
identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols)
containing fine-grained information about the type identifier.
Differential Revision: https://reviews.llvm.org/D28424
llvm-svn: 292462
|
| |
|
|
| |
llvm-svn: 292461
|
| |
|
|
| |
llvm-svn: 292460
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
collection
Summary:
SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info:
* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)
This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):
-gmlt(orig) -gmlt(patched) -g
433.milc 4.68% 5.40% 19.73%
444.namd 8.45% 8.93% 45.99%
447.dealII 97.43% 115.21% 374.89%
450.soplex 27.75% 31.88% 126.04%
453.povray 21.81% 26.16% 92.03%
470.lbm 0.60% 0.67% 1.96%
482.sphinx3 5.77% 6.47% 26.17%
400.perlbench 17.81% 19.43% 73.08%
401.bzip2 3.73% 3.92% 12.18%
403.gcc 31.75% 34.48% 122.75%
429.mcf 0.78% 0.88% 3.89%
445.gobmk 6.08% 7.92% 42.27%
456.hmmer 10.36% 11.25% 35.23%
458.sjeng 5.08% 5.42% 14.36%
462.libquantum 1.71% 1.96% 6.36%
464.h264ref 15.61% 16.56% 43.92%
471.omnetpp 11.93% 15.84% 60.09%
473.astar 3.11% 3.69% 14.18%
483.xalancbmk 56.29% 81.63% 353.22%
geomean 15.60% 18.30% 57.81%
Debug info size change for -gmlt binary with this patch:
433.milc 13.46%
444.namd 5.35%
447.dealII 18.21%
450.soplex 14.68%
453.povray 19.65%
470.lbm 6.03%
482.sphinx3 11.21%
400.perlbench 8.91%
401.bzip2 4.41%
403.gcc 8.56%
429.mcf 8.24%
445.gobmk 29.47%
456.hmmer 8.19%
458.sjeng 6.05%
462.libquantum 11.23%
464.h264ref 5.93%
471.omnetpp 31.89%
473.astar 16.20%
483.xalancbmk 44.62%
geomean 16.83%
Reviewers: davidxl, echristo, dblaikie
Reviewed By: echristo, dblaikie
Subscribers: aprantl, probinson, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25434
llvm-svn: 292457
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This changes the vectorizer to explicitly use the loopsimplify and lcssa utils,
instead of "requiring" the transformations as if they were analyses.
This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA
for all loops, but rather only for the loops we expect to modify.
Differential Revision: https://reviews.llvm.org/D28868
llvm-svn: 292456
|
| |
|
|
|
|
|
|
| |
- Fix doxygen comments: Do not repeat name, remove duplicated doxygen
comment (on declaration + implementation), etc.
- Use more range based for
llvm-svn: 292455
|
| |
|
|
|
|
|
|
|
| |
There's no neg.f16 instruction, so negation has to
be done via subtraction from zero.
Differential Revision: https://reviews.llvm.org/D28876
llvm-svn: 292452
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.
Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.
No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.
Differential Revision: https://reviews.llvm.org/D28587
llvm-svn: 292449
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.
Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.
Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.
Differential Revision: https://reviews.llvm.org/D28073
llvm-svn: 292447
|
| |
|
|
| |
llvm-svn: 292446
|
| |
|
|
| |
llvm-svn: 292445
|
| |
|
|
|
|
|
|
|
| |
r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.
PR31589 has the new reproducer.
llvm-svn: 292444
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when nothing is to be printed
Before, it would print a sequence of:
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
...
for every single function in the module.
llvm-svn: 292442
|
| |
|
|
|
|
| |
explicit; NFCI
llvm-svn: 292440
|
| |
|
|
|
|
|
|
| |
encode => endcode.
Differential Revision: https://reviews.llvm.org/D28866
llvm-svn: 292438
|
| |
|
|
|
|
|
| |
I missed deleting this check when I refactored this chunk in:
https://reviews.llvm.org/rL292260
llvm-svn: 292433
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28839
llvm-svn: 292431
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28838
llvm-svn: 292430
|
| |
|
|
| |
llvm-svn: 292425
|
| |
|
|
|
|
|
|
|
|
|
| |
We currently check whether a reduction has a single outside user. We don't
really need to require that - we just need to make sure a single value is
used externally. The number of external users of that value shouldn't actually
matter.
Differential Revision: https://reviews.llvm.org/D28830
llvm-svn: 292424
|
| |
|
|
|
|
|
|
|
|
|
| |
ARM seems to prefer that long literals be formed from their little end in
order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on
Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section
4.14).
Differential revision: https://reviews.llvm.org/D28697
llvm-svn: 292422
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28842
llvm-svn: 292421
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without this, we're stressing the RAUW of unique nodes,
which is a costly operation. This is intended to limit
the number of RAUW, and is very effective on the total
link-time of opt with ThinLTO, before:
real 4m4.587s user 15m3.401s sys 0m23.616s
after:
real 3m25.261s user 12m22.132s sys 0m24.152s
Reviewers: tejohnson, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28751
llvm-svn: 292420
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.
With more super-regs we would need to allocate adjacent registers
and constraint regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers
allocation even more, resulting in excessive spilling.
Differential Revision: https://reviews.llvm.org/D28782
llvm-svn: 292413
|
| |
|
|
|
|
|
| |
Legalize stores of types that are too wide by breaking them up into
sequences of smaller stores.
llvm-svn: 292412
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to
available_externally when possible. If they had an initializer in the
global_ctors list, a comdat group was being created. This code
already had logic to skip available_externally defs, but now the
EliminateAvailableExternally pass will drop these symbols to
declarations earlier. Change the check to skip all declarations for
linker (which includes available_externally along with declarations).
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28737
llvm-svn: 292408
|
| |
|
|
| |
llvm-svn: 292407
|
| |
|
|
| |
llvm-svn: 292404
|
| |
|
|
|
|
|
|
|
|
|
| |
An ELFObjectFile can now create SubtargetFeatures from the available
ARM build attributes, in a similar manner to MIPS. I've moved the
MIPS code into its own function and the ARM handler also has a
separate function.
Differential Revision: https://reviews.llvm.org/D28291
llvm-svn: 292403
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This makes the file descriptors on unix platform non-inheritable (O_CLOEXEC).
There is no change in behavior on windows, as the handles were already
non-inheritable there.
Reviewers: rnk, rafael
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D28854
llvm-svn: 292401
|
| |
|
|
|
|
|
|
|
|
|
| |
A 64-bit relocation does not exist in 32-bit ARMELF. Report an error
instead of crashing.
PR23870
Patch by Sanne Wouda (sanwou01).
Differential Revision: https://reviews.llvm.org/D28851
llvm-svn: 292373
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.
This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484
Reviewers: rnk, bkramer, olista01
Reviewed By: olista01
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D28829
llvm-svn: 292372
|
| |
|
|
|
|
| |
Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.
llvm-svn: 292371
|
| |
|
|
|
|
|
| |
More missing guards. My build didn't notice it due to a stale file left over
from a Global ISel build.
llvm-svn: 292369
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.
Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.
Reviewers: t.p.northover, ab, rovka, qcolombet
Reviewed By: qcolombet
Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27338
llvm-svn: 292367
|
| |
|
|
|
|
|
|
|
|
|
| |
Enable an ELFObjectFile to read the its arm build attributes to
produce a target triple with a specific ARM architecture.
llvm-objdump now uses this functionality to automatically produce
a more accurate target.
Differential Revision: https://reviews.llvm.org/D28769
llvm-svn: 292366
|
| |
|
|
|
|
| |
As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need.
llvm-svn: 292365
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch improves the mul instruction combine function (combineMul)
by adding new layer of logic.
In this patch, we are adding the ability to fold (mul x, -((1 << c) -1))
or (mul x, -((1 << c) +1)) into (neg(X << c) -x) or (neg((x << c) + x) respective.
Differential Revision: https://reviews.llvm.org/D28232
llvm-svn: 292358
|
| |
|
|
|
|
|
|
|
|
|
| |
identify such problem earlier"
This reverts commit r292210, as it broke the Thumb buldbot with:
clang-5.0: error: the clang compiler does not support '-fxray-instrument
on thumbv7-unknown-linux-gnueabihf'.
llvm-svn: 292357
|
| |
|
|
|
|
|
|
| |
During post-RA pseudo expansion, an 'undef' flag of the source operand should
be propagated by emitGRX32Move().
Review: Ulrich Weigand
llvm-svn: 292353
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch fixes bugzilla 31576 (https://llvm.org/bugs/show_bug.cgi?id=31576).
"data32" instruction prefix was not defined in the llvm.
An exception had to be added to the X86 tablegen and AsmPrinter because both "data16" and "data32" are encoded to 0x66 (but in different modes).
Differential Revision: https://reviews.llvm.org/D28468
llvm-svn: 292352
|