| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351648
|
|
|
|
|
|
|
|
|
|
|
|
| |
Followup to D55745, this time handling comparisons with ugt and ult
predicates (which are the canonical forms for non-equality predicates).
For ctlz we can convert into a simple icmp, for cttz we can convert
into a mask check.
Differential Revision: https://reviews.llvm.org/D56355
llvm-svn: 351645
|
|
|
|
| |
llvm-svn: 351641
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
|
|
|
|
|
| |
a stray single '\r' from one file. These are the last line ending issues
I can find in the files containing parts of LLVM's file headers.
llvm-svn: 351634
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This modification of the currently unused inter-procedural constant
propagation pass (IPConstantPropagation) shows how abstract call sites
enable optimization of callback calls alongside direct and indirect
calls. Through minimal changes, mostly dealing with the partial mapping
of callbacks, inter-procedural constant propagation was enabled for
callbacks, e.g., OpenMP runtime calls or pthreads_create.
Differential Revision: https://reviews.llvm.org/D56447
llvm-svn: 351628
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An abstract call site is a wrapper that allows to treat direct,
indirect, and callback calls the same. If an abstract call site
represents a direct or indirect call site it behaves like a stripped
down version of a normal call site object. The abstract call site can
also represent a callback call, thus the fact that the initially
called function (=broker) may invoke a third one (=callback callee).
In this case, the abstract call side hides the middle man, hence the
broker function. The result is a representation of the callback call,
inside the broker, but in the context of the original instruction that
invoked the broker.
Again, there are up to three functions involved when we talk about
callback call sites. The caller (1), which invokes the broker
function. The broker function (2), that may or may not invoke the
callback callee. And finally the callback callee (3), which is the
target of the callback call.
The abstract call site will handle the mapping from parameters to
arguments depending on the semantic of the broker function. However,
it is important to note that the mapping is often partial. Thus, some
arguments of the call/invoke instruction are mapped to parameters of
the callee while others are not. At the same time, arguments of the
callback callee might be unknown, thus "null" if queried.
This patch introduces also !callback metadata which describe how a
callback broker maps from parameters to arguments. This metadata is
directly created by clang for known broker functions, provided through
source code attributes by the user, or later deduced by analyses.
For motivation and additional information please see the corresponding
talk (slides/video)
https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk20
as well as the LCPC paper
http://compilers.cs.uni-saarland.de/people/doerfert/par_opt_lcpc18.pdf
Differential Revision: https://reviews.llvm.org/D54498
llvm-svn: 351627
|
|
|
|
|
|
| |
Original commit: r351582
llvm-svn: 351626
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Nikita Popov for pointing out this missed case.
This is a follow-up to r351411, which disabled function merging for
vararg functions outright due to a miscompile (see llvm.org/PR40345).
Differential Revision: https://reviews.llvm.org/D56865
llvm-svn: 351624
|
|
|
|
|
|
|
|
|
|
| |
If an inherently cold function is found, mark it as cold. For now this
means applying the `cold` and `minsize` attributes.
As a drive-by, revisit and clean up the criteria for considering a
function for splitting. Add tests.
llvm-svn: 351623
|
|
|
|
|
|
|
| |
Use the begin/end iterator idiom to avoid visiting split functions,
instead of doing a set lookup.
llvm-svn: 351622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CodeExtractor permits extracting a region of blocks from a function even
when values defined within the region are used outside of it.
This is typically done by creating an alloca in the original function
and reloading the alloca after a call to the extracted function.
Wrap the reload in lifetime start/end markers to promote stack coloring.
Suggested by Sergei Kachkov!
Differential Revision: https://reviews.llvm.org/D56045
llvm-svn: 351621
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r351618.
Compiler RT + ASAN tests are failing for PowerPC. Not sure
how would I reproduce these on macOS, so reverting (again)
until I do.
llvm-svn: 351619
|
|
|
|
|
|
| |
Original commit: r351582
llvm-svn: 351618
|
|
|
|
|
|
| |
This new assertion triggered on the AArch64 GlobalISel bots. Reverting while it's being investigated.
llvm-svn: 351617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Right now we include ${TGT}GenCallingConv.inc once per each instruction
selection method implemented by ${TGT}:
- ${TGT}ISelLowering.cpp
- ${TGT}CallLowering.cpp
- ${TGT}FastISel.cpp
Instead, add a mechanism to tablegen for marking a particular convention
as "External", which causes tablegen to emit into the ::llvm namespace,
instead of as a static helper. This allows us to provide a header to
forward declare it, so we can simply call the function from all the
places it is referenced. Typically the calling convention analyzer is
called indirectly, so it doesn't benefit from inlining.
This saves a bit of final binary size, but mostly just saves object file
size:
before after diff artifact
12852K 12492K -360K X86ISelLowering.cpp.obj
4640K 4280K -360K X86FastISel.cpp.obj
1704K 2092K +388K X86CallingConv.cpp.obj
52448K 52336K -112K llc.exe
I didn't collect before numbers for X86CallLowering.cpp.obj, which is
for GlobalISel, but we should save 360K there as well.
This patch applies the strategy to the X86 backend, but there is no
reason it couldn't be applied to the other backends that implement
multiple ISel strategies, like AArch64.
Reviewers: craig.topper, hfinkel, efriedma
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D56883
llvm-svn: 351616
|
|
|
|
|
|
|
|
| |
This code is dead. There is no use of the feature in the entire LLVM codebase.
Differential Revision: https://reviews.llvm.org/D56939
llvm-svn: 351613
|
|
|
|
| |
llvm-svn: 351599
|
|
|
|
|
|
|
|
| |
This reverts commit r351582.
Bots are failing. Reverting this to fix and re-commit later.
llvm-svn: 351598
|
|
|
|
| |
llvm-svn: 351596
|
|
|
|
| |
llvm-svn: 351594
|
|
|
|
| |
llvm-svn: 351591
|
|
|
|
|
|
|
| |
It's taken 3 years, but now all of the old AMDGPU and SI intrinsics
are finally gone
llvm-svn: 351586
|
|
|
|
| |
llvm-svn: 351584
|
|
|
|
|
|
|
|
| |
going directly to MachineSDNode.
This sends these intrinsics through isel in a much more normal way. This should allow addressing mode matching in isel to make better use of the displacement field.
llvm-svn: 351583
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure CodeGenPrepare doesn't emit multiple inttoptr instructions of
the same integer value while sinking address computations, but rather
CSEs them on the fly: excessive inttoptr's confuse SCEV into thinking
that related pointers have nothing to do with each other.
This problem blocks LoadStoreVectorizer from vectorizing some of the
loads / stores in a downstream target.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D56838
llvm-svn: 351582
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch makes some changes related to -dag-dump-verbose.
Main use case has been when debugging how SelectionDAG is
dealing with debug info (SDDbgValue nodes).
1) We now print the number of DbgValues that are mapped to each
SDNode.
2) Removed duplicated printing of DebugLoc (nowadays DebugLoc is
printed also when not using -dag-dump-verbose).
3) Renamed SDDbgValue::dump to SDDbgValue::print, and added a
new SDDbgValue::dump that will start a new line after calling
print.
4) SDDbgValue::print now prints "Order", and it also prints
some additional information when kind is CONST/FRAMEIX/VREG.
5) SelectionDAG::dump() now dumps all SDDbgValue nodes after
the list of SDNodes (both "regular" and "ByVal" SDDbgValue:s).
Invalidated nodes are not printed.
6) Prohibit inline printing of SDNode operands that has SDDbgValue
nodes associated to them.
Reviewers: jmorse, aprantl
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56793
llvm-svn: 351581
|
|
|
|
|
|
|
| |
The EXPENSIVE_CHECK x86_64 Windows buildbot is failing due to this change. Fix
the map access.
llvm-svn: 351577
|
|
|
|
| |
llvm-svn: 351574
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to D55073. Without this change, the DAG combiner crashes on code
with more than 64k of stores in a single basic block that form parallelizable
chains.
No test case, as it would be very IR file.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D56740
llvm-svn: 351571
|
|
|
|
|
|
|
|
|
|
| |
of going directly to MachineSDNode.:
This sends these intrinsics through isel in a much more normal way. This should allow addressing mode matching in isel to make better use of the displacement field.
Differential Revision: https://reviews.llvm.org/D56827
llvm-svn: 351570
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Scanning blocks in sub-loops for uses is unnecessary, as they were
already handled while dealing with the containing sub-loop.
This speeds up LCSSA for highly nested loops. For the test case in PR37202, it
halves the time spent in LCSSA. In cases were we won't be able to skip
any blocks, the additional lookup should be negligible.
Time-passes without this patch for test case from PR37202:
Total Execution Time: 48.5505 seconds (48.5511 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
10.0822 ( 21.0%) 0.1406 ( 27.0%) 10.2228 ( 21.1%) 10.2228 ( 21.1%) Loop-Closed SSA Form Pass
10.0417 ( 20.9%) 0.1467 ( 28.2%) 10.1884 ( 21.0%) 10.1890 ( 21.0%) Loop-Closed SSA Form Pass #2
4.2703 ( 8.9%) 0.0040 ( 0.8%) 4.2742 ( 8.8%) 4.2742 ( 8.8%) Unswitch loops
2.7376 ( 5.7%) 0.0229 ( 4.4%) 2.7605 ( 5.7%) 2.7611 ( 5.7%) Loop-Closed SSA Form Pass #5
2.7332 ( 5.7%) 0.0214 ( 4.1%) 2.7546 ( 5.7%) 2.7546 ( 5.7%) Loop-Closed SSA Form Pass #3
2.7088 ( 5.6%) 0.0230 ( 4.4%) 2.7319 ( 5.6%) 2.7324 ( 5.6%) Loop-Closed SSA Form Pass #4
2.6855 ( 5.6%) 0.0236 ( 4.5%) 2.7091 ( 5.6%) 2.7090 ( 5.6%) Loop-Closed SSA Form Pass #6
2.1648 ( 4.5%) 0.0018 ( 0.4%) 2.1666 ( 4.5%) 2.1664 ( 4.5%) Unroll loops
1.8371 ( 3.8%) 0.0009 ( 0.2%) 1.8379 ( 3.8%) 1.8380 ( 3.8%) Value Propagation
1.8149 ( 3.8%) 0.0021 ( 0.4%) 1.8170 ( 3.7%) 1.8169 ( 3.7%) Loop Invariant Code Motion
1.6755 ( 3.5%) 0.0226 ( 4.3%) 1.6981 ( 3.5%) 1.6980 ( 3.5%) Loop-Closed SSA Form Pass #7
Time-passes with this patch
Total Execution Time: 29.9285 seconds (29.9276 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
5.2786 ( 17.7%) 0.0021 ( 1.2%) 5.2806 ( 17.6%) 5.2808 ( 17.6%) Unswitch loops
4.3739 ( 14.7%) 0.0303 ( 18.1%) 4.4042 ( 14.7%) 4.4042 ( 14.7%) Loop-Closed SSA Form Pass
4.2658 ( 14.3%) 0.0192 ( 11.5%) 4.2850 ( 14.3%) 4.2851 ( 14.3%) Loop-Closed SSA Form Pass #2
2.2307 ( 7.5%) 0.0013 ( 0.8%) 2.2320 ( 7.5%) 2.2318 ( 7.5%) Loop Invariant Code Motion
2.0888 ( 7.0%) 0.0012 ( 0.7%) 2.0900 ( 7.0%) 2.0897 ( 7.0%) Unroll loops
1.6761 ( 5.6%) 0.0013 ( 0.8%) 1.6774 ( 5.6%) 1.6774 ( 5.6%) Value Propagation
1.3686 ( 4.6%) 0.0029 ( 1.8%) 1.3716 ( 4.6%) 1.3714 ( 4.6%) Induction Variable Simplification
1.1457 ( 3.8%) 0.0010 ( 0.6%) 1.1468 ( 3.8%) 1.1468 ( 3.8%) Loop-Closed SSA Form Pass #4
1.1384 ( 3.8%) 0.0005 ( 0.3%) 1.1389 ( 3.8%) 1.1389 ( 3.8%) Loop-Closed SSA Form Pass #6
1.1360 ( 3.8%) 0.0027 ( 1.6%) 1.1387 ( 3.8%) 1.1387 ( 3.8%) Loop-Closed SSA Form Pass #5
1.1331 ( 3.8%) 0.0010 ( 0.6%) 1.1341 ( 3.8%) 1.1340 ( 3.8%) Loop-Closed SSA Form Pass #3
Reviewers: davide, efriedma, mzolotukhin
Reviewed By: davide, efriedma
Subscribers: hiraditya, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D56848
llvm-svn: 351567
|
|
|
|
|
|
|
|
|
| |
This commit adds some missing intrinsics into the isAlwaysUniform list
for the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D56845
llvm-svn: 351562
|
|
|
|
|
|
|
|
| |
Defer inline asm's output fixup work until after we've generated the
inline asm node itself. Remove StoresToEmit, IndirectStoresToEmit, and
RetValRegs in favor of using ConstraintOperands.
llvm-svn: 351558
|
|
|
|
| |
llvm-svn: 351557
|
|
|
|
|
|
|
|
|
|
| |
See bug 39332: https://bugs.llvm.org/show_bug.cgi?id=39332
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D56794
llvm-svn: 351555
|
|
|
|
|
|
|
|
|
| |
This functionality is required at multiple places which potentially
create large operand lists, like SelectionDAGBuilder or DAGCombiner.
Differential Revision: https://reviews.llvm.org/D56739
llvm-svn: 351552
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up to r351448. It adds support for other _*Z extensions
of the Itanium demanling, to the newly available demangle function
heuristic.
Reviewed by: erik.pilkington, rupprecht, grimar
Differential Revision: https://reviews.llvm.org/D56855
llvm-svn: 351551
|
|
|
|
|
|
|
|
|
|
| |
See bug 39319: https://bugs.llvm.org/show_bug.cgi?id=39319
Reviewers: artem.tamazov, arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D56847
llvm-svn: 351549
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The operators simply print the underlying value or "None".
The trickier part of this patch is making sure the streaming operators
work even in unit tests (which was my primary motivation, though I can
also see them being useful elsewhere). Since the stream operator was a
template, implicit conversions did not kick in, and our gtest glue code
was explicitly introducing an implicit conversion to make sure other
implicit conversions do not kick in :P. I resolve that by specializing
llvm_gtest::StreamSwitch for llvm:Optional<T>.
Reviewers: sammccall, dblaikie
Reviewed By: sammccall
Subscribers: mgorny, dexonsmith, kristina, llvm-commits
Differential Revision: https://reviews.llvm.org/D56795
llvm-svn: 351548
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:
ld $GPR8, [PTR:XYZ]+
ld $GPR8, [PTR]+1
This has a problem; the [PTR] is incremented in-place once, but never
decremented.
Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.
This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.
Patch by Keshav Kini.
llvm-svn: 351544
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use this helper to make sure we use the same value at various places.
This will likely be needed at more places were we currently crash
because we use more operands than possible.
Also makes it easier to change in the future.
Reviewers: RKSimon, craig.topper, efriedma, aemerson
Reviewed By: RKSimon
Subscribers: hiraditya, arsenm, llvm-commits
Differential Revision: https://reviews.llvm.org/D56859
llvm-svn: 351537
|
|
|
|
|
|
|
|
|
|
|
|
| |
We should not pre-scheduled the node has ADJCALLSTACKDOWN parent,
or else, when bottom-up scheduling, ADJCALLSTACKDOWN and
ADJCALLSTACKUP may hold CallResource too long and make other
calls can't be scheduled. If there's no other available node
to schedule, the scheduler will try to rename the register by
creating copy to avoid the conflict which will fail because
CallResource is not a real physical register.
llvm-svn: 351527
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The CBR instruction is just an ANDI instruction with the immediate
complemented.
Because of this, prior to this change TableGen would warn due to a
decoding conflict.
This commit fixes the existing compilation warning:
===============
[423/492] Building AVRGenDisassemblerTables.inc...
Decoding Conflict:
0111............
01..............
................
ANDIRdK 0111____________
CBRRdK 0111____________
================
After this commit, there are no more decoding conflicts in the AVR
backend's instruction definitions.
Thanks to Eli F for pointing me torward `t2_so_imm_not` as an example of
how to perform a complement in an instruction alias.
Fixes BugZilla PR38802.
llvm-svn: 351526
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove DBG_LABELs in LiveDebugVariables and generate them in
VirtRegRewriter.
This bug is reported in
https://bugs.chromium.org/p/chromium/issues/detail?id=898152.
Differential Revision: https://reviews.llvm.org/D54465
llvm-svn: 351525
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hardware MUL
This change modifies the LLVM ISel lowering settings so that
8-bit/16-bit multiplication is expanded to calls into the compiler
runtime library if the MCU being targeted does not support
multiplication in hardware.
Before this, MUL instructions would be generated on CPUs like the
ATtiny85, triggering a CPU reset due to an illegal instruction at
runtime.
First raised in https://github.com/avr-rust/rust/issues/124.
llvm-svn: 351523
|
|
|
|
| |
llvm-svn: 351520
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: aheejin, dschuff, sbc100
Subscribers: aprantl, jgravelle-google, hiraditya, sunfish
Differential Revision: https://reviews.llvm.org/D56889
llvm-svn: 351507
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to r348205, extracting code regions with live output values was
disabled because of a miscompilation (PR39433). Lift the restriction as
PR39433 has been addressed.
Tested on LNT+externals, on a run of check-llvm in a stage2 build, and
with a full build of iOS (with hot/cold splitting enabled).
As a drive-by, remove an errant TODO.
llvm-svn: 351492
|
|
|
|
|
|
|
|
|
|
| |
Resuming exception unwinding is roughly as unlikely as throwing an
exception.
Tested on LNT+externals (in particular, the C++ EH regression tests
provide end-to-end test coverage), as well as with a full build of iOS.
llvm-svn: 351491
|