| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch fixes PR28946.
Reviewers: majnemer, sanjoy
Differential Revision: http://reviews.llvm.org/D23296
From: Li Huang
llvm-svn: 279684
|
|
|
|
|
|
| |
Test for commit access.
llvm-svn: 279683
|
|
|
|
|
|
| |
Missed two lines got lost when cherry picking old commits to master.
llvm-svn: 279682
|
|
|
|
| |
llvm-svn: 279681
|
|
|
|
| |
llvm-svn: 279680
|
|
|
|
| |
llvm-svn: 279679
|
|
|
|
| |
llvm-svn: 279678
|
|
|
|
|
|
| |
constant vectors
llvm-svn: 279677
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
print/parser it
tracksSubRegLiveness only depends on the Subtarget and a cl::opt, there
is not need to change it or save/parse it in a .mir file.
Make the field const and move the initialization LiveIntervalAnalysis to the
MachineRegisterInfo constructor. Also cleanup some code and fix some
instances which better use MachineRegisterInfo::subRegLivenessEnabled() instead
of TargetSubtargetInfo::enableSubRegLiveness().
llvm-svn: 279676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The point of this patch is to have a consistent convention for naming build, check and install targets so that the targets can be constructed from the project name.
This change renames a bunch of CMake components and targets from libcxx to cxx. For each renamed target I've added a convenience target that matches the old target name and depends on the new target. This will preserve function of the old targets so that the change doesn't break the world. We can evaluate if it is worth removing the extra targets later.
Reviewers: EricWF
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23699
llvm-svn: 279675
|
|
|
|
|
|
| |
the debian bot, hopefully this is a fix.
llvm-svn: 279674
|
|
|
|
|
|
| |
The oneshot probe only gets executed the first time the probe is hit in the process. For order file generation this is really all we care about.
llvm-svn: 279673
|
|
|
|
|
|
|
|
| |
The newer event-based tests I added neglected to do the
macOS 10.12 check in the setup. This caused earlier macOS
test suite runs to attempt to compile code that doesn't exist.
llvm-svn: 279672
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following function currently relies on tail-merging for if
conversion to succeed. The common tail of cond_true and cond_false is
extracted, and this then forms a diamond pattern that can be
successfully if converted.
If this block does not get extracted, either because tail-merging is
disabled or the threshold is higher, we should still recognize this
pattern and if-convert it.
Fixed a regression in the original commit. Need to un-reverse branches after
reversing them, or other conversions go awry.
define i32 @t2(i32 %a, i32 %b) nounwind {
entry:
%tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1]
br i1 %tmp1434, label %bb17, label %bb.outer
bb.outer: ; preds = %cond_false, %entry
%b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
%a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
br label %bb
bb: ; preds = %cond_true, %bb.outer
%indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
%tmp. = sub i32 0, %b_addr.021.0.ph
%tmp.40 = mul i32 %indvar, %tmp.
%a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
%tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
br i1 %tmp3, label %cond_true, label %cond_false
cond_true: ; preds = %bb
%tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
%tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
%indvar.next = add i32 %indvar, 1
br i1 %tmp1437, label %bb17, label %bb
cond_false: ; preds = %bb
%tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
%tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
br i1 %tmp14, label %bb17, label %bb.outer
bb17: ; preds = %cond_false, %cond_true, %entry
%a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
ret i32 %a_addr.026.1
}
Without tail-merging or diamond-tail if conversion:
LBB1_1: @ %bb
@ =>This Inner Loop Header: Depth=1
cmp r0, r1
ble LBB1_3
@ BB#2: @ %cond_true
@ in Loop: Header=BB1_1 Depth=1
subs r0, r0, r1
cmp r1, r0
it ne
cmpne r0, r1
bgt LBB1_4
LBB1_3: @ %cond_false
@ in Loop: Header=BB1_1 Depth=1
subs r1, r1, r0
cmp r1, r0
bne LBB1_1
LBB1_4: @ %bb17
bx lr
With diamond-tail if conversion, but without tail-merging:
@ BB#0: @ %entry
cmp r0, r1
it eq
bxeq lr
LBB1_1: @ %bb
@ =>This Inner Loop Header: Depth=1
cmp r0, r1
ite le
suble r1, r1, r0
subgt r0, r0, r1
cmp r1, r0
bne LBB1_1
@ BB#2: @ %bb17
bx lr
llvm-svn: 279671
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cost of predicating a diamond is only the instructions that are not shared
between the two branches. Additionally If a predicate clobbering instruction
occurs in the shared portion of the branches (e.g. a cond move), it may still
be possible to if convert the sub-cfg. This change handles these two facts by
rescanning the non-shared portion of a diamond sub-cfg to recalculate both the
predication cost and whether both blocks are pred-clobbering.
Fixed 2 bugs before recommitting. Branch instructions must be compared and found
identical before diamond conversion. Also, predicate-clobbering instructions in
the shared prefix disqualifies a potential diamond conversion. Includes tests
for both.
llvm-svn: 279670
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This more clearly describes what the class is.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23851
llvm-svn: 279669
|
|
|
|
|
|
|
|
| |
per-frame stack usage enough to cause it to hit our stack limit. This is not
ideal; we should find a better way of dealing with this, such as increasing
our stack allocation when built with ASan.
llvm-svn: 279668
|
|
|
|
|
|
| |
initializer of an imported field.
llvm-svn: 279667
|
|
|
|
|
|
| |
alternative solution
llvm-svn: 279666
|
|
|
|
|
|
|
|
| |
A branch-distance to a Thumb function shouldn't be forced to be odd for
CBZ/CBNZ instructions because (assuming it's within range), it's going to be a
valid, even offset.
llvm-svn: 279665
|
|
|
|
|
|
| |
64-bit allocator to use a single array for free-d chunks instead of a lock-free linked list of tranfer batches. This change simplifies the code, makes the allocator more 'hardened', and will allow simpler code to release RAM to OS. This may also slowdown malloc stress tests due to lock contension, but I did not observe noticeable slowdown on various real multi-threaded benchmarks.
llvm-svn: 279664
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Give appropriate warnings with -Wdocumentation for @param comments
that refer to function aliases defined with 'using'. Very similar
to typedef's behavior. This does not add support for
TypeAliasTemplateDecl yet.
Differential Revision: https://reviews.llvm.org/D23783
rdar://problem/27300695
llvm-svn: 279662
|
|
|
|
|
|
| |
allocator
llvm-svn: 279661
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements readlane/readfirstlane intrinsics.
TODO: need to define a new register class to consider the case
that the source could be a vector register or M0.
Reviewed by:
arsenm and tstellarAMD
Differential Revision:
http://reviews.llvm.org/D22489
llvm-svn: 279660
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D23815
llvm-svn: 279659
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The return value from PlatformExecutor::allocateDeviceMemory needs to be
converted from Expected<GlobalDeviceMemoryBase> to
Expected<GlobalDeviceMemory<T>> in Executor::allocateDeviceMemory.
A similar bug is also fixed for Executor::allocateHostMemory.
Thanks to jprice for identifying this bug.
Reviewers: jprice, jlebar
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23849
llvm-svn: 279658
|
|
|
|
|
|
| |
Required for out-of-tree builds of Polly.
llvm-svn: 279657
|
|
|
|
| |
llvm-svn: 279656
|
|
|
|
| |
llvm-svn: 279655
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Consolidate Executor::synchronousCopy* and Stream::thenCopy* methods into
Doxygen method groups and combine all their comments into one section.
Also a "doc" target to the build files to use Doxygen to build the
documentation.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23845
llvm-svn: 279654
|
|
|
|
|
|
| |
Windows require that.
llvm-svn: 279653
|
|
|
|
|
|
| |
These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad
llvm-svn: 279652
|
|
|
|
|
|
|
|
|
|
| |
skeleton CU
In cases where .dwo/.dwp files are guaranteed to be available, skipping
the extra online (in the .o file) inline info can save a substantial
amount of space - see the original r221306 for more details there.
llvm-svn: 279651
|
|
|
|
|
|
|
|
|
|
| |
skeleton CU
In cases where .dwo/.dwp files are guaranteed to be available, skipping
the extra online (in the .o file) inline info can save a substantial
amount of space - see the original r221306 for more details there.
llvm-svn: 279650
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch unifies the data structures we use for mapping instructions from the
original loop to their corresponding instructions in the new loop. Previously,
we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap.
WidenMap maintained the vector values each instruction from the old loop was
represented with, and ScalarIVMap maintained the scalar values each scalarized
induction variable was represented with. With this patch, all values created
for the new loop are maintained in VectorLoopValueMap.
The change allows for several simplifications. Previously, when an instruction
was scalarized, we had to insert the scalar values into vectors in order to
maintain the mapping in WidenMap. Then, if a user of the scalarized value was
also scalar, we had to extract the scalar values from the temporary vector we
created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions.
If a scalarized value is used by a scalar instruction, the scalar value is used
directly. However, if the scalarized value is needed by a vector instruction,
we generate the needed insertelement instructions on-demand.
A common idiom in several locations in the code (including the scalarization
code), is to first get the vector values an instruction from the original loop
maps to, and then extract a particular scalar value. This patch adds
getScalarValue for this purpose along side getVectorValue as an interface into
VectorLoopValueMap. These functions work together to return the requested
values if they're available or to produce them if they're not.
The mapping has also be made less permissive. Entries can be added to
VectorLoopValue map with the new initVector and initScalar functions.
getVectorValue has been modified to return a constant reference to the mapped
entries.
There's no real functional change with this patch; however, in some cases we
will generate slightly different code. For example, instead of an insertelement
sequence following the definition of an instruction, it will now precede the
first use of that instruction. This can be seen in the test case changes.
Differential Revision: https://reviews.llvm.org/D23169
llvm-svn: 279649
|
|
|
|
|
|
| |
Enable zero cycle zeroing.
llvm-svn: 279648
|
|
|
|
|
|
|
|
| |
I'm not sure if the `!isa<CallInst>(Inst) &&
!isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the
case we know is broken.
llvm-svn: 279647
|
|
|
|
|
|
|
|
|
|
| |
Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr)
This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly.
Its also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183).
llvm-svn: 279646
|
|
|
|
| |
llvm-svn: 279644
|
|
|
|
|
|
| |
This reverts commit r279572 and r279595.
llvm-svn: 279643
|
|
|
|
| |
llvm-svn: 279642
|
|
|
|
| |
llvm-svn: 279641
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add Executor methods that block the host until completion. Since these
methods are host-synchronous, they don't require Stream arguments.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D23577
llvm-svn: 279640
|
|
|
|
| |
llvm-svn: 279639
|
|
|
|
| |
llvm-svn: 279638
|
|
|
|
|
|
|
| |
This is no longer necessary, because since r279625 the subregister
liveness properly accounts for read-undefs.
llvm-svn: 279637
|
|
|
|
| |
llvm-svn: 279636
|
|
|
|
| |
llvm-svn: 279635
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the host binary format.
Summary:
This patch adds the capability to bundle object files in sections of the host binary using a designated naming convention for these sections. This patch uses the functionality of the object reader already in the LLVM library to read bundled files, and invokes clang with the incremental linking options to create bundle files.
Bundling files involves creating an IR file with the contents of the bundle assigned as initializers of globals binded to the designated sections. This way the bundling implementation is agnostic of the host object format.
The features added by this patch were requested in the RFC discussion in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html.
Reviewers: echristo, tra, jlebar, hfinkel, ABataev, Hahnfeld
Subscribers: mkuron, whchung, cfe-commits, andreybokhanko, Hahnfeld, arpith-jacob, carlo.bertolli, mehdi_amini, caomhin
Differential Revision: https://reviews.llvm.org/D21851
llvm-svn: 279634
|
|
|
|
| |
llvm-svn: 279633
|