| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
just scalar i1.
llvm-svn: 299665
|
|
|
|
|
|
|
| |
Keep full offset value on MI-level instructions, but have it scaled down
in the MC-level instructions.
llvm-svn: 299664
|
|
|
|
|
|
| |
Use new values of the dimensions during their permutation.
llvm-svn: 299663
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Dimensions of band nodes can be implicitly permuted by the algorithm applied
during the schedule generation.
For example, in case of the following matrix-matrix multiplication,
for (i = 0; i < 1024; i++)
for (k = 0; k < 1024; k++)
for (j = 0; j < 1024; j++)
C[i][j] += A[i][k] * B[k][j];
it can produce the following schedule tree
domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and
0 <= i2 <= 1023 }"
child:
schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] },
{ Stmt_for_body6[i0, i1, i2] -> [(i1)] },
{ Stmt_for_body6[i0, i1, i2] -> [(i2)] }]"
permutable: 1
coincident: [ 1, 1, 0 ]
The current implementation of the pattern matching optimizations relies on the
initial ordering of dimensions. Otherwise, it can produce the miscompilation
(e.g., [1]).
This patch helps to restore the initial ordering of dimensions by recreating
the band node when the corresponding conditions are satisfied.
Refs.:
[1] - https://bugs.llvm.org/show_bug.cgi?id=32500
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D31741
llvm-svn: 299662
|
|
|
|
|
|
|
|
| |
r299658 fixed a case where InstCombine was replicating instructions instead of combining. Fixing this reduced the number of pushes and pops in the __tsan_read and __tsan_write functions.
Adjust the expectations to account for this after talking to Dmitry Vyukov.
llvm-svn: 299661
|
|
|
|
|
|
|
|
| |
This will be used in LCSSA to speed up the canonicalization.
Differential Revision: https://reviews.llvm.org/D31694
llvm-svn: 299660
|
|
|
|
|
|
|
|
|
|
| |
If a workgroup size is known to be not greater than wavefront size
the s_barrier instruction is not needed since all threads are guarantied
to come to the same point at the same time.
Differential Revision: https://reviews.llvm.org/D31731
llvm-svn: 299659
|
|
|
|
|
|
| |
single use resulting in extra instructions being created.
llvm-svn: 299658
|
|
|
|
| |
llvm-svn: 299657
|
|
|
|
|
|
|
|
| |
and proposing this as https://reviews.llvm.org/D16541"
This reverts commit r299652, 32bits MacOS is broken.
llvm-svn: 299656
|
|
|
|
| |
llvm-svn: 299655
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D31671
llvm-svn: 299654
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hopefully fix crashes by unshadowing the variable.
Original commit message:
A big part of the clone detection code is functionality for filtering clones and
clone groups based on different criteria. So far this filtering process was
hardcoded into the CloneDetector class, which made it hard to understand and,
ultimately, to extend.
This patch splits the CloneDetector's logic into a sequence of reusable
constraints that are used for filtering clone groups. These constraints
can be turned on and off and reodreder at will, and new constraints are easy
to implement if necessary.
Unit tests are added for the new constraint interface.
This is a refactoring patch - no functional change intended.
Patch by Raphael Isemann!
Differential Revision: https://reviews.llvm.org/D23418
llvm-svn: 299653
|
|
|
|
|
|
| |
proposing this as https://reviews.llvm.org/D16541
llvm-svn: 299652
|
|
|
|
| |
llvm-svn: 299651
|
|
|
|
|
|
| |
rdar://20441985
llvm-svn: 299650
|
|
|
|
| |
llvm-svn: 299649
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
object types should be preferred over conversions to other object pointers
This change ensures that Clang will select the correct overload for the
following code sample:
void overload(Base *b);
void overload(Derived *d);
void test(Base<Base *> b) {
overload(b); // Select overload(Base *), not overload(Derived *)
}
rdar://20124827
Differential Revision: https://reviews.llvm.org/D31597
llvm-svn: 299648
|
|
|
|
|
|
|
|
|
| |
Since the BUILD_VECTOR has already been checked by
isBuildVectorOfConstantSDNodes() in SelectionDAG::getNode() for a
SIGN_EXTEND_INREG, it can be assumed that Op is always either undef or a
ConstantSDNode, and Ops.size() will always equal VT.getVectorNumElements().
llvm-svn: 299647
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lambda capture used by the created block
The commit r288866 introduced guaranteed copy elision to C++ 17. This
unfortunately broke the lambda to block conversion in C++17 (the compiler
crashes when performing IRGen). This commit fixes the conversion by avoiding
copy elision for the capture that captures the lambda that's used in the block
created by the lambda to block conversion process.
rdar://31385153
Differential Revision: https://reviews.llvm.org/D31669
llvm-svn: 299646
|
|
|
|
| |
llvm-svn: 299645
|
|
|
|
|
|
|
|
| |
The local was only referenced in assertions.
Follow-up to D31345.
llvm-svn: 299644
|
|
|
|
|
|
|
|
| |
Attempt to satisfy llvm-clang-x86_64-expensive-checks-win by targeting
x86_64-apple-darwin10 for Sema/vector-ops.c. The underlying failure is
due to datatype differences between platforms.
llvm-svn: 299643
|
|
|
|
| |
llvm-svn: 299642
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This improves some error messages which would otherwise refer to
ext_vector_type types in contexts where there are no such types.
Factored out from D25866 at reviewer's request.
Reviewers: bruno
Differential Revision: https://reviews.llvm.org/D31667
llvm-svn: 299641
|
|
|
|
| |
llvm-svn: 299640
|
|
|
|
| |
llvm-svn: 299639
|
|
|
|
|
|
|
|
| |
Patch by András Leitereg!
Differential Revision: https://reviews.llvm.org/D30547
llvm-svn: 299638
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
anonymous namespaces
Summary: This resolves the issue of tablegen-erated includes in the headers for non-GlobalISel builds in a simpler way than before.
Reviewers: qcolombet, ab
Reviewed By: ab
Subscribers: igorb, ab, mgorny, dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30998
llvm-svn: 299637
|
|
|
|
| |
llvm-svn: 299636
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Executable sections should not be padded with zero by default. On some
architectures, 0x00 is the start of a valid instruction sequence, so can confuse
disassembly between InputSections (and indeed the start of the next InputSection
in some situations). Further, in the case of misjumps into padding, padding may
start to be executed silently.
On x86, the "0xcc" byte represents the int3 trap instruction. It is a single
byte long so can serve well as padding. This change switches x86 (and x86_64) to
use this value for padding in executable sections, if no linker script directive
overrides it. It also puts the behaviour into place making it easy to change the
behaviour of other targets when desired. I do not know the relevant instruction
sequences for trap instructions on other targets however, so somebody should add
this separately.
Because the old behaviour simply wrote padding in the whole section before
overwriting most of it, this change also modifies the padding algorithm to write
padding only where needed. This in turn has caused a small behaviour change with
regards to what values are written via Fill commands in linker scripts, bringing
it into line with ld.bfd. The fill value is now written starting from the end of
the previous block, which means that it always starts from the first byte of the
fill, whereas the old behaviour meant that the padding sometimes started mid-way
through the fill value. See the test changes for more details.
Reviewed by: ruiu
Differential Revision: https://reviews.llvm.org/D30886
Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32227
llvm-svn: 299635
|
|
|
|
|
|
|
|
|
| |
During the optimisation of jump tables in the constant island pass,
an extra ADD could be left over, now dead but not removed.
Differential Revision: https://reviews.llvm.org/D31389
llvm-svn: 299634
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because Polly exposes parameters that directly influence tile size
calculations, one can setup situations like divide-by-zero.
Check against a possible divide-by-zero in getMacroKernelParams
and return early.
Also assert at the end of getMacroKernelParams that the block sizes
computed for matrices are positive (>= 1).
Tags: #polly
Differential Revision: https://reviews.llvm.org/D31708
llvm-svn: 299633
|
|
|
|
| |
llvm-svn: 299632
|
|
|
|
| |
llvm-svn: 299631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses two issues:
* It turned out that suspended thread may have dtls->dtv_size == kDestroyedThread (-1)
and LSan wrongly assumes that DTV is available. This leads to SEGV when LSan tries to
iterate through DTV that is invalid.
* In some rare cases GetRegistersAndSP can fail with errno 3 (ESRCH). In this case LSan
assumes that the whole stack of a given thread is available. This is wrong because ESRCH
can indicate that suspended thread was destroyed and its stack was unmapped. This patch
properly handles ESRCH from GetRegistersAndSP in order to avoid invalid accesses to already
unpapped threads stack.
Differential Revision: https://reviews.llvm.org/D30818
llvm-svn: 299630
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
"short" is defined as an xray flag, and buffer rewinding happens for both exits
and tail exits.
I've made the choice to seek backwards finding pairs of FunctionEntry, TailExit
record pairs and erasing them if the FunctionEntry occurred before exit from the
currently exiting function. This is a compromise so that we don't skip logging
tail calls if the function that they call into takes longer our duration.
This works by counting the consecutive function and function entry, tail exit
pairs that proceed the current point in the buffer. The buffer is rewound to
check whether these entry points happened recently enough to be erased.
It is still possible we will omit them if they call into a child function that
is not instrumented which calls a fast grandchild that is instrumented before
doing other processing.
Reviewers: pelikan, dberris
Reviewed By: dberris
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31345
llvm-svn: 299629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Need to save `lr` before bl to aeabi_div0
Reviewers: rengolin, compnerd
Reviewed By: compnerd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31716
llvm-svn: 299628
|
|
|
|
|
|
| |
can simplify in one direction but not the other.
llvm-svn: 299627
|
|
|
|
|
|
| |
can be treated as Xor too.
llvm-svn: 299626
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some CRT APIs are unavailable for Windows Store apps [1]. Detect when
we're targeting the Windows Store and don't try to refer to non-existent
CRT functions in that case. (This would otherwise lead to a compile
error when using the libc++ headers and compiling for Windows Store.)
[1] https://docs.microsoft.com/en-us/cpp/cppcx/crt-functions-not-supported-in-universal-windows-platform-apps
Differential Revision: https://reviews.llvm.org/D31737
llvm-svn: 299625
|
|
|
|
| |
llvm-svn: 299624
|
|
|
|
| |
llvm-svn: 299623
|
|
|
|
| |
llvm-svn: 299622
|
|
|
|
| |
llvm-svn: 299621
|
|
|
|
|
|
| |
that are already present. NFC
llvm-svn: 299620
|
|
|
|
| |
llvm-svn: 299619
|
|
|
|
| |
llvm-svn: 299618
|
|
|
|
|
|
| |
This is a regular maintenance update.
llvm-svn: 299617
|
|
|
|
|
|
| |
Just a spelling change in a comment intended to test svn commit access.
llvm-svn: 299616
|