| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 280603
|
|
|
|
|
|
|
| |
Use the wrapper API in IRBuilder that does meta data copy
to create new branch in LoopUnswitch.
llvm-svn: 280602
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... but instead rely on the assumptions that we derive for load/store
instructions.
Before we were able to delinearize arrays, we used GEP pointer instructions
to derive information about the likely range of induction variables, which
gave us more freedom during loop scheduling. Today, this is not needed
any more as we delinearize multi-dimensional memory accesses and as part
of this process also "assume" that all accesses to these arrays remain
inbounds. The old derive-assumptions-from-GEP code has consequently become
mostly redundant. We drop it both to clean up our code, but also to improve
compile time. This change reduces the scop construction time for 3mm in
no-asserts mode on my machine from 48 to 37 ms.
llvm-svn: 280601
|
|
|
|
|
|
|
|
|
|
| |
CGP currently drops select's MD_prof profile data when
generating conditional branch which can lead to bad
code layout. The patch fixes the issue.
Differential Revision: http://reviews.llvm.org/D24169
llvm-svn: 280600
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because the recent change about ODR type uniquing in the context,
we can reach types defined in another module during IR linking.
This triggered some assertions in case we IR link without starting
from an empty module. To alleviate that, we can self-map metadata
defined in the destination module so that they won't be visited.
Differential Revision: https://reviews.llvm.org/D23841
llvm-svn: 280599
|
|
|
|
| |
llvm-svn: 280598
|
|
|
|
| |
llvm-svn: 280597
|
|
|
|
| |
llvm-svn: 280596
|
|
|
|
| |
llvm-svn: 280595
|
|
|
|
|
|
|
|
| |
I'm not sure if this should be considered a bug in
copyImplicitOps or not, but implicit operands that are part
of the static instruction definition should not be copied.
llvm-svn: 280594
|
|
|
|
|
|
| |
AVX512 stack folding test.
llvm-svn: 280593
|
|
|
|
|
|
| |
equivalent.
llvm-svn: 280592
|
|
|
|
| |
llvm-svn: 280591
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This contains two changes that reduce the time spent in WQM, with the
intention of reducing bandwidth required by VMEM loads:
1. Sampling instructions by themselves don't need to run in WQM, only their
coordinate inputs need it (unless of course there is a dependent sampling
instruction). The initial scanInstructions step is modified accordingly.
2. When switching back from WQM to Exact, switch back as soon as possible.
This affects the logic in processBlock.
This should always be a win or at best neutral.
There are also some cleanups (e.g. remove unused ExecExports) and some new
debugging output.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D22092
llvm-svn: 280590
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.
The underlying problem is as follows: the prolog part contains the polygon
stippling sequence, i.e. a kill. The main part then enables WQM based on the
_reduced_ exec mask, effectively undoing most of the polygon stippling.
Since we cannot know whether polygon stippling will be used, the main part
of a non-monolithic shader must always return to exact mode to fix this
problem.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23131
llvm-svn: 280589
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
context.
Summary:
This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`.
Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should *never* be old thread local data because pthread *should* null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec:
>
> Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread.
>
> An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits.
>
> If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop.
However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit".
Reviewers: mclow.lists, bcraig, EricWF
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24159
llvm-svn: 280588
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D23957
llvm-svn: 280587
|
|
|
|
|
|
|
| |
This allows more of the OCML builtin library to be
constant folded.
llvm-svn: 280586
|
|
|
|
| |
llvm-svn: 280585
|
|
|
|
|
|
|
|
|
| |
readlane/writelane do not support using m0 as the output/input.
Constrain the register class of spill vregs to try to avoid this,
but also handle spilling of the physreg when necessary by inserting
an additional copy to a normal SGPR.
llvm-svn: 280584
|
|
|
|
| |
llvm-svn: 280583
|
|
|
|
| |
llvm-svn: 280581
|
|
|
|
| |
llvm-svn: 280580
|
|
|
|
| |
llvm-svn: 280579
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some Windows SDK classes, for example
Windows::Storage::Streams::IBufferByteAccess, use the ATL way of spelling
attributes:
[uuid("....")] class IBufferByteAccess {};
To be able to use __uuidof() to grab the uuid off these types, clang needs to
support uuid as a Microsoft attribute. There was already code to skip Microsoft
attributes, extend that to look for uuid and parse it. Use the new "Microsoft"
attribute type added in r280575 (and r280574, r280576) for this.
Final part of https://reviews.llvm.org/D23895
llvm-svn: 280578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test it added doesn't pass:
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15318/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Apdbdump-yaml-types.test
Command Output (stdout):
--
$ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\llvm-pdbdump.EXE" "pdb2yaml" "-tpi-stream" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB/Inputs/empty.pdb"
$ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\FileCheck.EXE" "-check-prefix=YAML" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test"
# command stderr:
D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test:36:7: error: expected string not found in input
YAML: Name: apartment
^
<stdin>:153:10: note: scanning from here
Value: 161
^
llvm-svn: 280577
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was already a function that moved attributes off the declspec into
an attribute list for attributes applying to the type, teach that function to
also move Microsoft attributes around and rename it to match its new broader
role.
Nothing uses Microsoft attributes yet, so no behavior change.
Part of https://reviews.llvm.org/D23895
llvm-svn: 280576
|
|
|
|
|
|
|
|
| |
This is for attributes in []-delimited lists preceding a class, like e.g.
`[uuid("...")] class Foo {};` Not used by anything yet, so no behavior change.
Part of https://reviews.llvm.org/D23895
llvm-svn: 280575
|
|
|
|
|
|
|
|
| |
into ParseDeclOrFunctionDefInternal() (which is called by
MaybeParseMicrosoftAttributes()), so that the attributes can be stored in
the DeclSpec. No behavior change yet, part of https://reviews.llvm.org/D23895
llvm-svn: 280574
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only intrusive thing about SparseBitVector's usage of ilist<> was
that new was usually called externally. There were no custom traits.
It seems like the reason to switch to ilist in r41855 was to avoid
pointer invalidation, but std::list<> has that feature too. Maybe
std::list<>::emplace makes this a little more obvious than it was then.
Switch over to std::list<> and simplify the code.
llvm-svn: 280573
|
|
|
|
|
|
|
|
|
| |
The comment starting with "ParseDeclarationOrFunctionDefinition -" is above
a function called ParseDeclOrFunctionDefInternal. Fix the comment by not
mentioning a function name, like the style guide requests nowadays. No behavior
change.
llvm-svn: 280572
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PowerPC assembly code in the wild, so it seems, has things like this:
bc+ 12, 28, .L9
This is a bit odd because the '+' here becomes part of the BO field, and the BO
field is otherwise the first operand. Nevertheless, the ISA specification does
clearly say that the +- hint syntax applies to all conditional-branch mnemonics
(that test either CTR or a condition register, although not the forms which
check both), both basic and extended, so this is supposed to be valid.
This introduces some asm-parser-only definitions which take only the upper
three bits from the specified BO value, and the lower two bits are implied by
the +- suffix (via some associated aliases).
Fixes PR23646.
llvm-svn: 280571
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inheriting from std::iterator uses more boiler-plate than manual
typedefs. Avoid that in both ilist_iterator and
MachineInstrBundleIterator.
This has the side effect of removing ilist_iterator from certain ADL
lookups in namespace std; calls to std::next need to be qualified by
"std::" that didn't have to before. The one case of this in-tree was
operating on a temporary, so I used the more compact operator++.
llvm-svn: 280570
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split out iplist_impl from iplist, and change SymbolTableList to inherit
directly from iplist_impl. This makes it more straightforward to add
new template paramaters to iplist [*]:
- iplist_impl takes a "base" list that provides the intrusive
functionality (usually simple_ilist<T>) and a traits class.
- iplist no longer takes a "Traits" template parameter. It only takes
the value_type, T, and instantiates iplist_impl with simple_ilist<T>
and ilist_traits<T>.
- SymbolTableList now inherits from iplist_impl, instead of iplist.
Note for out-of-tree code: if you have an iplist whose second template
parameter was *not* the default (i.e., not ilist_traits<YourT>), you
have three options:
- Stop using a custom traits class, and instead specialize
ilist_traits<YourT>. This is the usual thing to do.
- Specialize iplist<YourT> to pass your custom traits class into
iplist_impl.
- Create your own trivial list type that passes your custom traits class
into iplist_impl (see SymbolTableList<> for an example).
[*]: The eventual goal is to start tracking a sentinel bit on the
MachineInstr list even when LLVM_ENABLE_ABI_BREAKING_CHECKS is off,
which will enable MachineBasicBlock::reverse_iterator to have normal
list invalidation semantics that matching the new
iplist<>::reverse_iterator from r280032.
llvm-svn: 280569
|
|
|
|
|
|
| |
Add -mtriple=x86_64-unknown-linux-gnu for the test and move it to CodeGen/X86.
llvm-svn: 280568
|
|
|
|
|
|
| |
And use other typedefs so that the next rename has a smaller diff.
llvm-svn: 280567
|
|
|
|
|
|
|
| |
Adopt r280128 in lld, specializing ilist_alloc_traits rather than
reinventing the wheel.
llvm-svn: 280566
|
|
|
|
|
|
|
|
|
|
|
|
| |
Delete the dead code for Write(ilist_iterator) in the IR Verifier,
inline report(ilist_iterator) at its call sites in the MachineVerifier,
and use simple_ilist<>::iterator in SymbolTableListTraits.
The only remaining reference to ilist_iterator outside of the ilist
implementation is from MachineInstrBundleIterator. I'll get rid of that
in a follow-up.
llvm-svn: 280565
|
|
|
|
|
|
|
|
| |
This test was using the wrong type, and so not actually testing much.
ilist_iterator constructors weren't going through ilist_node_access, so
they didn't actually work with private inheritance.
llvm-svn: 280564
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Simple utility methods will prevent users from making mistakes when
converting element counts to byte counts.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24197
llvm-svn: 280563
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have invariants we like to guarantee for the
`ImplicitConversionKind`s in a `StandardConversionSequence`. These
weren't being upheld in code that r280553 touched, so Richard suggested
that we should fix that. See D24113.
I'm not entirely sure how to go about testing this, so no test case is
included. Suggestions welcome.
llvm-svn: 280562
|
|
|
|
|
|
| |
__attribute__((require_constant_initialization)), and apply it to memory_resource
llvm-svn: 280561
|
|
|
|
|
|
|
|
| |
These few book-III instructions are used by the Linux kernel.
Partially fixes PR24796.
llvm-svn: 280560
|
|
|
|
|
|
|
|
|
| |
dcbf has an optional hint-like field, add support for the extended form and the
associated mnemonics (dcbfl and dcbflp).
Partially fixes PR24796.
llvm-svn: 280559
|
|
|
|
|
|
|
|
|
| |
Without reductions we do not need a flat union_map schedule describing
the computation we want to perform, but can work purely on the schedule
tree. This reduces the dependence computation and scheduling time from 33ms
to 25ms. Another 30% reduction.
llvm-svn: 280558
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case we do not compute reduction dependences or dependences that are more
fine-grained than statement level dependences, we can avoid the corresponding
part of the dependence analysis all together. For the 3mm benchmark, this
reduces scheduling + dependence analysis time from 62ms to 33ms for a no-asserts
build. The majority of the compile time is anyhow spent in the LLVM backends,
when doing code generation. Nevertheless, there is no need to waste compile time
either.
llvm-svn: 280557
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
block.
Clang tests for verifying the following syntaxes:
1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not.
0xNN and NNh may come with optional U or L suffix.
2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not.
NNb may come with optional U or L suffix.
Differential Revision: https://reviews.llvm.org/D22112
llvm-svn: 280556
|
|
|
|
|
|
|
|
|
|
|
| |
1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not.
0xNN and NNh may come with optional U or L suffix.
2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not.
NNb may come with optional U or L suffix.
Differential Revision: https://reviews.llvm.org/D22112
llvm-svn: 280555
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We replace the options
-polly-code-generator=none
=isl
with the options
-polly-code-generation=none
=ast
=full
This allows us to measure the overhead of Polly itself, versus the compile
time increases due to us generating more IR and consequently the LLVM backends
spending more time on this IR.
We also use this opportunity to rename the option. The original name was
introduced at a point where we still had two code generators. CLooG and the
isl AST generator. Since we only have one AST generator left, there is no need
to distinguish between 'isl' and something else. However, being able to disable
code generation all together has been shown useful for debugging. Hence, we
rename and extend this option to make it a good fit for its new use case.
llvm-svn: 280554
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows us to perform incompatible pointer conversions when
resolving overloads in C. So, the following code will no longer fail to
compile (though it will still emit warnings, assuming the user hasn't
opted out of them):
```
void foo(char *) __attribute__((overloadable));
void foo(int) __attribute__((overloadable));
void callFoo() {
unsigned char bar[128];
foo(bar); // selects the char* overload.
}
```
These conversions are ranked below all others, so:
A. Any other viable conversion will win out
B. If we had another incompatible pointer conversion in the example
above (e.g. `void foo(int *)`), we would complain about
an ambiguity.
Differential Revision: https://reviews.llvm.org/D24113
llvm-svn: 280553
|