We can always pick the passthru value if the mask is undef: we are
permitted to treat the mask as if it were filled with zeros.
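For illustration only (this is not the actual patch), a minimal InstSimplify-style sketch of the fold; the helper name is made up, and the operand indices follow the llvm.masked.load intrinsic (ptr, align, mask, passthru):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/IntrinsicInst.h"
  using namespace llvm;

  // An undef mask may be treated as all-zero: no lanes are loaded, so the
  // result is simply the passthru operand.
  static Value *simplifyMaskedLoad(IntrinsicInst &II) {
    Value *Mask = II.getArgOperand(2);     // mask operand
    Value *Passthru = II.getArgOperand(3); // value for masked-off lanes
    if (isa<UndefValue>(Mask))
      return Passthru;
    return nullptr; // no simplification
  }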
llvm-svn: 275379
instructions instead of creating CodeGenOnly instructions.
llvm-svn: 275378
Apparently someone miscounted the number of zeros in the immediate.
Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 .
llvm-svn: 275376
Use the replacement pass to update the tests, and delete old names.
llvm-svn: 275375
Mesa removed this path, so nothing is using these anymore.
llvm-svn: 275372
llvm-svn: 275371
llvm-svn: 275369
Summary:
In this patch we implement the following parts of XRay:
- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches (a small usage sketch follows the caveats below).
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.
There are some caveats here:
1) We don't handle PATCHABLE_RET on platforms other than x86_64 yet -- this means that if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time so that it is unpacked by default on platforms where XRay is not available yet.
2) The generated section for X86 is different from what is described in the white paper, for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper in this respect to allow us to get richer information from the runtime library.
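As a purely illustrative aside (not part of this patch), the attributes described above could be attached from C++ roughly like this; the helper names are invented:

  #include "llvm/IR/Function.h"
  #include <string>
  using namespace llvm;

  // Force XRay to always instrument F, regardless of its size.
  void markXRayAlways(Function &F) {
    F.addFnAttr("function-instrument", "xray-always");
  }

  // Set the minimum IR instruction count at which F gets instrumented.
  void setXRayThreshold(Function &F, unsigned MinInstrCount) {
    F.addFnAttr("xray-instruction-threshold", std::to_string(MinInstrCount));
  }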
Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk
Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D19904
llvm-svn: 275367
Thanks to Eli for the suggestion!
llvm-svn: 275366
values are constant.
This should now also work with the interprocedural variant of the pass.
Slightly easier now that the yak is shaved.
Differential Revision: http://reviews.llvm.org/D22329
llvm-svn: 275363
llvm-svn: 275361
http://reviews.llvm.org/D22315
llvm-svn: 275360
Summary:
In Scalarizer::gather we see if we already have a scattered form of Op,
and in that case use the new form.
In the particular case of PR28108, the found ValueVector SV has size 2,
where the first Value is nullptr, and the second is indeed a proper Value.
The nullptr then caused an assert to blow up when we tried to do
cast<Instruction>(SV[I]).
With this patch we check SV[I] before doing the cast, and if it's nullptr
we just skip over it (see the sketch below).
I don't know the Scalarizer well enough to know if this is the best fix
or if something should be done elsewhere to prevent the nullptr from
being in the ValueVector at all, but at least this avoids the crash,
and the test case output looks reasonable.
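A rough sketch of the guard; the helper and its callback are hypothetical, and the real change lives inside Scalarizer::gather:

  #include "llvm/ADT/STLExtras.h"
  #include "llvm/IR/Instruction.h"
  #include <vector>
  using namespace llvm;

  // Visit the scattered elements of SV, skipping entries that were never
  // filled in (PR28108) so cast<Instruction> only ever sees non-null values.
  static void forEachScatteredInst(const std::vector<Value *> &SV,
                                   function_ref<void(Instruction &)> Fn) {
    for (Value *V : SV) {
      if (!V)
        continue;                // missing element: just skip it
      Fn(*cast<Instruction>(V)); // V is known non-null here
    }
  }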
Reviewers: hfinkel, frasercrmck, wala, mehdi_amini
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21518
llvm-svn: 275359
llvm-svn: 275357
While some code paths in MIRParserImpl::parse() already returned nullptr
in case of error, one of the important ones did not.
llvm-svn: 275355
This adds Clang-specific DWARF constants for nullability and ObjC
class properties that are already generated by clang. This patch adds
dwarfdump support and a more comprehensive testcase.
<rdar://problem/27335745>
llvm-svn: 275354
Should fix the bots broken by r275316.
llvm-svn: 275353
We can constant fold a masked load if the operands are appropriately
constant.
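A rough per-lane sketch of such a fold, assuming the load has already been resolved to a constant vector; the helper name is illustrative, not the actual patch:

  #include "llvm/IR/Constants.h"
  using namespace llvm;

  // With a constant mask, each result lane is either the constant loaded
  // element (mask lane is 1) or the passthru element (mask lane is 0).
  static Constant *foldMaskedLoadLane(Constant *LoadedVec, Constant *Passthru,
                                      Constant *Mask, unsigned Lane) {
    Constant *M = Mask->getAggregateElement(Lane);
    if (!M)
      return nullptr; // out-of-range lane or unexpected mask
    if (M->isNullValue())
      return Passthru->getAggregateElement(Lane);
    if (M->isOneValue())
      return LoadedVec->getAggregateElement(Lane);
    return nullptr;   // e.g. an undef mask lane: give up
  }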
Differential Revision: http://reviews.llvm.org/D22324
llvm-svn: 275352
TargetMachine.cpp
Avoid exposing a cl::opt in a public header and instead promote this
option to the API.
Alternatively, we could land the cl::opt in CommandFlags.h so that
it is available to every tool, but we would still have to find an
option for clang.
llvm-svn: 275348
enabled.
IPRA tries to optimize caller-saved registers by propagating register
usage information from callee to caller, so it is beneficial to prefer
caller-saved registers over callee-saved registers when IPRA is enabled.
A more detailed explanation is available here:
https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/tjAJqb0eEgAJ.
This change makes local functions have no callee-preserved registers
when IPRA is enabled. A simple test case is also added to verify this
change.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21561
llvm-svn: 275347
llvm-svn: 275346
offsets
Treat loads which clip before the start of a global initializer the same
way we treat clipping beyond the end of the initializer: use zeros.
llvm-svn: 275345
This transform doesn't require any new instructions, so it can safely live
in InstSimplify.
llvm-svn: 275344
llvm-svn: 275343
Code cleanup: Move references to SlotMapping and SourceMgr into the
PerFunctionMIParsingState to avoid unnecessary passing around in
parameters.
llvm-svn: 275342
If a masked load is not added to the chain, it should not reset the chain's
root.
This fixes the remaining part of PR28515.
llvm-svn: 275340
The code was pretty much copy-pasted between SCCP and IPSCCP. The situation
became clearly worse after I introduced the support for folding structs in
SCCP. This commit is NFC as we currently (still) skip the replacement
step in IPSCCP, but I'll change this soon.
llvm-svn: 275339
llvm-svn: 275337
llvm-svn: 275335
llvm-svn: 275334
Use range-based for loops and llvm::any_of instead of explicit
iterators.
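For reference, a tiny example of the llvm::any_of style this cleanup moves to; the container and predicate here are made up:

  #include "llvm/ADT/STLExtras.h"
  #include <vector>

  // Replaces a hand-written iterator loop that scans for a negative entry.
  static bool hasNegativeEntry(const std::vector<int> &Values) {
    return llvm::any_of(Values, [](int V) { return V < 0; });
  }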
llvm-svn: 275332
Summary:
Previously it would say we had an invariant load if any of the memory
operands were invariant. But the load should be invariant only if *all*
the memory operands are invariant.
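A simplified sketch of the corrected condition; the real backend check involves more than this, so treat it as illustrative only:

  #include "llvm/ADT/STLExtras.h"
  #include "llvm/CodeGen/MachineInstr.h"
  #include "llvm/CodeGen/MachineMemOperand.h"
  using namespace llvm;

  // The load counts as invariant only when *all* of its memory operands
  // are invariant, not when any single one is.
  static bool hasOnlyInvariantMemOperands(const MachineInstr &MI) {
    return !MI.memoperands_empty() &&
           all_of(MI.memoperands(), [](const MachineMemOperand *MMO) {
             return MMO->isInvariant();
           });
  }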
No testcase because this has proven to be very difficult to tickle in
practice. As just one example, ARM's ldrd instruction, which loads 64
bits into two 32-bit regs, is theoretically affected by this. But when
it's produced, it loses its memoperands' invariance bits!
Reviewers: jfb
Subscribers: llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D22318
llvm-svn: 275331
Code cleanup: The PerFunctionMIParsingState is per function; by moving a
reference into PFS we can avoid passing around the MachineFunction in an
extra parameter most of the time.
Also change most signatures to consistently pass the PFS reference first.
llvm-svn: 275329
In fact, don't even pass this to the ctor since we can get it from the
module.
llvm-svn: 275326
llvm-svn: 275325
llvm-svn: 275322
This happens to make X86CallFrameOptimization run in -O0 / FastISel builds as well,
but it's not clear if the pass should run in that setup.
http://reviews.llvm.org/D22314
llvm-svn: 275320
Summary:
LSV used to abort vectorizing a chain for interleaved load/store accesses that alias.
Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain.
Reviewers: llvm-commits, jlebar, arsenm
Subscribers: mzolotukhin
Differential Revision: http://reviews.llvm.org/D22119
llvm-svn: 275317
See http://reviews.llvm.org/D22079
Changes the Archive::child_begin and Archive::children to require a reference
to an Error. If iterator increment fails (because the archive header is
damaged) the iterator will be set to 'end()', and the error stored in the
given Error&. The Error value should be checked by the user immediately after
the loop. E.g.:
  Error Err;
  for (auto &C : A->children(Err)) {
    // Do something with archive child C.
  }
  // Check the error immediately after the loop.
  if (Err)
    return Err;
Failure to check the Error will result in an abort() when the Error goes out of
scope (as guaranteed by the Error class).
llvm-svn: 275316
Currently the MIR framework prints all its outputs (errors and actual
representation) on stderr.
This patch fixes that by printing the regular output to the output file
specified with -o.
Differential Revision: http://reviews.llvm.org/D22251
llvm-svn: 275314
llvm-svn: 275309
llvm-svn: 275308
llvm-svn: 275307
llvm-svn: 275304
In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean
way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the
direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could
move that helper function to a higher level.
There is an open question as to which of these forms should be considered the canonical IR:
  %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
  %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
Differential Revision: http://reviews.llvm.org/D22114
llvm-svn: 275289
Summary:
v2: don't count SGPRs spilled to scratch twice
I think this is sufficient. It doesn't count private memory usage, which
happens often and uses scratch but isn't technically a spill. The private
memory usage can be computed by:
[scratch_per_thread - vgpr_spills - a random multiple of SGPR spills].
The fact that SGPR spills add very high numbers to the scratch size makes
that computation a guessing game, but I don't have a solution to that.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D22197
llvm-svn: 275288
Patch by Sunita Marathe
Differential Revision: http://reviews.llvm.org/D21920
llvm-svn: 275284
This fixes http://llvm.org/PR28524
llvm-svn: 275278
We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading.
One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the
same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if
it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the
test cases is filed as PR28505:
https://llvm.org/bugs/show_bug.cgi?id=28505
Differential Revision: http://reviews.llvm.org/D22225
llvm-svn: 275276
This is a simplification; there should be no functional change.
llvm-svn: 275273