| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: I don't think this is necessary with i1 being illegal now.
Reviewers: RKSimon, zvi, guyblank
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38784
llvm-svn: 315469
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The software pipeliner and the packetizer try to break dependence
between the post-increment instruction and the dependent memory
instructions by changing the base register and the offset value.
However, in some cases, the existing logic didn't work properly
and created incorrect offset value.
Patch by Jyotsna Verma.
llvm-svn: 315468
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pipeliner is generating a serial sequence that causes poor
register allocation when a post-increment instruction appears
prior to the use of the post-increment register. This occurs when
there is a circular set of dependences involved with a sequence
of instructions in the same cycle. In this case, there is no
serialization of the parallel semantics that will not cause an
additional register to be allocated.
This patch fixes the problem by changing the instructions so that
the post-increment instruction is used by the subsequent
instruction, which enables the register allocator to make a
better decision and not require another register.
Patch by Brendon Cahoon.
llvm-svn: 315466
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eg:
insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7}
This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector.
We may want to abandon that one if we can't find value in squashing the more specific pattern sooner.
We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles.
There may be room for improvement in the shuffle lowering here, but that would be follow-up work.
Differential Revision: https://reviews.llvm.org/D38388
llvm-svn: 315460
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NumFixedArgs field of CallLoweringInfo is used by
TargetLowering::LowerCallTo to determine whether a given argument is passed
using the vararg calling convention or not (specifically, to set IsFixed for
each ISD::OutputArg).
Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both
incorrectly set NumFixedArgs based on the _previous_ args list. Secondly,
TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying
the argument list so a pointer is passed for the return value.
If your backend uses the IsFixed property or directly accesses NumFixedArgs,
it is _possible_ this change could result in codegen changes (although the
previous behaviour would have been incorrect). No such cases have been
identified during code review for any in-tree architecture.
Differential Revision: https://reviews.llvm.org/D37898
llvm-svn: 315457
|
| |
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38779
Patch by Chih-Mao Chen.
llvm-svn: 315455
|
| |
|
|
|
|
|
|
| |
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D38620
llvm-svn: 315451
|
| |
|
|
| |
llvm-svn: 315448
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.
The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.
llvm-svn: 315445
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
control flow to successors
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
int array[LEN];
...
guard(0 <= index && index < LEN);
use(array[index]);
Differential Revision: https://reviews.llvm.org/D37460
llvm-svn: 315440
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sinking of unordered atomic load into loop must be disallowed because it turns
a single load into multiple loads. The relevant section of the documentation
is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for
Optimizers section. Here is the full text of this section:
> Notes for optimizers
> In terms of the optimizer, this **prohibits any transformation that
> transforms a single load into multiple loads**, transforms a store into
> multiple stores, narrows a store, or stores a value which would not be
> stored otherwise. Some examples of unsafe optimizations are narrowing
> an assignment into a bitfield, rematerializing a load, and turning loads
> and stores into a memcpy call. Reordering unordered operations is safe,
> though, and optimizers should take advantage of that because unordered
> operations are common in languages that need them.
Patch by Daniil Suchkov!
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D38392
llvm-svn: 315438
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IRCE should not apply when the safe iteration range is proved to be empty.
In this case we do unneeded job creating pre/post loops and then never
go to the main loop.
This patch makes IRCE not apply to empty safe ranges, adds test for this
situation and also modifies one of existing tests where it used to happen
slightly.
Reviewed By: anna
Differential Revision: https://reviews.llvm.org/D38577
llvm-svn: 315437
|
| |
|
|
|
|
|
|
| |
This fixes PR34908. Patch by Alex Crichton!
Differential Revision: https://reviews.llvm.org/D38765
llvm-svn: 315429
|
| |
|
|
|
|
| |
Should fix mingw bot.
llvm-svn: 315413
|
| |
|
|
|
|
|
|
| |
MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that,
and allows us to remove another instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.
llvm-svn: 315410
|
| |
|
|
|
|
|
|
|
|
|
| |
Of course, casting an unsigned value too large for 'int' is UB. So,
write out the ternary. LLVM folds it to ADD anyway.
Fixes the warning from r303693 a different way.
Thanks to Erich Keane for pointing this out!
llvm-svn: 315406
|
| |
|
|
|
|
| |
We can just write directly to the raw_ostream.
llvm-svn: 315399
|
| |
|
|
| |
llvm-svn: 315395
|
| |
|
|
|
|
|
| |
Since r315388 we have a shorter way to say this, so we'll replace
MI->getParent()->getParent() with MI->getMF() in a few places.
llvm-svn: 315390
|
| |
|
|
|
|
|
|
| |
Similarly to how Instruction has getFunction, this adds a less verbose
way to write MI->getParent()->getParent(). I'll follow up shortly with
a change that changes a bunch of the uses.
llvm-svn: 315388
|
| |
|
|
|
|
| |
warnings; other minor fixes (NFC).
llvm-svn: 315383
|
| |
|
|
|
|
|
|
| |
broadcast and the load.
We already have these patterns for AVX512VL, but not AVX1 or 2.
llvm-svn: 315382
|
| |
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 315380
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FindFirstFileEx/FindNextFile results on Windows.
This allows clients to avoid an unnecessary fs::status() call on each
directory entry. Because the information returned by FindFirstFileEx
is a subset of the information returned by a regular status() call,
I needed to extract a base class from file_status that contains only
that information.
On my machine, this reduces the time required to enumerate a ThinLTO
cache directory containing 520k files from almost 4 minutes to less
than 2 seconds.
Differential Revision: https://reviews.llvm.org/D38716
llvm-svn: 315378
|
| |
|
|
|
|
|
| |
This forces every user to use the new create method that returns an
Expected. This in turn propagates better error messages.
llvm-svn: 315371
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimiate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation.
Reviewers: tejohnson, davidxl
Reviewed By: tejohnson
Subscribers: sanjoy, llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D38478
llvm-svn: 315369
|
| |
|
|
|
|
|
|
| |
also checking presence of BWI instructions.
The EVEX->VEX pass probably obscures this.
llvm-svn: 315365
|
| |
|
|
| |
llvm-svn: 315364
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than using the AdditionalPredicates mechanism to guard
the microMIPS instructions, use the existing predicates to properly
guard those instructions.
This also resolves a case where an instruction pattern was incorrectly
available for microMIPS32R6, which caused a register allocation failure
as the registers specified in the pattern were not available.
Reviewers: nitesh.jain, atanasyan
Differential Revision: https://reviews.llvm.org/D38451
llvm-svn: 315362
|
| |
|
|
| |
llvm-svn: 315361
|
| |
|
|
|
|
|
|
| |
opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass.
The heuristic in the custom selector for brcond deferred the
branch uniformity check to the pattern, which would fail.
llvm-svn: 315360
|
| |
|
|
|
|
| |
<rdar://problem/34689604>
llvm-svn: 315359
|
| |
|
|
|
|
| |
These should only be used if the machine structurizer is enabled.
llvm-svn: 315357
|
| |
|
|
| |
llvm-svn: 315354
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a post-linking pass which replaces the function pointer of enqueued
block kernel with a global variable (runtime handle) and adds
runtime-handle attribute to the enqueued block kernel.
In LLVM CodeGen the runtime-handle metadata will be translated to
RuntimeHandle metadata in code object. Runtime allocates a global buffer
for each kernel with RuntimeHandel metadata and saves the kernel address
required for the AQL packet into the buffer. __enqueue_kernel function
in device library knows that the invoke function pointer in the block
literal is actually runtime handle and loads the kernel address from it
and puts it into AQL packet for dispatching.
This cannot be done in FE since FE cannot create a unique global variable
with external linkage across LLVM modules. The global variable with internal
linkage does not work since optimization passes will try to replace loads
of the global variable with its initialization value.
Differential Revision: https://reviews.llvm.org/D38610
llvm-svn: 315352
|
| |
|
|
|
|
|
|
| |
This saves a call to stat().
Differential Revision: https://reviews.llvm.org/D38715
llvm-svn: 315351
|
| |
|
|
| |
llvm-svn: 315349
|
| |
|
|
|
|
|
| |
No functionality change, it just makes it easier to use Expected in
Object.
llvm-svn: 315348
|
| |
|
|
| |
llvm-svn: 315347
|
| |
|
|
| |
llvm-svn: 315335
|
| |
|
|
| |
llvm-svn: 315332
|
| |
|
|
| |
llvm-svn: 315331
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r315288. This is part of fixing segfault introduced
in:
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/
llvm-svn: 315329
|
| |
|
|
|
|
|
| |
This reverts commit r315294. Part of fixing seg fault introduced in:
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/
llvm-svn: 315328
|
| |
|
|
|
|
|
|
|
|
| |
functions.
This makes the ownership of the resulting MCObjectWriter clear, and allows us
to remove one instance of MCObjectStreamer's bizarre "holding ownership via
someone else's reference" trick.
llvm-svn: 315327
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The pass to fix function bitcasts generates thunks for functions that
are called directly with a mismatching signature. It was also generating
thunks in cases where the function was address-taken, causing aliasing
problems in otherwise valid cases.
This patch tightens the restrictions for when the pass runs.
Reviewers: sunfish, dschuff
Subscribers: jfb, sbc100, llvm-commits, aheejin
Differential Revision: https://reviews.llvm.org/D38640
llvm-svn: 315326
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add instruction definitions for FP32 mode for recip.d and rsqrt.d.
Previously these instructions were only defined when targeting the
full 64-bit FPU model but were not guarded properly.
Reviewers: nitesh.jain, atanasyan
Differential Revision: https://reviews.llvm.org/D38400
llvm-svn: 315318
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.
Recommit after being reverted in r315299.
Differential revision: https://reviews.llvm.org/D36993
llvm-svn: 315316
|
| |
|
|
|
|
|
|
|
| |
A number of record form instructions were missing from the P9 scheduling
model. Added those instructions and marked the P9 model as complete.
Differential Revision: https://reviews.llvm.org/D38560
llvm-svn: 315313
|
| |
|
|
|
| |
Change-Id: If6fe0b6ec01f111115fb734fe31c0e152dbc165f
llvm-svn: 315311
|