| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
This is used from llvm tblgen and the X86Disassembler - the only common
library (apart from TableGen, which probably doesn't make sense to have
as a dependency from a release tool (rather than a use-while-building-llvm
tool) of LLVM)
llvm-svn: 328393
|
| |
|
|
|
|
|
| |
It's implemented in Target and included from other Target headers, so the
header should be in Target.
llvm-svn: 328392
|
| |
|
|
|
|
|
| |
Both GCC and MSVC only look at the low byte of a boolean when it is
passed.
llvm-svn: 328386
|
| |
|
|
| |
llvm-svn: 328367
|
| |
|
|
| |
llvm-svn: 328366
|
| |
|
|
|
|
| |
This avoids unnecessary splitting due to uninteresting immediates.
llvm-svn: 328364
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The branch relaxation pass collects sizes of all instructions at the
beginning, before any changes have been made. It then performs one pass
over all branches to see which ones need to be extended. It does not
account for the case when a previously valid branch becomes out-of-range
due to relaxing other branches.
This approach fixes this problem by assuming from the beginning that
all extendable branches have been extended. This may cause unneeded
relaxation in some cases, but avoids iteration and recomputing instruction
sizes.
llvm-svn: 328360
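A minimal standalone sketch of the conservative strategy described above. It is not the Hexagon pass itself: the `Inst` struct, its fields, and `relaxBranches` are hypothetical names used only for illustration. Offsets are computed once using the extended size of every extendable branch, so a decision made for one branch cannot be invalidated by relaxing another.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of an instruction; not an LLVM type.
struct Inst {
  unsigned Size;         // size in bytes in the short form
  unsigned ExtendedSize; // size if extended (equal to Size for non-branches)
  bool IsBranch;
  std::size_t Target;    // index of the target instruction (branches only)
  int64_t Range;         // maximum distance in bytes reachable by the short form
};

// Decide which branches need extending in a single pass.
// All offsets pessimistically assume that every extendable branch has
// already been extended, so no iteration or size recomputation is needed.
std::vector<bool> relaxBranches(const std::vector<Inst> &Insts) {
  std::vector<int64_t> Offset(Insts.size() + 1, 0);
  for (std::size_t I = 0; I < Insts.size(); ++I)
    Offset[I + 1] = Offset[I] + Insts[I].ExtendedSize;

  std::vector<bool> Extend(Insts.size(), false);
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    if (!Insts[I].IsBranch)
      continue;
    assert(Insts[I].Target <= Insts.size() && "branch target out of bounds");
    int64_t Dist = Offset[Insts[I].Target] - Offset[I];
    if (Dist < 0)
      Dist = -Dist;
    Extend[I] = Dist > Insts[I].Range;
  }
  return Extend;
}
```

In this model a branch may be extended even though the short form would have reached once the real sizes are known, which is the "unneeded relaxation" trade-off the message mentions.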
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The HexagonExpandCondsets pass is incorrectly removing the dead
flag on a definition that is really dead, and adding a kill flag
to a use that is tied to a definition. This causes an assert later
during the machine scheduler when querying the live interval
information.
Patch by Brendon Cahoon.
llvm-svn: 328357
|
| |
|
|
| |
llvm-svn: 328356
|
| |
|
|
|
|
|
|
| |
Optimize Ry = add(Rx,#n); memw(Ry+#0) = Rz => memw(Rx+#n) = Rz.
Patch by Jyotsna Verma.
llvm-svn: 328355
|
| |
|
|
| |
llvm-svn: 328353
|
| |
|
|
|
|
| |
counterparts.
llvm-svn: 328352
|
| |
|
|
|
|
|
|
|
|
|
| |
attribute for AMDGPU
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.
Differential Revision: https://reviews.llvm.org/D43736
llvm-svn: 328349
|
| |
|
|
|
|
|
|
|
|
|
|
| |
HexagonGenMux would collapse pairs of predicated transfers if it assumed
that the predicated .new forms cannot be created. Turns out that generating
mux is preferable in almost all cases.
Introduce an option -hexagon-gen-mux-threshold that controls the minimum
distance between the instruction defining the predicate and the later of
the two transfers. If the distance is smaller than the threshold, mux will
not be generated. Set the threshold to 0 by default.
llvm-svn: 328346
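A small sketch of the threshold check described above, as a standalone model rather than the code in HexagonGenMux. The function name, the use of instruction indices as positions, and the exact way distance is measured are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model: positions are instruction indices within one block.
// Threshold stands in for -hexagon-gen-mux-threshold (0 by default).
bool shouldGenerateMux(unsigned PredDefPos, unsigned Tfr1Pos, unsigned Tfr2Pos,
                       unsigned Threshold = 0) {
  unsigned Later = std::max(Tfr1Pos, Tfr2Pos);
  assert(Later > PredDefPos && "predicate must be defined before the transfers");
  unsigned Distance = Later - PredDefPos;
  // Mux is generated only when the distance is at least the threshold.
  return Distance >= Threshold;
}
```

With the default threshold of 0 the check always succeeds, which matches the statement that generating mux is preferable in almost all cases.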
|
| |
|
|
|
|
| |
Patch by Anand Kodnani.
llvm-svn: 328344
|
| |
|
|
|
|
| |
unit
llvm-svn: 328343
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.
In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode,
adjustBBOffsetsAfter() does not calculate postOffset correctly because it
fails to account for the padding that is required for the constant pool
that immediately follows the jump table branch instruction.
Reviewers: t.p.northover, eli.friedman
Reviewed By: t.p.northover
Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D44709
llvm-svn: 328341
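To illustrate the kind of padding involved, here is a simplified model of computing the end offset of a block when the following block (for example a constant pool) has an alignment requirement. This is not the code from ARMConstantIslands; `paddingFor` and `postOffset` here are hypothetical helpers written for this note.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of padding needed to round Offset up to a multiple of 2^LogAlign.
uint32_t paddingFor(uint32_t Offset, unsigned LogAlign) {
  assert(LogAlign < 32 && "alignment exponent too large");
  uint32_t Align = 1u << LogAlign;
  return (Align - (Offset & (Align - 1))) & (Align - 1);
}

// Offset at which the next block starts, including any padding required
// by the alignment of that next block.
uint32_t postOffset(uint32_t Offset, uint32_t Size, unsigned NextLogAlign) {
  uint32_t End = Offset + Size;
  return End + paddingFor(End, NextLogAlign);
}
```

If the padding term is dropped, every offset after the jump table branch is underestimated by up to `Align - 1` bytes, which is enough to push a constant pool entry out of range.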
|
| |
|
|
|
|
|
|
|
| |
- Fix checking for vector predicate registers.
- Avoid speculating llvm.lifetime.end intrinsic.
Patch by Harsha Jagasia and Brendon Cahoon.
llvm-svn: 328339
|
| |
|
|
|
|
| |
Add missing non-VEX and (V)PMOVMSKB instructions to the pattern
llvm-svn: 328338
|
| |
|
|
|
|
|
|
|
| |
When converting an instruction to the wider version, copy any
subregisters if the original operand has a subregister.
Patch by Brendon Cahoon.
llvm-svn: 328333
|
| |
|
|
|
|
| |
function unit
llvm-svn: 328331
|
| |
|
|
|
|
| |
JSAGU/JSTC function units
llvm-svn: 328328
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds functions to allow MachineLICM to hoist invariant stores.
Currently, MachineLICM does not hoist any store instructions. However,
when storing the same value to a constant spot on the stack, the store
instruction should be considered invariant and be hoisted. The function
isInvariantStore iterates each operand of the store instruction and checks
that each register operand satisfies isCallerPreservedPhysReg. The store
may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore.
This patch also adds the PowerPC changes needed to consider the stack
register as caller preserved.
Differential Revision: https://reviews.llvm.org/D40196
llvm-svn: 328326
|
| |
|
|
| |
llvm-svn: 328324
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loads and stores can only shift the offset register by the size of the value
being loaded, but currently the DAGCombiner will reduce the width of the load
if it's followed by a trunc making it impossible to later combine the shift.
Solve this by implementing shouldReduceLoadWidth for the AArch64 backend so
that it prevents the width reduction when it would block the shift combine,
while still allowing it if reducing the load width lets us eliminate a later
sign or zero extend.
Differential Revision: https://reviews.llvm.org/D44794
llvm-svn: 328321
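The decision described above can be sketched as a standalone predicate. This is only a model of the reasoning, not the AArch64 hook: the function name and its parameters are hypothetical, and the real hook inspects SelectionDAG nodes rather than taking plain values.

```cpp
#include <cassert>

// Hypothetical model of the decision.
//   EliminatesExtend: narrowing lets a later sign/zero extend be removed.
//   HasShiftedOffset: the address is base + (index << ShiftAmount).
//   LoadBytes:        size in bytes of the value currently being loaded.
bool shouldReduceLoadWidthModel(bool EliminatesExtend, bool HasShiftedOffset,
                                unsigned ShiftAmount, unsigned LoadBytes) {
  assert(ShiftAmount < 32 && "shift amount out of range for this model");
  // Removing an extend is considered worth it regardless of the shift.
  if (EliminatesExtend)
    return true;
  // The shift can be folded into the addressing mode only when it equals
  // log2 of the loaded size; narrowing the load would break that match.
  if (HasShiftedOffset && (1u << ShiftAmount) == LoadBytes)
    return false;
  return true;
}
```

For example, an 8-byte load whose index is shifted left by 3 is kept at full width, since a narrower load could no longer absorb that shift.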
|
| |
|
|
|
|
| |
This was due to a misunderstanding: what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect num macro ops.
llvm-svn: 328320
|
| |
|
|
|
|
|
|
| |
correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units
Fixes throughput to match Agner/Fam16h-SoG as well.
llvm-svn: 328318
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into an integer constant that is moved into a floating point
register.
This patch also restores using fpcmp with immediate 0 under fp-armv8.
Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
|
| |
|
|
|
|
| |
Reviewed by @GGanesh and @craig.topper
llvm-svn: 328309
|
| |
|
|
|
|
|
|
| |
of 2 instregex entries
Found while updating D44687
llvm-svn: 328308
|
| |
|
|
|
|
| |
pipe and JFPX/JVALU function unit as well as the AGUs
llvm-svn: 328304
|
| |
|
|
|
|
|
|
|
| |
Patch by Simon Pilgrim <llvm-dev@redking.me.uk>
That is a slightly modified version of the AArch64 changes from
Simon's D44687.
llvm-svn: 328303
|
| |
|
|
|
|
|
|
| |
Windows on ARM is Thumb-only.
Differential Revision: https://reviews.llvm.org/D43005
llvm-svn: 328298
|
| |
|
|
| |
llvm-svn: 328296
|
| |
|
|
|
|
| |
Agner's data. Add missing MMX multiplies.
llvm-svn: 328295
|
| |
|
|
|
|
| |
Change pblendvb/blendvps/blendvpd to use WriteFVarBlend
llvm-svn: 328294
|
| |
|
|
| |
llvm-svn: 328293
|
| |
|
|
| |
llvm-svn: 328292
|
| |
|
|
|
|
| |
The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same, and the higher latency matches Agner's table, so I'm going with that.
llvm-svn: 328291
|
| |
|
|
|
|
| |
The SSE versions were present, but not the VEX version.
llvm-svn: 328290
|
| |
|
|
| |
llvm-svn: 328289
|
| |
|
|
|
|
| |
That removes some redundant recomputations from the passes pipeline.
llvm-svn: 328272
|
| |
|
|
|
|
| |
account for r328254
llvm-svn: 328260
|
| |
|
|
|
|
|
|
|
|
| |
VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD
This makes the Y position consistent with other instructions.
This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary.
llvm-svn: 328254
|
| |
|
|
|
|
|
|
| |
assigned Port015 instead of Port01.
The VEC ADD and VEC MUL units aren't present on port 5 on SkylakeClient.
llvm-svn: 328241
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This pass sinks COPY instructions into a successor block, if the COPY is not
used in the current block and the COPY is live-in to a single successor
(i.e., doesn't require the COPY to be duplicated). This avoids executing the
copy on paths where its result isn't needed. This also exposes
additional opportunities for dead copy elimination and shrink wrapping.
These copies were either not handled by or are inserted after the MachineSink
pass. As an example of the former case, the MachineSink pass cannot sink
COPY instructions with allocatable source registers; for AArch64 these types
of copy instructions are frequently used to move function parameters (PhyReg)
into virtual registers in the entry block.
For the machine IR below, this pass will sink %w19 in the entry into its
successor (%bb.1) because %w19 is only live-in in %bb.1.
```
%bb.0:
%wzr = SUBSWri %w1, 1
%w19 = COPY %w0
Bcc 11, %bb.2
%bb.1:
Live Ins: %w19
BL @fun
%w0 = ADDWrr %w0, %w19
RET %w0
%bb.2:
%w0 = COPY %wzr
RET %w0
```
As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be
able to see %bb.0 as a candidate.
With this change I observed 12% more shrink-wrapping candidates and 13% more dead copies deleted in spec2000/2006/2017 on AArch64.
Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz
Reviewed By: sebpop
Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D41463
llvm-svn: 328237
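The sinking condition from the summary can be modelled with a few lines of standalone C++. This is a sketch of the rule only, not the pass: `Block` and `findSinkTarget` are hypothetical names, registers are plain strings, and the other legality checks a real pass needs (such as the source register being modified between the COPY and the end of the block) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a successor block: only its live-in set matters here.
struct Block {
  std::set<std::string> LiveIns;
};

// Returns the index of the single successor into which a COPY defining Reg
// can be sunk, or -1 if the COPY must stay where it is.
int findSinkTarget(const std::string &Reg, bool UsedInCurrentBlock,
                   const std::vector<Block> &Succs) {
  assert(!Reg.empty() && "register name must not be empty");
  if (UsedInCurrentBlock)
    return -1;
  int Target = -1;
  for (std::size_t I = 0; I < Succs.size(); ++I) {
    if (!Succs[I].LiveIns.count(Reg))
      continue;
    if (Target != -1)
      return -1; // live-in to more than one successor: would need duplication
    Target = static_cast<int>(I);
  }
  return Target;
}
```

Applied to the machine IR above, `%w19` is not used in `%bb.0` and is live-in only to `%bb.1`, so the model picks `%bb.1` as the sink target.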
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As in the SystemZ backend, correctly propagate node ids when inserting new
unselected nodes into the DAG during instruction selection for the X86
target.
Fixes PR36865.
Reviewers: jyknight, craig.topper
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D44797
llvm-svn: 328233
|
| |
|
|
|
|
| |
to as best as I understand how they are implemented.
llvm-svn: 328231
|
| |
|
|
|
|
| |
scheduled through the JFPU1 pipe
llvm-svn: 328226
|
| |
|
|
|
|
| |
The ymm instructions are double pumped as well.
llvm-svn: 328222
|