| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Also stop trying to insert skip blocks at end_cf. This
was inserting them at the end of the block which doesn't make
sense. The skip should be inserted at the beginning of the block
right after the end cf. Just remove this for now since no tests
seem to stress this and I think this can be handled more generally
later.
Fixes bug 28550
llvm-svn: 275510
|
| |
|
|
|
|
| |
Found while reducing bug 28550
llvm-svn: 275509
|
| |
|
|
| |
llvm-svn: 275508
|
| |
|
|
|
|
|
|
|
| |
If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now
only Kryo has both), set COPY (W|X)ZR isAsCheapAsAMove.
Differential Revision: http://reviews.llvm.org/D22360
llvm-svn: 275503
|
| |
|
|
|
|
|
|
| |
This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better.
This was incorrectly reverted in rL275421 during triage of PR28552.
llvm-svn: 275497
|
| |
|
|
|
|
| |
Yet again.
llvm-svn: 275463
|
| |
|
|
|
|
|
|
| |
On Hexagon is it legal to packetize the instructions setting up call
arguments with the call instruction itself. This was already done,
except for tail calls. Make sure tail calls are handled as well.
llvm-svn: 275458
|
| |
|
|
|
|
| |
Enable use-postra-scheduler. (NFC)
llvm-svn: 275457
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Make the target-specific flags in MachineMemOperand::Flags real, bona
fide enum values. This simplifies users, prevents various constants
from going out of sync, and avoids the false sense of security provided
by declaring static members in classes and then forgetting to define
them inside of cpp files.
Reviewers: MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22372
llvm-svn: 275451
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only perform struct field check on Identifier tokens.
Fixes PR28547.
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22361
llvm-svn: 275445
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Give it a shorter name (because we're going to refer to it often from
SelectionDAG and friends).
- Split the flags and alignment into separate variables.
- Specialize FlagsEnumTraits for it, so we can do bitwise ops on it
without losing type information.
- Make some enum values constants in MachineMemOperand instead.
MOMaxBits should not be a valid Flag.
- Simplify some of the bitwise ops for dealing with Flags.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22281
llvm-svn: 275438
|
| |
|
|
|
|
| |
Typo meant we were only checking the low byte (repeatedly).
llvm-svn: 275437
|
| |
|
|
|
|
| |
https://reviews.llvm.org/D22362
llvm-svn: 275431
|
| |
|
|
| |
llvm-svn: 275429
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We were able to assemble, but not disassemble.
Note that fixupRMValue was truncating EA_REG_BND0-3 because we hit
the uint8_t max. The control registers were already squarely above
it, but I don't think they ever go in .r/m, only in .reg.
I also did notice an extra REX.W in our encoding, but I think that's
fine.
llvm-svn: 275427
|
| |
|
|
|
|
|
| |
Nothing in-tree can tell the difference, but it's incorrect: the
addressing mode registers aren't what's defined.
llvm-svn: 275426
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change fix bug 28538
Reviewers: tstellarAMD, vpykhtin
Subscribers: arsenm, kzhuravl
Differential Revision: https://reviews.llvm.org/D22355
llvm-svn: 275422
|
| |
|
|
| |
llvm-svn: 275421
|
| |
|
|
|
|
|
| |
stdcall is callee-pop like thiscall, so the thiscall changes already did most
of the work for this. This change only opts stdcall in and adds tests.
llvm-svn: 275414
|
| |
|
|
| |
llvm-svn: 275412
|
| |
|
|
|
|
| |
This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better.
llvm-svn: 275411
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It was recently discovered that, for Mips's SelectionDAGISel subclasses,
all optimization levels caused SelectionDAGISel to behave like -O2.
This change adds the necessary plumbing to initialize the optimization level.
Reviewers: andrew.w.kaylor
Subscribers: andrew.w.kaylor, sdardis, dean, llvm-commits, vradosavljevic, petarj, qcolombet, probinson, dsanders
Differential Revision: https://reviews.llvm.org/D14900
llvm-svn: 275410
|
| |
|
|
|
|
|
|
| |
64-bits to allow combining
Primarily this is to allow blend with zero instead of having to use vperm2f128, but we can use this in the future to deal with AVX512 cases where we need to keep the original element size to correctly fold masked operations.
llvm-svn: 275406
|
| |
|
|
| |
llvm-svn: 275400
|
| |
|
|
| |
llvm-svn: 275391
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.
Differential Revision: http://reviews.llvm.org/D21183
llvm-svn: 275382
|
| |
|
|
|
|
| |
instructions instead of creating CodeGenOnly instructions.
llvm-svn: 275378
|
| |
|
|
|
|
|
| |
Apparently someone miscounted the number of zeros in the immediate.
Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 .
llvm-svn: 275376
|
| |
|
|
|
|
| |
Use the replacement pass to update the tests, and delete old names.
llvm-svn: 275375
|
| |
|
|
|
|
| |
Mesa removed this path, so nothing is using these anymore.
llvm-svn: 275372
|
| |
|
|
| |
llvm-svn: 275371
|
| |
|
|
| |
llvm-svn: 275369
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In this patch we implement the following parts of XRay:
- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.
There are some caveats here:
1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.
2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.
Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk
Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D19904
llvm-svn: 275367
|
| |
|
|
|
|
| |
http://reviews.llvm.org/D22315
llvm-svn: 275360
|
| |
|
|
|
|
|
|
|
|
|
|
| |
TargetMachine.cpp
Avoid exposing a cl::opt in a public header and instead promote this
option in the API.
Alternatively, we could land the cl::opt in CommandFlags.h so that
it is available to every tool, but we would still have to find an
option for clang.
llvm-svn: 275348
|
| |
|
|
|
|
|
|
|
| |
This happens to make X86CallFrameOptimization in -O0 / FastISel builds as well,
but it's not clear if the pass should run in that setup.
http://reviews.llvm.org/D22314
llvm-svn: 275320
|
| |
|
|
| |
llvm-svn: 275309
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
v2: don't count SGPRs spilled to scratch twice
I think this is sufficient. It doesn't count private memory usage, which
happens often and uses scratch but isn't technically a spill. The private
memory usage can be computed by:
[scratch_per_thread - vgpr_spills - a random multiple of SGPR spills].
The fact SGPR spills add very high numbers to the scratch size make that
computation a guessing game, but I don't have a solution to that.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D22197
llvm-svn: 275288
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading.
One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the
same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if
it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the
test cases is filed as PR28505:
https://llvm.org/bugs/show_bug.cgi?id=28505
Differential Revision: http://reviews.llvm.org/D22225
llvm-svn: 275276
|
| |
|
|
| |
llvm-svn: 275272
|
| |
|
|
|
|
|
|
| |
shuffle mask
Added AVX512F VPERMILPS shuffle decoding support
llvm-svn: 275270
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21484
llvm-svn: 275268
|
| |
|
|
|
|
| |
Removes a return-on-fail that was making it tricky to add other variable mask shuffles.
llvm-svn: 275262
|
| |
|
|
| |
llvm-svn: 275253
|
| |
|
|
|
|
|
|
| |
- Add new TTI instruction checks
- Don't use const for blocks that are mutated.
- Checking isBranch and isTerminator should be redundant
llvm-svn: 275252
|
| |
|
|
|
|
|
|
| |
zext/sext with 256-bit source types producing a 256-bit result.
These patterns just extracted the source down to 128-bits to use the instructions. AVX512 seems to have blindly copied them over for VLX, but did not create similar patterns for 512-bit sources. So I'm hoping the backend can't actually produce these cases.
llvm-svn: 275240
|
| |
|
|
|
|
| |
I meant to squash this into it.
llvm-svn: 275220
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D20239
It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that
are useful in some cases for inserting and extracting vector elements of
v4[if]32 vectors.
llvm-svn: 275215
|
| |
|
|
| |
llvm-svn: 275212
|
| |
|
|
|
|
|
|
|
|
| |
With r274952 and r275201 in place there are no cases left where a
forward liveness analysis yields different results than a backward one.
So we can remove the forward stepping logic.
Differential Revision: http://reviews.llvm.org/D22083
llvm-svn: 275204
|