| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and extload has multi users
In function DAGCombiner::visitSIGN_EXTEND_INREG, sext can be combined with extload even if sextload is not supported by target, then
if sext is the only user of extload, there is no big difference, no harm no benefit.
if extload has more than one user, the combined sextload may block extload from combining with other zext, causes extra zext instructions generated. As demonstrated by the attached test case.
This patch add the constraint that when sextload is not supported by target, sext can only be combined with extload if it is the only user of extload.
Differential Revision: https://reviews.llvm.org/D39108
llvm-svn: 316802
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were handling the non-hidden case in lib/Target/TargetMachine.cpp,
but the hidden case was handled in architecture dependent code and
only X86_64 and AArch64 were covered.
While it is true that some code sequences in some ABIs might be able
to produce the correct value at runtime, that doesn't seem to be the
common case.
I left the AArch64 code in place since it also forces a got access for
non-pic code. It is not clear if that is needed, but it is probably
better to change that in another commit.
llvm-svn: 316799
|
| |
|
|
|
|
| |
of i16 and i32/i64 are only tested by larger tests.
llvm-svn: 316796
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
ValueTracking was recognizing not all variations of clamp. Swapping of
true value and false value of select was added to fix this problem. The
first patch was reverted because it caused miscompile in NVPTX target.
Added corresponding test cases.
Reviewers: spatel, majnemer, efriedma, reames
Subscribers: llvm-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D39240
llvm-svn: 316795
|
| |
|
|
| |
llvm-svn: 316789
|
| |
|
|
|
|
|
| |
Making sure that an instruction has fewer operands than required, then
attempting to access one out of range is going to fail.
llvm-svn: 316785
|
| |
|
|
|
|
| |
We should be able to do this by re-materializing an all-bits vector and then blending with it
llvm-svn: 316779
|
| |
|
|
| |
llvm-svn: 316763
|
| |
|
|
| |
llvm-svn: 316753
|
| |
|
|
| |
llvm-svn: 316749
|
| |
|
|
|
|
|
|
|
|
|
| |
Not having the subclass data on an MemIntrinsicSDNodes means it was possible
to try to fold 2 nodes with the same operands but differing MMO flags. This
would trip an assertion when trying to refine the alignment between the 2
MachineMemOperands.
Differential Revision: https://reviews.llvm.org/D38898
llvm-svn: 316737
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As far as I can tell, this matches gcc: -mfloat-abi determines the
calling convention for all functions except those explicitly defined as
soft-float in the ARM RTABI.
This change only affects cases where the user specifies -mfloat-abi to
override the default calling convention derived from the target triple.
Fixes https://bugs.llvm.org//show_bug.cgi?id=34530.
Differential Revision: https://reviews.llvm.org/D38299
llvm-svn: 316708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
64-bit extensions.
If the extend type is 64-bits, emit a 32-bit -> 64-bit extend after the UDIVREM8_ZEXT_HREG/UDIVREM8_SEXT_HREG operation.
This gives a shorter encoding for the second extend in the sext case, and allows us to completely remove the second extend in the zext case.
This also adds known bit and num sign bits support for UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG.
Differential Revision: https://reviews.llvm.org/D38275
llvm-svn: 316702
|
| |
|
|
|
|
|
|
| |
Instead of loading (a potential ton of) scalar constants, load those as a vector and then insert into it.
Differential Revision: https://reviews.llvm.org/D38756
llvm-svn: 316685
|
| |
|
|
|
|
| |
This should have been committed with memory model implementation
llvm-svn: 316680
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we do not represent runtime preemption in the IR, which has several
drawbacks:
1) The semantics of GlobalValues differ depending on the object file format
you are targeting (as well as the relocation-model and -fPIE value).
2) We have no way of disabling inlining of run time interposable functions,
since in the IR we only know if a function is link-time interposable.
Because of this llvm cannot support elf-interposition semantics.
3) In LTO builds of executables we will have extra knowledge that a symbol
resolved to a local definition and can't be preemptable, but have no way to
propagate that knowledge through the compiler.
This patch adds preemptability specifiers to the IR with the following meaning:
dso_local --> means the compiler may assume the symbol will resolve to a
definition within the current linkage unit and the symbol may be accessed
directly even if the definition is not within this compilation unit.
dso_preemptable --> means that the compiler must assume the GlobalValue may be
replaced with a definition from outside the current linkage unit at runtime.
To ease transitioning dso_preemptable is treated as a 'default' in that
low-level codegen will still do the same checks it did previously to see if a
symbol should be accessed indirectly. Eventually when IR producers emit the
specifiers on all Globalvalues we can change dso_preemptable to mean 'always
access indirectly', and remove the current logic.
Differential Revision: https://reviews.llvm.org/D20217
llvm-svn: 316668
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D39171
llvm-svn: 316666
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PR35071 exposed the fact that MipsInstrInfo::removeBranch did not walk past
debug instructions when removing branches for the control flow optimizer, which
lead to duplicated conditional branches. If the target of the branch was a
removable block, only the conditional branch in the terminating position would
have it's MBB operands updated, leaving the first branch with a dangling MBB
operand. The MIPS long branch pass would then trigger an assertion when
attempting to examine the instruction with dangling MBB operand.
This resolves PR35071.
Thanks to Alex Richardson for reporting the issue!
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D39288
llvm-svn: 316654
|
| |
|
|
|
|
|
|
|
|
|
| |
Greater-or-Equal 1
Currently a record-form instruction is used for comparison of "greater than -1" and "less than 1" by modifying the predicate (e.g. LT 1 into LE 0) in addition to the naive case of comparison against 0.
This patch also enables emitting a record-form instruction for "less than or equal to -1" (i.e. "less than 0") and "greater than or equal to 1" (i.e. "greater than 0") to increase the optimization opportunities.
Differential Revision: https://reviews.llvm.org/D38941
llvm-svn: 316647
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
On FreeBSD11.0 the FileCheck NOT string "1.0" will be matched by
`.amd_amdgpu_isa "amdgcn-unknown-freebsd11.0--gfx802"` at the end of the
file. Add a CHECK for that directive to avoid failing the test.
Reviewers: rampitec, kzhuravl
Reviewed By: rampitec, kzhuravl
Subscribers: emaste, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, krytarowski
Differential Revision: https://reviews.llvm.org/D39306
llvm-svn: 316616
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
In getOffsetRange, Max can be set to 0 to force the extender replacement
to be at or below the original value. This would cause the new offset to
be non-negative, which is preferred for memory instructions (to reduce
the likelihood of it getting constant-extended due to predication). The
problem happens when the range is shifted by an offset (present in the
instruction being examined) and the offset is negative. The entire range
for the allowable deviation will then be strictly negative. This creates
a problem, since 0 is assumed to be a valid deviation.
llvm-svn: 316601
|
| |
|
|
| |
llvm-svn: 316590
|
| |
|
|
|
|
|
| |
- memory-legalizer-atomic-load.ll -> memory-legalizer-load.ll
- memory-legalizer-atomic-store.ll -> memory-legalizer-store.ll
llvm-svn: 316586
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
overflows signed char
In a case when number of output constraint operands that has matched input operands
doesn't fit to signed char, TargetLowering::ParseConstraints() can try to access
ConstraintOperands (that is std::vector) with negative index.
Reviewers: rampitec, arsenm
Differential Review: https://reviews.llvm.org/D39125
llvm-svn: 316574
|
| |
|
|
|
|
|
|
| |
Remove the G_FADD testcases from arm-legalizer.mir, they are covered by
arm-legalizer-fp.mir (I probably forgot to delete them when I created
that test).
llvm-svn: 316573
|
| |
|
|
|
|
|
| |
No need to check register classes in the register block anymore, since
we can now much more conveniently check them at their def.
llvm-svn: 316572
|
| |
|
|
|
|
|
|
| |
We were generating BLX for all the calls, which was incorrect in most
cases. Update ARMCallLowering to generate BL for direct calls, and BLX,
BX_CALL or BMOVPCRX_CALL for indirect calls.
llvm-svn: 316570
|
| |
|
|
|
|
|
|
|
|
|
| |
Separate the test cases that deal with calls from the rest of the IR
Translator tests.
We split into 2 different files, one for testing parameter and result
lowering, and one for testing the various different kinds of calls that
can occur (BL, BLX, BX_CALL etc).
llvm-svn: 316569
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
(1)"
Compute the actual decomposition only after deciding whether to expand
of not. Else, it's easy to make the compiler OOM with:
`memcpy(dst, src, 0xffffffffffffffff);`, which typically happens if
someone mistakenly passes a negative value. Add a test.
This reverts commit f8fc02fbd4ab33383c010d33675acf9763d0bd44.
llvm-svn: 316567
|
| |
|
|
|
|
|
|
|
|
| |
Swap the compare operands if the lhs is a shift and the rhs isn't,
as in arm and T2 the shift can be performed by the compare for its
second operand.
Differential Revision: https://reviews.llvm.org/D39004
llvm-svn: 316562
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the dllimport attribute did the right thing in terms
of treating it as a pointer to a value, but this makes sure the
names get mangled properly, and calls to such functions load the
function from the __imp_ pointer.
This is based on SVN r212431 and r212430 where the same was
implemented for ARM.
Differential Revision: https://reviews.llvm.org/D38530
llvm-svn: 316555
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This code added in r297930 assumed that it could create
a select with a condition type that is just an integer
bitcast of the selected type. For AMDGPU any vselect is
going to be scalarized (although the vector types are legal),
and all select conditions must be i1 (the same as getSetCCResultType).
This logic doesn't really make sense to me, but there's
never really been a consistent policy in what the select
condition mask type is supposed to be. Try to extend
the logic for skipping the transform for condition types
that aren't setccs. It doesn't seem quite right to me though,
but checking conditions that seem more sensible (like whether the
vselect is going to be expanded) doesn't work since this
seems to depend on that also.
llvm-svn: 316554
|
| |
|
|
|
|
|
|
|
|
| |
If particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.
Differential Revision: https://reviews.llvm.org/D39026
llvm-svn: 316495
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Broadwell CPU.
Adding the scheduling information for the Browadwell (BDW) CPU target.
This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target.
We used the scheduling information retrieved from the Broadwell architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each BDW instruction.
The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175.
Performance fluctuations may be expected due to code alignment effects.
Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D39054
Change-Id: If6f799e5ff60e1091c8d43b05ea78c53581bae01
llvm-svn: 316492
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This updates the MIRPrinter to include the regclass when printing
virtual register defs, which is already valid syntax for the
parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,
%1(s64) = COPY %0(s64)
would now be written as
%1:gpr(s64) = COPY %0(s64)
While this change alone introduces a bit of redundancy with the
registers block, it allows us to update the tests to be more concise
and understandable and brings us closer to being able to remove the
registers block completely.
Note: We generally only print the class in defs, but there is one
exception. If there are uses without any defs whatsoever, we'll print
the class on all uses. I'm not completely convinced this comes up in
meaningful machine IR, but for now the MIRParser and MachineVerifier
both accept that kind of stuff, so we don't want to have a situation
where we can print something we can't parse.
llvm-svn: 316479
|
| |
|
|
|
|
|
|
|
|
|
|
| |
If we have the situation where a Swap feeds a Splat we can sometimes change the
index on the Splat and then remove the Swap instruction.
Fixed the test case that was failing and recommit after pulling the original
commit.
Original revision is here: https://reviews.llvm.org/D39009
llvm-svn: 316478
|
| |
|
|
| |
llvm-svn: 316462
|
| |
|
|
| |
llvm-svn: 316457
|
| |
|
|
|
|
|
|
| |
just PACKSSWB.
By using the widest type possible for PACKSS truncation we have a better chance of being able to peek through bitcasts and improves other combines driven by ComputeNumSignBits.
llvm-svn: 316448
|
| |
|
|
|
|
| |
...because every swiss cheese has different holes.
llvm-svn: 316446
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D39051
llvm-svn: 316435
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r264440 added or/and patterns for storing -1 or 0 with the intention of decreasing code size. However,
X86CallFrameOptimization does not recognize these memory accesses so it will not replace them with push's when profitable.
This patch fixes this problem by teaching X86CallFrameOptimization these store 0/-1 idioms.
An alternative fix would be to prevent the 'store 0/1 idioms' patterns from firing when accessing the stack. This would save
the need to teach the pass about these idioms. However, because X86CallFrameOptimization does not always fire we may result
in cases where neither X86CallFrameOptimization not the patterns for 'store 0/1 idioms' fire.
Fixes pr34863
Reviewers: DavidKreitzer, guyblank, aymanmus
Reviewed By: aymanmus
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38738
llvm-svn: 316431
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.
Also allow kill in all shader stages.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D38544
llvm-svn: 316427
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D38543
llvm-svn: 316426
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SelectionDAG inserts a copy of ESP into a virtual register.
X86CallFrameOptimization assumed that the COPY, if present, is always
right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a
wrong assumption as the COPY can be located anywhere between the call-frame setup
instruction and its first use. If the COPY happened to be located in a different
location than what X86CallFrameOptimization assumed, visiting it while
processing the call chain would lead to a conservative bail-out.
The fix is quite straightfoward, scan ahead for the stack-pointer copy and make note
of it so it can be ignored while processing the call chain.
Fixes pr34903
Differential Revision: https://reviews.llvm.org/D38730
llvm-svn: 316416
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Infrastructure designed for padding code with nop instructions in key places such that preformance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.
This patch by itself in a NFC. Future patches will make use of this infrastructure to implement required policies for code padding.
Reviewers:
aaboud
zvi
craig.topper
gadi.haber
Differential revision: https://reviews.llvm.org/D34393
Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
llvm-svn: 316413
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The motivation of this change is to enable .mir testing for this pass.
Added one test case to cover the functionality, this same case will be improved by
a future patch.
Reviewers: igorb, guyblank, DavidKreitzer
Reviewed By: guyblank, DavidKreitzer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38729
llvm-svn: 316412
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds optimisation remarks for outlining which fire when a function
is successfully outlined.
To do this, OutlinedFunctions must now contain references to their Candidates.
Since the Candidates must still be sorted and worked on separately, this is
done by working on everything in terms of shared_ptrs to Candidates. This is
good; it means that we can easily move everything to outlining in terms of
the OutlinedFunctions rather than the individual Candidates. This is far more
intuitive than what's currently there!
(Remarks are output when a function is created for some group of Candidates.
In a later commit, all of the outlining logic should be rewritten so that we
loop over OutlinedFunctions rather than over Candidates.)
llvm-svn: 316396
|
| |
|
|
|
|
|
| |
This is in preparation for a verifier check that makes sure
copies are of the same size (when generic virtual registers are involved).
llvm-svn: 316388
|
| |
|
|
|
|
|
| |
This is in preparation for a verifier check that makes sure copies are
of the same size (when generic virtual registers are involved).
llvm-svn: 316387
|