| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
This commit updates the documentation comments in PseudoSourceValue.cpp and
PseudoSourceValue.h based on the LLVM's documentation style. It also fixes
several instances of variable names that started with a lowercase letter.
This change is done in preparation for the changes to the pseudo source value
object management and to the PseudoSourceValue's class hierarchy.
llvm-svn: 244686
|
|
|
|
|
|
|
|
|
| |
This commit reformats the files lib/CodeGen/PseudoSourceValue.cpp and
include/llvm/CodeGen/PseudoSourceValue.h using clang-format. This change is
done in preparation for the changes to the pseudo source value object
management and to the PseudoSourceValue's class hierarchy.
llvm-svn: 244685
|
|
|
|
|
|
|
|
|
| |
For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor
fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason
is that many index computations are carried out in 64-bits but never actually exceed the
capacity of a 32-bit word.
llvm-svn: 244684
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mangled "linkage" names can be huge, and if the debugger (or other
tools) have no use for them, the size savings can be very impressive
(on the order of 40%).
Add one test for controlling behavior, and modify a number of tests to
either stop using linkage names, or make llc emit them (so these tests
will still run when the default triple is for PS4).
Differential Revision: http://reviews.llvm.org/D11374
llvm-svn: 244678
|
|
|
|
|
|
|
|
|
|
| |
`InstCombiner::OptimizeOverflowCheck` was asserting an
invariant (operands to binary operations are ordered by decreasing
complexity) that wasn't really an invariant. Fix this by instead having
`InstCombiner::OptimizeOverflowCheck` establish the invariant if it does
not hold.
llvm-svn: 244676
|
|
|
|
| |
llvm-svn: 244672
|
|
|
|
| |
llvm-svn: 244668
|
|
|
|
|
|
|
|
| |
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11959
llvm-svn: 244667
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the FP comparisons (ueq, one, ult, ule, ugt, uge) are currently broken, I'll fix them in a follow-up.
Reviewers: sunfish
Subscribers: llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11924
llvm-svn: 244665
|
|
|
|
|
|
| |
single/double multiplies
llvm-svn: 244657
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds check for dead blocks and skip them for processSwitchInst(). This will help reduce compilation time.
Reviewers: reames, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11953
llvm-svn: 244656
|
|
|
|
|
|
|
|
|
|
| |
Summary: Implementation is the same as in AArch64.
Subscribers: aemerson, jfb, llvm-commits, sunfish
Differential Revision: http://reviews.llvm.org/D11956
llvm-svn: 244655
|
|
|
|
|
|
| |
Also, add a test for optsize because this was not part of any existing regression test.
llvm-svn: 244651
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For example:
s6 = s0*s5;
s2 = s6*s6 + s6;
...
s4 = s6*s3;
We notice that it is possible for s2 is folded to fma (s0, s5, fmul (s6 s6)).
This only happens when Aggressive is true, otherwise hasOneUse() check
already prevents from folding the multiplication with more uses.
Test Plan: test/CodeGen/NVPTX/fma-assoc.ll
Patch by Xuetian Weng
Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm
Subscribers: arsenm, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D11855
llvm-svn: 244649
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block being deleted, it contains no instruction or terminator, and it should no be traversed anymore. However, since the iterator is advanced before processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks into a list, and delete them after all processSwitchInst() has been done. It still possible to visit dead default blocks and waste time process them. But it is a compile time issue, and I plan to have another patch to add support to skip dead blocks.
Reviewers: kariddi, resistor, hans, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11852
llvm-svn: 244642
|
|
|
|
| |
llvm-svn: 244641
|
|
|
|
| |
llvm-svn: 244631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For LTO we need to enable this pass in the LTO pipeline,
as it is skipped during the "-flto -c" compile step (when PrepareForLTO is
set).
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11919
llvm-svn: 244622
|
|
|
|
| |
llvm-svn: 244618
|
|
|
|
| |
llvm-svn: 244617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Other objects can never reference the MergedGlobals symbol so external linkage
is never needed. Using private instead of internal linkage means the object is
more similar to what it looks like when global merging is not enabled, with
the only difference being that the merged variables are addressed indirectly
relative to the start of the section they are in.
Also add aliases for merged variables with internal linkage, as this also makes
the object be more like what it is when they are not merged.
Differential Revision: http://reviews.llvm.org/D11942
llvm-svn: 244615
|
|
|
|
| |
llvm-svn: 244610
|
|
|
|
| |
llvm-svn: 244609
|
|
|
|
| |
llvm-svn: 244607
|
|
|
|
| |
llvm-svn: 244604
|
|
|
|
|
|
|
|
|
|
|
| |
First step in preventing immediates that occur more than once within a single
basic block from being pulled into their users, in order to prevent unnecessary
large instruction encoding .Currently enabled only when optimizing for size.
Patch by: zia.ansari@intel.com
Differential Revision: http://reviews.llvm.org/D11363
llvm-svn: 244601
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsic.
Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change:
- Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant.
- f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall!
llvm-svn: 244595
|
|
|
|
|
|
| |
NFCI. This just removes custom ISDNodes that are no longer needed.
llvm-svn: 244594
|
|
|
|
|
|
|
|
| |
Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG.
NFCI.
llvm-svn: 244593
|
|
|
|
|
|
|
|
| |
Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics.
NFCI.
llvm-svn: 244592
|
|
|
|
|
|
| |
NFCI. This replaces another custom ISDNode with a generic equivalent.
llvm-svn: 244591
|
|
|
|
|
|
| |
NFCI. This removes a custom ISDNode.
llvm-svn: 244590
|
|
|
|
|
|
|
|
| |
SAL and SHL instructions perform the same operation
Differential Revision: http://reviews.llvm.org/D11882
llvm-svn: 244588
|
|
|
|
|
|
|
|
|
| |
REPE, REPZ, REPNZ, REPNE should have mnemonics for Intel syntax as well.
Currently using these instructions causes compilation errors for Intel syntax.
Differential Revision: http://reviews.llvm.org/D11794
llvm-svn: 244584
|
|
|
|
|
|
|
|
|
| |
The "imul reg, imm" alias is not defined for intel syntax.
In intel syntax there is no w/l/q suffix for the imul instruction.
Differential Revision: http://reviews.llvm.org/D11887
llvm-svn: 244582
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intention of these is to be a corollary to ISD::FMINNUM/FMAXNUM,
differing only on how NaNs are treated. FMINNUM returns the non-NaN
input (when given one NaN and one non-NaN), FMINNAN returns the NaN
input instead.
This patch includes support for scalarizing, widening and splitting
vectors, but not expansion or softening. The reason is that these
should never be needed - FMINNAN nodes are only going to be created
in one place (SDAGBuilder::visitSelect) and there we'll check if the
node is legal or custom. I could preemptively add expand and soften
code, but I'm fairly opposed to adding code I can't test. It's bad
enough I can't create tests with this patch, but at least this code
will be exercised by the ARM and AArch64 backends fairly shortly.
llvm-svn: 244581
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
llvm-svn: 244580
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch remaps the assembly idiom 'move' to 'or' instead of 'daddu' or
'addu'. The use of addu/daddu instead of or as move was highlighted as a
performance issue during the analysis of a recent 64bit design. Originally
move was encoded as 'or' by binutils but was changed for the r10k cpu family
due to their pipeline which had 2 arithmetic units and a single logical unit,
and so could issue multiple (d)addu based moves at the same time but only 1
logical move.
This patch preserves the disassembly behaviour so that disassembling a old style
(d)addu move still appears as move, but assembling move always gives an or
Patch by Simon Dardis.
Reviewers: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11796
llvm-svn: 244579
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp"
following a call by one or two pops, respectively. We don't try to do it in
general, but only when the stack adjustment immediately follows a call - which
is the most common case.
That allows taking a short-cut when trying to find a free register to pop into,
instead of a full-blown liveness check. If the adjustment immediately follows a
call, then every register the call clobbers but doesn't define should be dead at
that point, and can be used.
Differential Revision: http://reviews.llvm.org/D11749
llvm-svn: 244578
|
|
|
|
|
|
|
|
|
|
| |
The condition for clearing the folding candidate list was clamped together
with the "uninteresting instruction" condition. This is too conservative,
e.g. we don't need to clear the list when encountering an IMPLICIT_DEF.
Differential Revision: http://reviews.llvm.org/D11591
llvm-svn: 244577
|
|
|
|
|
|
|
|
|
|
| |
lookups
Should hopefully fix the remainder of PR24288.
Differential Revision: http://reviews.llvm.org/D11900
llvm-svn: 244575
|
|
|
|
|
|
| |
relocationValueRef uses the addend, so it has to be set before the call.
llvm-svn: 244574
|
|
|
|
| |
llvm-svn: 244571
|
|
|
|
|
|
|
|
|
|
| |
Summary: Caused by D11914, pointed out by blaikie.
Subscribers: llvm-commits, jfb, dblaikie
Differential Revision: http://reviews.llvm.org/D11929
llvm-svn: 244570
|
|
|
|
|
|
|
| |
Make sure that an EH pad's predecessors are using their unwind edge to
transfer control to the EH pad.
llvm-svn: 244563
|
|
|
|
|
|
|
|
|
|
| |
Summary: I somehow forgot to add these when I added the basic floating-point opcodes. Also remove ceil/floor/trunc/nearestint for now, and add them only when properly tested.
Subscribers: llvm-commits, sunfish, jfb
Differential Revision: http://reviews.llvm.org/D11927
llvm-svn: 244562
|
|
|
|
| |
llvm-svn: 244559
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds somewhat basic preparation functionality including:
- Formation of funclets via coloring basic blocks.
- Cloning of polychromatic blocks to ensure that funclets have unique
program counters.
- Demotion of values used between different funclets.
- Some amount of cleanup once we have removed predecessors from basic
blocks.
- Verification that we are left with a CFG that makes some amount of
sense.
N.B. Arguments and numbering still need to be done.
Differential Revision: http://reviews.llvm.org/D11750
llvm-svn: 244558
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MI.clear() within MCD::OPC_Decode case and inside of translateInstruction() for the X86 target. Remove now unnecessary MI.clear() from ARMDisassembler.
Summary: Explicitly clear the MI operand list when getInstruction() is called.
Reviewers: hfinkel, t.p.northover, hvarga, kparzysz, jyknight, qcolombet, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11665
llvm-svn: 244557
|
|
|
|
|
|
| |
This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.
llvm-svn: 244555
|