| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
virtual registers involved have uses/defs chains connecting them to physical register. Fix up the tests that this change improves.
llvm-svn: 221336
|
| |
|
|
| |
llvm-svn: 221333
|
| |
|
|
|
|
|
|
|
| |
Commit 220932 caused crash when building clang-tblgen on aarch64 debian target,
so it's blocking all daily tests.
The std::call_once implementation in pthread has bug for aarch64 debian.
llvm-svn: 221331
|
| |
|
|
| |
llvm-svn: 221328
|
| |
|
|
|
|
|
|
|
| |
Exact shifts may not shift out any non-zero bits. Use computeKnownBits
to determine when this occurs and just return the left hand side.
This fixes PR21477.
llvm-svn: 221325
|
| |
|
|
|
|
|
|
|
|
| |
We currently try to push an even number of registers to preserve 8-byte
alignment during a function's prologue, but only when the stack alignment is
prcisely 8. Many of the reasons for doing it apply also when that alignment > 8
(the extra store is often free, and can save another stack adjustment, though
less frequently for 16-byte stack alignment).
llvm-svn: 221321
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were making an attempt to do this by adding an extra callee-saved GPR (so
that there was an even number in the list), but when that failed we went ahead
and pushed anyway.
This had a couple of potential issues:
+ The .cfi directives we emit misplaced dN because they were based on
PrologEpilogInserter's calculation.
+ Unaligned stores can be less efficient.
+ Unaligned stores can actually fault (likely only an issue in niche cases,
but possible).
This adds a final explicit stack adjustment if all other options fail, so that
the actual locations of the registers match up with where they should be.
llvm-svn: 221320
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Divides and remainder operations do not behave like other operations
when they are given poison: they turn into undefined behavior.
It's really hard to know if the operands going into a div are or are not
poison. Because of this, we should only choose to speculate if there
are constant operands which we can easily reason about.
This fixes PR21412.
llvm-svn: 221318
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r221171.
It performs this invalid transformation:
- %div.i = urem i64 -1, %add
- %sub.i = sub i64 -2, %div.i
+ %div.i = urem i64 1, %add
+ %sub.i1 = add i64 %div.i, -2
llvm-svn: 221317
|
| |
|
|
|
|
|
|
|
|
| |
Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask.
This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests).
Differential Revision: http://reviews.llvm.org/D6015
llvm-svn: 221313
|
| |
|
|
|
|
|
|
|
|
| |
change LoopSimplifyPass to be !isCFGOnly. The motivation for the earlier patch
(r221223) was that LoopSimplify is not preserved by instcombine though
setPreservesCFG indicates that it is. This change fixes the issue
by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less
invasive.
llvm-svn: 221311
|
| |
|
|
|
|
|
|
|
| |
While fixing up the register classes in the machine combiner in a previous
commit I missed one.
This fixes the last one and adds a test case.
llvm-svn: 221308
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r221299.
The tests
LLVM :: MC/Disassembler/Mips/mips32.txt
LLVM :: MC/Disassembler/Mips/mips32_le.txt
were failing.
llvm-svn: 221307
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
symbolication without needing the .dwo files
Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2,
binary increases by 25%.
A large binary inside Google, split-dwarf, -O0, and other internal flags
(GDB index, etc) increases by 1.8%, optimized build is 35%.
The size impact may be somewhat greater in .o files (I haven't measured
that much - since the linked executable -O0 numbers seemed low enough)
due to relocations. These relocations could be removed if we taught the
llvm-symbolizer to handle indexed addressing in the .o file (GDB can't
cope with this just yet, but GDB won't be reading this info anyway).
Also debug_ranges could be shared between .o and .dwo, though ideally
debug_ranges would get a schema that could used index(+offset)
addressing, and move to the .dwo file, then we'd be back to sharing
addresses in the address pool again.
But for now, these sizes seem small enough to go ahead with this.
Verified that no other DW_TAGs are produced into the .o file other than
subprograms and inlined_subroutines.
llvm-svn: 221306
|
| |
|
|
|
|
| |
with fission-gmlt data and produce skeleton<>full unit cross referencing.
llvm-svn: 221305
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were producing a relocation for
----------------
.section foo,bar
La:
Lb:
.long La-Lb
--------------
but not for
---------------------
.section foo,bar
zed:
La:
Lb:
.long La-Lb
----------------
This patch handles the case where both fragments are part of the first atom
in a section and there is no corresponding symbol to that atom.
This fixes pr21328.
llvm-svn: 221304
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
MipsInstrInfo.td. NFC.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5843
llvm-svn: 221300
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5763
llvm-svn: 221299
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and FeatureSSE4A for all the bdver* cpus.
This patch adds 'FeatureSlowSHLD' to 'bdver3'.
According to the official AMD optimization guide for amdfam15: "Using
alternative code in place of SHLD achieves lower overall latency and
requires fewer execution resources. The 32-bit and 64-bit forms of
ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
instructions, while SHLD is a VectorPath instruction."
This patch also explicitly sets feature AVX and SSE4A for all the bdver*
cpus. This part of the patch is a non-functional change and it is mainly
done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A).
llvm-svn: 221296
|
| |
|
|
|
|
|
|
|
|
|
| |
Registers are not all equal. Some are not allocatable (infinite cost),
some have to be preserved but can be used, and some others are just free
to use.
Ensure there is a cost hierarchy reflecting this fact, so that the
allocator will favor scratch registers over callee-saved registers.
llvm-svn: 221293
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch improves how the different costs (register, interference, spill
and coalescing) relates together. The assumption is now that:
- coalescing (or any other "side effect" of reg alloc) is negative, and
instead of being derived from a spill cost, they use the block
frequency info.
- spill costs are in the [MinSpillCost:+inf( range
- register or interference costs are in [0.0:MinSpillCost( or +inf
The current MinSpillCost is set to 10.0, which is a random value high
enough that the current constraint builders do not need to worry about
when settings costs. It would however be worth adding a normalization
step for register and interference costs as the last step in the
constraint builder chain to ensure they are not greater than SpillMinCost
(unless this has some sense for some architectures). This would work well
with the current builder pipeline, where all costs are tweaked relatively
to each others, but could grow above MinSpillCost if the pipeline is
deep enough.
The current heuristic is tuned to depend rather on the number of uses of
a live interval rather than a density of uses, as used by the greedy
allocator. This heuristic provides a few percent improvement on a number
of benchmarks (eembc, spec, ...) and will definitely need to change once
spill placement is implemented: the current spill placement is really
ineficient, so making the cost proportionnal to the number of use is a
clear win.
llvm-svn: 221292
|
| |
|
|
| |
llvm-svn: 221291
|
| |
|
|
|
|
| |
This kind of pattern is emitted by the loop vectorizer.
llvm-svn: 221289
|
| |
|
|
|
|
| |
by Kumar Sukhani
llvm-svn: 221288
|
| |
|
|
|
|
| |
No functionality change intended, it's just a little more concise.
llvm-svn: 221281
|
| |
|
|
|
|
| |
No functionality change intended, it's just a little more concise.
llvm-svn: 221280
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Appropriately set/clear the FeatureBit for Mips16 when these assembler directives are used and also emit ".set nomips16" (previously, only ".set mips16" was being emitted).
These improvements allow for better testing of the .cpload/.cprestore assembler directives (which are not supposed to work when Mips16 is enabled).
Test Plan: The test is bare-bones because there are no MC tests for Mips16 instructions (there's only one, which checks that the Mips16 ELF header flag gets set), and that suggests to me that it has not been implemented yet in the IAS.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5462
llvm-svn: 221277
|
| |
|
|
| |
llvm-svn: 221274
|
| |
|
|
| |
llvm-svn: 221273
|
| |
|
|
| |
llvm-svn: 221268
|
| |
|
|
| |
llvm-svn: 221258
|
| |
|
|
|
|
| |
signed/unsigned mismatch.
llvm-svn: 221252
|
| |
|
|
|
|
|
|
|
| |
SearchPath() to appease clang Driver's tests.
It seems SearchPath() doesn't show actual extension on the filesystem.
FIXME: Shall we use FindFirstFile() here?
llvm-svn: 221246
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is experimental, just barely enough to get things to not
immediately combust.
A note for those who are curious:
Only lld can successfully link the object files, other linkers truncate
the section names making the debug sections illegible to debuggers.
Even with this in mind, we believe we are having trouble with SECREL
relocations.
llvm-svn: 221245
|
| |
|
|
|
|
|
|
|
|
|
|
| |
1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46):
error C2146: syntax error : missing ';' before identifier 'nLength'
1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46):
error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
...
including <windows.h> is actually required.
llvm-svn: 221244
|
| |
|
|
| |
llvm-svn: 221228
|
| |
|
|
|
|
|
| |
This reverts commit r220811 and r220839. It made an incorrect change to
musttail handling.
llvm-svn: 221226
|
| |
|
|
|
|
|
|
|
| |
preserve LoopSimplify because instcombine may replace branch predicates
with undef which loop simplify then replaces with always exit. Replace
setPreservesCFG with the more constrained preservation of DomTree and
LoopInfo.
llvm-svn: 221223
|
| |
|
|
| |
llvm-svn: 221221
|
| |
|
|
| |
llvm-svn: 221220
|
| |
|
|
|
|
|
|
| |
a public version MachOObjectFile::getScatteredRelocationType().
This should fix the build bot for the unused function error.
llvm-svn: 221216
|
| |
|
|
|
|
|
| |
the tombstone or empty keys of a DenseMap<int64_t, T>. This patch
fixes the issue (and adds a tests case).
llvm-svn: 221214
|
| |
|
|
|
|
| |
symbolizer.
llvm-svn: 221211
|
| |
|
|
| |
llvm-svn: 221210
|
| |
|
|
|
|
|
|
| |
predicate instead of bitwise operations.
This is not a functional change.
llvm-svn: 221209
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D6062
llvm-svn: 221204
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
LoadCombine can be smarter about aborting when a writing instruction is
encountered, instead of aborting upon encountering any writing instruction, use
an AliasSetTracker, and only abort when encountering some write that might
alias with the loads that could potentially be combined.
This was originally motivated by comments made (and a test case provided) by
David Majnemer in response to PR21448. It turned out that LoadCombine was not
responsible for that PR, but LoadCombine should also be improved so that
unrelated stores (and @llvm.assume) don't interrupt load combining.
llvm-svn: 221203
|
| |
|
|
|
|
|
|
| |
This generalizes the range handling for ranges in both the skeleton and
full unit, laying the foundation for the addition of more ranges (rather
than just the CU's special case) in the skeleton CU with fission+gmlt.
llvm-svn: 221202
|
| |
|
|
| |
llvm-svn: 221199
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
FoldOpIntoPhi could create an infinite loop if the PHI could potentially
reach a BB it was considering inserting instructions into. The
instructions it would insert would eventually lead to other combines
firing which would, again, lead to FoldOpIntoPhi firing.
The solution is to handicap FoldOpIntoPhi so that it doesn't attempt to
insert instructions that the PHI might reach.
This fixes PR21377.
llvm-svn: 221187
|