| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 291195
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28303
llvm-svn: 291194
|
| |
|
|
| |
llvm-svn: 291193
|
| |
|
|
|
|
| |
Remove unnecessary braces, remove single-use variables and keep LUT names to a similar naming convention.
llvm-svn: 291187
|
| |
|
|
| |
llvm-svn: 291182
|
Summary:
Extend AArch64 foldMemoryOperandImpl() to handle folding spills of
subreg COPYs with read-undef defs like:
%vreg0:sub_32<def,read-undef> = COPY %WZR; GPR64:%vreg0
by widening the spilled physical source reg and generating:
STRXui %XZR <fi#0>
as well as folding fills of similar COPYs like:
%vreg0:sub_32<def,read-undef> = COPY %vreg1; GPR64:%vreg0, GPR32:%vreg1
by generating:
%vreg0:sub_32<def,read-undef> = LDRWui <fi#0>
Reviewers: MatzeB, qcolombet
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D27425
llvm-svn: 291180
|
| |
|
|
| |
llvm-svn: 291178
|
Summary:
Using the linker-supplied list of "preserved" symbols, we can compute
the list of "dead" symbols, i.e. the ones that are not transitively
reachable from a "preserved" symbol on the reference graph.
Right now we are using this information to mark these functions as
non-eligible for import.
The impact is twofold:
- Reduction of compile time: we don't import these functions anywhere
or import the functions these symbols are calling.
- The limited number of imports/exports leads to better internalization.
Patch originally by Mehdi Amini.
Reviewers: mehdi_amini, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23488
llvm-svn: 291177
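The dead-symbol computation described in r291177 amounts to a reachability walk over the reference graph. The following is a minimal sketch of that idea only, not the code from the patch: the `RefGraph` type and the `computeDeadSymbols` helper are hypothetical names, and symbols are modelled as plain strings rather than GUIDs.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical reference graph: symbol name -> symbols it references.
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Returns every symbol that appears as a key of the graph and is not
// reachable (transitively) from any preserved symbol.
std::set<std::string>
computeDeadSymbols(const RefGraph &Graph,
                   const std::vector<std::string> &Preserved) {
  std::set<std::string> Live;
  std::queue<std::string> Worklist;
  for (const std::string &S : Preserved)
    if (Live.insert(S).second)
      Worklist.push(S);

  // Breadth-first walk: everything referenced by a live symbol is live.
  while (!Worklist.empty()) {
    std::string Cur = Worklist.front();
    Worklist.pop();
    auto It = Graph.find(Cur);
    if (It == Graph.end())
      continue;
    for (const std::string &Ref : It->second)
      if (Live.insert(Ref).second)
        Worklist.push(Ref);
  }

  // Whatever was never reached is dead.
  std::set<std::string> Dead;
  for (const auto &Entry : Graph)
    if (!Live.count(Entry.first))
      Dead.insert(Entry.first);
  return Dead;
}
```

In this sketch a symbol referenced only by dead symbols is itself reported as dead, which matches the "don't import the functions these symbols are calling" point in the summary.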
|
| |
|
|
|
|
|
| |
do not use .cfi_sections. This requires checking if any non-declaration
function in the module needs an unwind table.
llvm-svn: 291172
|
Promotion is always legal when a store within the loop is guaranteed to execute.
However, this is not a necessary condition - for promotion to be memory model
semantics-preserving, it is enough to have a store that dominates every exit
block. This is because if the store dominates every exit block, the fact the
exit block was executed implies the original store was executed as well.
Differential Revision: https://reviews.llvm.org/D28147
llvm-svn: 291171
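The condition in r291171 ("a store that dominates every exit block") can be illustrated with a toy dominance check. This is a sketch under stated assumptions, not LLVM's implementation: the `CFG` type, the block numbering (block 0 is the entry) and both helper functions are hypothetical, and dominators are computed with the simple iterative data-flow method rather than LLVM's DominatorTree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: Succs[B] lists the successors of block B; block 0 is entry.
using CFG = std::vector<std::vector<std::size_t>>;

// Dom[B][D] is true when block D dominates block B.
std::vector<std::vector<bool>> computeDominators(const CFG &Succs) {
  assert(!Succs.empty() && "CFG needs an entry block");
  const std::size_t N = Succs.size();
  std::vector<std::vector<std::size_t>> Preds(N);
  for (std::size_t B = 0; B < N; ++B)
    for (std::size_t S : Succs[B])
      Preds[S].push_back(B);

  // Start optimistic: every block dominates every non-entry block.
  std::vector<std::vector<bool>> Dom(N, std::vector<bool>(N, true));
  Dom[0].assign(N, false);
  Dom[0][0] = true;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t B = 1; B < N; ++B) {
      // Intersect the dominator sets of all predecessors, then add B itself.
      std::vector<bool> New(N, !Preds[B].empty());
      for (std::size_t P : Preds[B])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      New[B] = true;
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// True when the block holding the store dominates every exit block.
bool storeDominatesAllExits(const CFG &Succs, std::size_t StoreBlock,
                            const std::vector<std::size_t> &ExitBlocks) {
  const auto Dom = computeDominators(Succs);
  for (std::size_t Exit : ExitBlocks)
    if (!Dom[Exit][StoreBlock])
      return false;
  return true;
}
```

For a loop whose header can branch straight to an exit, a store in the loop body does not dominate that exit, so this check rejects it; a store in the header passes.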
|
Add an assert that checks whether liveins are up to date before they are
used.
- Do not print liveins into .mir files anymore in situations where they
are out of date anyway.
- The assert in the RegisterScavenger is superseded by the new one in
livein_begin().
- Skip parts of the liveness updating logic in IfConversion.cpp when
liveness isn't tracked anymore (just enough to avoid hitting the new
assert()).
Differential Revision: https://reviews.llvm.org/D27562
llvm-svn: 291169
|
Summary: This reverts commit r291144. It breaks build bots.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/3270, http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/2058
lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:1638:12: error: could not convert ‘(const unsigned int*)(& Variants)’ from ‘const unsigned int*’ to ‘llvm::ArrayRef<unsigned int>’
return Variants;
Reviewers: eugenis, tstellarAMD
Patch by Alex Shlyapnikov.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D28372
llvm-svn: 291168
|
| |
|
|
| |
llvm-svn: 291165
|
| |
|
|
| |
llvm-svn: 291163
|
| |
|
|
|
|
| |
NFCI.
llvm-svn: 291162
|
| |
|
|
|
|
| |
Removes need for yet another LUT.
llvm-svn: 291158
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28363
llvm-svn: 291157
|
| |
|
|
|
|
| |
Remove SSE2 256-bit entries - AVX targets will have used the SSE42 costs instead.
llvm-svn: 291152
|
| |
|
|
|
|
|
|
| |
extract/insertion in AVX1 v4i64 MUL
Matches other MUL/ADD/SUB 256-bit case on AVX1
llvm-svn: 291149
|
| |
|
|
| |
llvm-svn: 291147
|
| |
|
|
|
|
| |
shuffle cost LUTs. NFCI.
llvm-svn: 291146
|
| |
|
|
|
|
| |
Arrays are supposed to be static const
llvm-svn: 291144
|
Summary:
A preheader instruction's operands will always be invariant w.r.t. the loop for which it is the
preheader.
Memory aliases are handled in canSinkOrHoistInst.
Reviewers: danielcdh, davidxl
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28270
llvm-svn: 291132
|
| |
|
|
| |
llvm-svn: 291126
|
| |
|
|
|
|
|
|
| |
Currently only for broadcasts with input and output of the same width.
Differential Revision: https://reviews.llvm.org/D27811
llvm-svn: 291122
|
Summary:
For instructions such as PSLLW/PSLLD/PSLLQ a variable shift amount may be passed in an XMM register.
The lower 64 bits of the register are evaluated to determine the shift amount.
This patch improves the construction of the vector containing the shift amount.
Reviewers: craig.topper, delena, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28353
llvm-svn: 291120
|
| |
|
|
|
|
| |
Fixes a warning about "||" and "&&" due to r291108.
llvm-svn: 291119
|
| |
|
|
|
|
|
| |
The instructions fctidu[.], fctiwu[.], ftdiv and ftsqrt are not implemented. Implement
them and add corresponding test cases in this patch.
llvm-svn: 291116
|
Should fix some more bot failures from r291108.
This should have been a DenseSet, since GUID is not a pointer type.
It caused some bots to fail, but for some reason I wasn't getting a
build failure.
llvm-svn: 291115
|
| |
|
|
| |
llvm-svn: 291109
|
Summary:
This adds a new summary flag NotEligibleToImport that subsumes
several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal
and IsNotViableToInline). It also subsumes the checking of references
on the summary that was being done during the thin link by
eligibleForImport() for each candidate. It is much more efficient to
do that checking once during the per-module summary build and record
it in the summary.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28169
llvm-svn: 291108
|
stride seems to be 'complex' and needs some extra cost for address computation handling.
This assumption is target dependent and may not hold for all targets.
The decision whether the given stride is complex is now left to the target by passing stride information via SCEV to getAddressComputationCost instead of 'IsComplex'.
Specifically, on X86 targets we don't see any significant address computation cost for strided access in general.
Differential Revision: https://reviews.llvm.org/D27518
llvm-svn: 291106
|
To make this work, pointers from the MachineBasicBlock to the LLVM-IR-level
basic blocks need to be initialized, as the AsmPrinter uses this link to be
able to print out labels for the basic blocks that are address-taken.
Most of the changes in this commit are about adapting existing tests to include
the basic block name that is now printed out in the MIR format, now that the
name becomes available as the link to the LLVM-IR basic block is initialized.
The relevant test changes for the functionality added in this patch are the
added "(address-taken)" strings in
test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll.
Differential Revision: https://reviews.llvm.org/D28123
llvm-svn: 291105
|
This commit does this using a trivial chain of conditional branches. In the
future, we probably want to reuse the optimized switch lowering used in
SelectionDAG.
Differential Revision: https://reviews.llvm.org/D28176
llvm-svn: 291099
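As a rough illustration of what "a trivial chain of conditional branches" means in r291099, the sketch below emulates such a chain in plain C++. It assumes the commit concerns switch lowering, as the reference to SelectionDAG's switch lowering suggests; `SwitchCase` and `selectTargetBlock` are hypothetical names, and block ids are plain integers rather than machine basic blocks.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model: each case pairs a constant with a target block id.
using SwitchCase = std::pair<int, int>;

// Emulates the lowered chain: compare the value against each case in order
// and "branch" to the first match; fall through to the default block.
int selectTargetBlock(int Value, const std::vector<SwitchCase> &Cases,
                      int DefaultBlock) {
  for (const SwitchCase &C : Cases)
    if (Value == C.first) // one compare plus one conditional branch per case
      return C.second;
  return DefaultBlock;
}
```

The cost of such a chain grows linearly with the number of cases, which is presumably why the message points at reusing the optimized lowering later.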
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28175
llvm-svn: 291097
|
| |
|
|
| |
llvm-svn: 291095
|
| |
|
|
|
|
|
| |
DAG patterns optimization: truncate + unsigned saturation supported by VPMOVUS* instructions in AVX-512.
Differential revision: https://reviews.llvm.org/D28216
llvm-svn: 291092
|
| |
|
|
|
|
|
|
| |
"skylake" since there are no feature differences.
Model numbers found here http://www.sandpile.org/x86/cpuid.htm
llvm-svn: 291086
|
| |
|
|
|
|
|
| |
This is needed to support inclusion in inline assembly via the
`.include` directive.
llvm-svn: 291085
|
of "skylake-avx512". Add the proper 0x55 model for "skylake-avx512".
Summary:
Intel's i5-6300U CPU reports a model id of 78 (0x4e).
The host detection assumes that to be Skylake Xeon (with AVX512 support)
instead of a normal Skylake machine.
Patch by: Valentin Churavy
Reviewers: nalimilan, craig.topper
Subscribers: hfinkel, tkelman, craig.topper, nalimilan, llvm-commits
Differential Revision: https://reviews.llvm.org/D28221
llvm-svn: 291084
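The model-number mapping that r291084 describes could look roughly like the sketch below. It is not LLVM's host-detection code: `skylakeCPUName` is a hypothetical helper, the "generic" fallback is an assumption, and only the two model numbers named in the message (0x4e and 0x55) are handled.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mapping an Intel family/model pair to a CPU name.
// Assumes family 6, which is the family Skylake parts report.
std::string skylakeCPUName(unsigned Family, unsigned Model) {
  if (Family != 6)
    return "generic";
  switch (Model) {
  case 0x4e: // 78: client Skylake such as the i5-6300U, no AVX-512
    return "skylake";
  case 0x55: // Skylake with AVX-512 support
    return "skylake-avx512";
  default:
    return "generic";
  }
}
```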
|
| |
|
|
| |
llvm-svn: 291078
|
scaffolding for lowertypetests.
Set up basic YAML I/O support for module summaries, plumb the summary into
the pass and add a few command line flags to test YAML I/O support. Bitcode
support to come separately, as will the code in LowerTypeTests that actually
uses the summary. Also add a couple of tests that pass by virtue of the pass
doing nothing with the summary (which happens to be the correct thing to do
for those tests).
Differential Revision: https://reviews.llvm.org/D28041
llvm-svn: 291069
|
| |
|
|
|
|
|
| |
This caused buildbot failures due to returning ArrayRefs referencing local
(temporary) objects.
llvm-svn: 291067
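The failure mode named in r291067 (a non-owning array view that refers to a local object) can be sketched without LLVM headers. `IntView` below is a hypothetical stand-in for llvm::ArrayRef, and `getVariants` is an illustrative function, not the AMDGPU code; the buggy variant is kept as a comment because actually running it would be undefined behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for llvm::ArrayRef: a non-owning pointer + length.
struct IntView {
  const unsigned *Data;
  std::size_t Size;
};

// BUGGY pattern (comment only): the array lives on the stack, so the
// returned view dangles as soon as the function returns.
//
//   IntView getVariantsBuggy() {
//     const unsigned Variants[] = {1, 2, 3};
//     return {Variants, 3};
//   }

// Safe pattern: a function-local static array outlives every call,
// so the view stays valid for the caller.
IntView getVariants() {
  static const unsigned Variants[] = {1, 2, 3};
  return {Variants, sizeof(Variants) / sizeof(Variants[0])};
}
```

Declaring the table `static const` is the usual way to make returning a view of it safe.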
|
GVN
performing partial redundancy elimination (PRE). Not doing so can cause jumpy line
tables and confusing (though correct) source attributions.
Differential Revision: https://reviews.llvm.org/D27857
llvm-svn: 291037
|
Summary:
This is a relatively simple scheme: we use the index emitted in the
bitcode to avoid loading all the global metadata. Instead we load
the index, which records the position of each node in the bitcode, so
that we can load each of them individually. Materializing the global
metadata block in this mode only triggers loading the named metadata
and the nodes referenced from there (transitively). When materializing
a function, metadata from the global block are loaded lazily as they
are referenced.
Two main current limitations are:
1) Global values other than functions are not materialized on demand,
so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records
(and their transitive dependencies).
2) When we load a single metadata node, we don't recurse on the operands;
instead we use a placeholder or a temporary metadata node. Unfortunately
temporary nodes are very expensive. This is why we don't have it
always enabled and only enable it for importing.
These two limitations can be lifted in a subsequent improvement if
needed.
With this change, the total link time of opt with ThinLTO and Debug
Info enabled is going down from 282s to 224s (~20%).
Reviewers: pcc, tejohnson, dexonsmith
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28113
llvm-svn: 291027
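The index-based scheme in r291027 can be pictured with a small offset table. The sketch below is only a model of the idea: `LazyMetadataLoader` is a hypothetical class, records are plain strings, and a position in a vector stands in for a bit offset in the bitcode stream. It does not model placeholders or temporary nodes for operands.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical loader: Index[ID] gives the position of node ID in Stream.
class LazyMetadataLoader {
public:
  LazyMetadataLoader(std::vector<std::string> StreamIn,
                     std::vector<std::size_t> IndexIn)
      : Stream(std::move(StreamIn)), Index(std::move(IndexIn)) {}

  // Loads node ID on first use by jumping to its recorded position;
  // later requests are served from the cache.
  const std::string &get(std::size_t ID) {
    assert(ID < Index.size() && "unknown metadata id");
    auto It = Loaded.find(ID);
    if (It == Loaded.end()) {
      assert(Index[ID] < Stream.size() && "index points past the stream");
      It = Loaded.emplace(ID, Stream[Index[ID]]).first;
    }
    return It->second;
  }

  // Number of nodes materialized so far.
  std::size_t numLoaded() const { return Loaded.size(); }

private:
  std::vector<std::string> Stream;           // stands in for the bitcode
  std::vector<std::size_t> Index;            // node id -> position in Stream
  std::map<std::size_t, std::string> Loaded; // nodes materialized so far
};
```

Only the nodes that are actually requested are ever read, which is the effect the summary credits for the link-time reduction.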
|
| |
|
|
| |
llvm-svn: 291026
|
| |
|
|
| |
llvm-svn: 291025
|
| |
|
|
|
|
| |
Also cos(fabs(x)) -> cos(x)
llvm-svn: 291022
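The cos(fabs(x)) -> cos(x) fold in r291022 relies on cosine being an even function. A quick numerical sanity check of that identity, using a hypothetical helper and a small tolerance so that it does not depend on how a particular libm rounds, might look like this:

```cpp
#include <cassert>
#include <cmath>

// Returns true when cos(fabs(X)) and cos(X) agree within a small tolerance.
// The tolerance is an assumption; the mathematical identity is exact.
bool cosIgnoresSign(double X) {
  return std::fabs(std::cos(std::fabs(X)) - std::cos(X)) <= 1e-12;
}
```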
|
IntrusiveRefCntPtr""
If this is a problem for anyone (shared_ptr is two pointers in size,
whereas IntrusiveRefCntPtr is 1 - and the ref count control block that
make_shared adds is probably larger than the one int in RefCountedBase)
I'd prefer to address this by adding a lower-overhead version of
shared_ptr (possibly refactoring IntrusiveRefCntPtr into such a thing)
to avoid the intrusiveness - this allows memory ownership to remain
orthogonal to types and at least to me, seems to make code easier to
understand (since no implicit ownership acquisition can happen).
This recommits r291006, reverted in r291007.
llvm-svn: 291016
|
Summary:
When promoting fp-to-uint16 to fp-to-sint32, the result is actually zero
extended. For example, given double 65534.0, without legalization:
fp-to-uint16: 65534.0 -> 0xfffe
With the legalization:
fp-to-sint32: 65534.0 -> 0x0000fffe
Without this patch, legalization wrongly emits a signed extend assertion,
which is consumed by a later icmp instruction and causes a miscompile.
Note that the floating point value must be in [0, 65536), otherwise the
behavior is undefined.
This patch reverts the behavior of r279223 and adds more tests and
documentation.
In PR29041's context, James Molloy mentioned that:
We don't need to mask because conversion from float->uint8_t is
undefined if the integer part of the float value is not representable in
uint8_t. Therefore we can assume this doesn't happen!
which is totally true and good, because fptoui is clearly documented to
have undefined behavior when overflow/underflow happens. We should take
advantage of this behavior so that we can save unnecessary mask
instructions.
Reviewers: jmolloy, nadav, echristo, kbarton
Subscribers: mehdi_amini, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28284
llvm-svn: 291015
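The 65534.0 example from r291015 can be replayed in plain C++ to show why the promoted result must be treated as zero extended. This is a sketch with hypothetical helper names, not the legalizer code; it assumes the input is representable in uint16_t, since the original conversion is undefined otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Converts through the wider signed type, as the promoted fp-to-sint32 does.
// Assumes Value is representable in uint16_t.
std::int32_t promotedFpToUint16(double Value) {
  assert(Value >= 0.0 && Value < 65536.0 && "out of range for uint16_t");
  return static_cast<std::int32_t>(Value);
}

// Correct assumption: the upper 16 bits of the wide result are zero.
bool isZeroExtended16(std::int32_t Wide) {
  return (static_cast<std::uint32_t>(Wide) >> 16) == 0;
}

// Wrong assumption: the wide result equals the sign extension of its
// low 16 bits.
bool isSignExtended16(std::int32_t Wide) {
  const std::int32_t Low = Wide & 0xffff;
  const std::int32_t SignExtended = (Low ^ 0x8000) - 0x8000;
  return Wide == SignExtended;
}
```

For 65534.0 the wide result is 0x0000fffe: the zero-extension check holds, while the sign-extension check fails because sign extending 0xfffe gives -2.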
|