| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
conversion
There are a couple tricky things with this patch.
I had to add an override of isVectorLoadExtDesirable to stop DAG combine from combining sign_extend with loads after legalization since we legalize sextload using a load+sign_extend. Overriding this hook actually prevents a lot sextloads from being created in the first place.
I also had to add isel patterns because DAG combine blindly combines sign_extend+truncate to a smaller sign_extend which defeats what legalization was trying to do.
Differential Revision: https://reviews.llvm.org/D42407
llvm-svn: 323301
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, there are 2 ways to verify a DomTree:
* `DT.verify()` -- runs full tree verification and checks all the properties and gives a reason why the tree is incorrect. This is run by when EXPENSIVE_CHECKS are enabled or when `-verify-dom-info` flag is set.
* `DT.verifyDominatorTree()` -- constructs a fresh tree and compares it against the old one. This does not check any other tree properties (DFS number, levels), nor ensures that the construction algorithm is correct. Used by some passes inside assertions.
This patch introduces DomTree verification levels, that try to close the gape between the two ways of checking trees by introducing 3 verification levels:
- Full -- checks all properties, but can be slow (O(N^3)). Used when manually requested (e.g. `assert(DT.verify())`) or when `-verify-dom-info` is set.
- Basic -- checks all properties except the sibling property, and compares the current tree with a freshly constructed one instead. This should catch almost all errors, but does not guarantee that the construction algorithm is correct. Used when EXPENSIVE checks are enabled.
- Fast -- checks only basic properties (reachablility, dfs numbers, levels, roots), and compares with a fresh tree. This is meant to replace the legacy `DT.verifyDominatorTree()` and in my tests doesn't cause any noticeable performance impact even in the most pessimistic examples.
When used to verify dom tree wrapper pass analysis on sqlite3, the 3 new levels make `opt -O3` take the following amount of time on my machine:
- no verification: 8.3s
- `DT.verify(VerificationLevel::Fast)`: 10.1s
- `DT.verify(VerificationLevel::Basic)`: 44.8s
- `DT.verify(VerificationLevel::Full)`: 1m 46.2s
(and the previous `DT.verifyDominatorTree()` is within the noise of the Fast level)
This patch makes `DT.verifyDominatorTree()` pick between the 3 verification levels depending on EXPENSIVE_CHECKS and `-verify-dom-info`.
Reviewers: dberlin, brzycki, davide, grosser, dmgreen
Reviewed By: dberlin, brzycki
Subscribers: MatzeB, llvm-commits
Differential Revision: https://reviews.llvm.org/D42337
llvm-svn: 323298
|
| |
|
|
|
|
|
|
|
|
|
| |
This is already a simplification, and should help with avoiding a plt
reference when calling an intrinsic with -fno-plt.
With this change we return false for null GVs, so the caller only
needs to check the new metadata to decide if it should use foo@plt or
*foo@got.
llvm-svn: 323297
|
| |
|
|
| |
llvm-svn: 323295
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
AVX512BW adds support for variable shift amount for 16-bit element
vectors.
Reviewers: craig.topper, RKSimon, spatel
Reviewed By: RKSimon
Subscribers: rengolin, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D42437
llvm-svn: 323292
|
| |
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D42402
A lot of these copies are useless (copies b/w VRegs having the same
regclass) and should be cleaned up.
llvm-svn: 323291
|
| |
|
|
|
|
|
|
| |
Also, fix crash when exporting an imported function.
Differential Revision: https://reviews.llvm.org/D42454
llvm-svn: 323290
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove FeatureSlowMisaligned128Store from cyclone flags.
This flag causes splitting of 16 byte wide stores into 2 stored of 8
bytes. This was useful on older apple CPUs which were slow for 16byte
stores that were not aligned on 16byte. As the compiler often cannot
predict the actual alignment, the splitting was choosen.
This has been a topic for a lot of debate as the splitting also
decreases performance for some benchmarks. Measuring the effects on
newer apple chips (rdar://35525421) shows that it harms more cases than
it helps. So it is time to retire this workaround.
llvm-svn: 323289
|
| |
|
|
| |
llvm-svn: 323276
|
| |
|
|
| |
llvm-svn: 323271
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix an issue that's similar to what D41411 fixed:
float(__int128(float_var)) shouldn't be optimized to xscvdpsxds +
xscvsxdsp, as they mean (float)(int64_t)float_var.
Reviewers: jtony, hfinkel, echristo
Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton
Differential Revision: https://reviews.llvm.org/D42400
llvm-svn: 323270
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, there is no way to extract a basic block from a function easily. This patch
extends llvm-extract to extract the specified basic block(s).
Reviewers: loladiro, rafael, bogner
Reviewed By: bogner
Subscribers: hintonda, mgorny, qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D41638
llvm-svn: 323266
|
| |
|
|
|
|
|
|
| |
unused classes.
I don't know if the unused classes were intended to be used and that the VEX version is really different than the legacy SSE version. Agner's tables don't show any differences. I'm just cleaning up assuming the current behavior is correct.
llvm-svn: 323263
|
| |
|
|
|
|
| |
No instructions have Int_ at the beginning. It's always at the end now. So it should be picked up as a prefix match
llvm-svn: 323262
|
| |
|
|
|
|
|
|
| |
instructions to get them picked up by the scheduler model regexs.
All other intrinsic instructions put the _Int on the end. This make these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up.
llvm-svn: 323261
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2i64/v2f64
Minor refactor to make it possible for LowerBUILD_VECTORAsVariablePermute to be used with a wider variety of shuffles op and types.
I'd have liked to add v4i32/v4f32 support as well but we don't see v4i32 index extractions at the moment (which is why I created D42308)
After this I intend to begin adding scaling support for PSHUFB (v8i16, v4i32, v2i64)) and VPERMPS (v4f64, v4i64).
Differential Revision: https://reviews.llvm.org/D42431
llvm-svn: 323260
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds an -mllvm flag that forces the use of a runtime function call to
get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android.
The call may be inlined.
Reviewers: pcc
Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D37405
llvm-svn: 323259
|
| |
|
|
| |
llvm-svn: 323258
|
| |
|
|
|
|
|
|
| |
as shuffle."
This reverts commit r323246 because of the broken buildbots.
llvm-svn: 323252
|
| |
|
|
| |
llvm-svn: 323250
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry and cost of
this gather is counter as cost of InsertElementInstrs for each gathered
value. But we can consider these elements as ShuffleInstr with
SK_PermuteSingle shuffle kind.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38697
llvm-svn: 323246
|
| |
|
|
|
|
|
|
| |
If the instruction is a bundle, check the instructions inside of it.
Patch by Suyog Sarda.
llvm-svn: 323240
|
| |
|
|
|
|
|
|
|
| |
errorToBool() converts an Error to a bool and puts the Error in a checked
state. No behavior change.
https://reviews.llvm.org/D42422
llvm-svn: 323238
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
LLD is unaffected, no changes needed there. LLD continues to
write out a name section, using the symbol names.
Fixes: https://github.com/WebAssembly/tool-conventions/issues/37
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42425
llvm-svn: 323234
|
| |
|
|
| |
llvm-svn: 323233
|
| |
|
|
|
|
|
|
|
|
|
| |
In addition to that, make sure that there are no boolean vector types that
are associated with multiple register classes. Specifically, remove v32i1
and v64i1 from integer register classes. These types will correspond to
results of vector comparisons, and as such should belong to the vector
predicate class. Having them in scalar registers as well makes legalization
ambiguous.
llvm-svn: 323229
|
| |
|
|
|
|
| |
oversized index vectors
llvm-svn: 323223
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The grow_memory and current_memory instructions are expected to be
officially renamed to mem.grow and mem.size. Introduce new intrinsics
with the new names. These new names aren't yet official, so for now,
use them at your own risk.
Also, take this opportunity to add arguments for the currently unused
immediate field in those instructions.
llvm-svn: 323222
|
| |
|
|
|
|
|
|
| |
This makes wasm32-unknown-unknown-wasm the default, which supports
the .o file writer and the new linking ABI. To enable s2wasm-compatible
output, use the wasm32-unknown-unknown-elf triple.
llvm-svn: 323220
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally when llvm-as sees only debug info errors in LLVM assembly, it simply
drops the debug info and outputs a valid LLVM bitcode and returns 0.
There is a bug in LLVM verifier which incorrectly treats a debug info error
as non-debug info error, which causes llvm-as returns 1 even though llvm-as
can drop the invalid debug info and outputs a valid LLVM bitcode.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D42391
llvm-svn: 323216
|
| |
|
|
|
|
|
|
|
|
| |
Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value.
This issues causes amdgcn target to assert when compiling some user codes with -g.
Differential Revision: https://reviews.llvm.org/D42394
llvm-svn: 323214
|
| |
|
|
|
|
|
|
|
|
| |
inserting into a vXi1 vector.
The existing code was already doing something very similar to subvector insertion so this allows us to remove the nearly duplicate code.
This patch is a little larger than it should be due to differences between the DQI handling between the two today.
llvm-svn: 323212
|
| |
|
|
|
|
|
|
| |
vector is not larger than the destination
We might be able to support this in the future with VPERMV3, OR(PSHUFB, PSHUFB) etc.
llvm-svn: 323210
|
| |
|
|
| |
llvm-svn: 323207
|
| |
|
|
|
|
| |
has the correct number of elements
llvm-svn: 323206
|
| |
|
|
|
|
|
| |
Some nodes produce multiple values so when obtaining the type of an ISD::OR we
need to make sure we ask for the correct one. Hopefully that's all of them.
llvm-svn: 323205
|
| |
|
|
|
|
|
| |
Some nodes produce multiple values so when obtaining the type of an ISD::OR we
need to make sure we ask for the correct one.
llvm-svn: 323202
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
default of promoting to v32i8.
Summary:
For the most part its better to keep v32i1 as a mask type of a narrower width than trying to promote it to a ymm register.
I had to add some overrides to the methods that get the types for the calling convention so that we still use v32i8 for argument/return purposes.
There are still some regressions in here. I definitely saw some around shuffles. I think we probably should move vXi1 shuffle from lowering to a DAG combine where I think the extend and truncate we have to emit would be better combined.
I think we also need a DAG combine to remove trunc from (extract_vector_elt (trunc))
Overall this removes something like 13000 CHECK lines from lit tests.
Reviewers: zvi, RKSimon, delena, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42031
llvm-svn: 323201
|
| |
|
|
|
|
|
|
| |
I'm not sure there's any way to generate these folding cases especially the movzx ones since even the register form is never emitted by codegen.
I'm just adding them to remove the difference with the autogenerated version of the folding table.
llvm-svn: 323200
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If in complex addressing mode the difference is in GV then
base reg should not be installed because we plan to use
base reg as a merge point of different GVs.
This is a fix for PR35980.
Reviewers: reames, john.brawn, santosh
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42230
llvm-svn: 323192
|
| |
|
|
|
|
|
|
|
|
| |
operand ordering
As detailed in rL317463, PSHUFB (like most variable shuffle instructions) uses Op[0] for the source vector and Op[1] for the shuffle index vector, VPERMV works in reverse which is probably where the confusion comes from.
Differential Revision: https://reviews.llvm.org/D42380
llvm-svn: 323190
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Since r322087, glibc's finite lib calls are generated when possible.
However, glibc is not supported on Android. Therefore this change
enables llvm to finely distinguish between linux and Android for
unsupported library calls. The change also include some regression
tests.
Reviewers: srhines, pirama
Reviewed By: srhines
Subscribers: kongyi, chh, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D42288
llvm-svn: 323187
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- Alter abs for micromips to have both AFGR64 and FGR64
variants, same as sqrt
- Remove sqrt and abs from MicroMips32r6InstrInfo.td,
use micromips FGR64 variants
- Restrict non-micromips abs/sqrt with NotInMicroMips
predicate
Differential revision: https://reviews.llvm.org/D41439
llvm-svn: 323184
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is adding remark messages to the LoopVersioning LICM pass,
which will be useful for optimization remark emitter (ORE) infrastructure.
Patch by: Deepak Porwal
Reviewers: anemet, ashutosh.nema, eastig
Subscribers: eastig, vivekvpandya, fhahn, llvm-commits
llvm-svn: 323183
|
| |
|
|
| |
llvm-svn: 323182
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
movzx
Summary:
If we can match as a zero extend there's no need to flip the order to get an encoding benefit. As movzx is 3 bytes with independent source/dest registers. The shortest 'and' we could make is also 3 bytes unless we get lucky in the register allocator and its on AL/AX/EAX which have a 2 byte encoding.
This patch was more impressive before r322957 went in. It removed some of the same Ands that got deleted by that patch.
Reviewers: spatel, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42313
llvm-svn: 323175
|
| |
|
|
|
|
| |
Some of the NOREX instructions are used in 32-bit mode making this printing confusing. It also doesn't provide a lot of value since you can see the h-register being used by the instruction.
llvm-svn: 323174
|
| |
|
|
|
|
| |
Add missing patterns for inserting v1i1 into a zero vector. Use insert_subvector to zero upper bits before inserting an element into a vXi1 vector. Replace kshift based isel pattern with insert_subvector based pattern now that code that caused the pattern has been fixed to emit insert_subvector.
llvm-svn: 323173
|
| |
|
|
|
|
|
|
|
|
| |
This applies to most pipelines except the LTO and ThinLTO backend
actions - it is for use at the beginning of the overall pipeline.
This extension point will be used to add the GCOV pass when enabled in
Clang.
llvm-svn: 323166
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
relocations
Relocations of type R_WEBASSEMBLY_TABLE_INDEX represent places
where the table index for a given function is needed. While the
value stored in this location is a table index, the index in
the relocation entry itself is a function index (the index of
the function which is to be called indirectly).
This is how is was spec'd originally but the LLVM implementation
didn't do this. This makes things a little simpler in the linker
since the table in the input file can essentially be ignored that
the output table can be created purely based on these relocations.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42080
llvm-svn: 323165
|