summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Use ISD::SIGN_EXTEND instead of X86ISD::VSEXT for mask to xmm/ymm/zmm ↵Craig Topper2018-01-242-3/+21
| | | | | | | | | | | | | | conversion There are a couple tricky things with this patch. I had to add an override of isVectorLoadExtDesirable to stop DAG combine from combining sign_extend with loads after legalization since we legalize sextload using a load+sign_extend. Overriding this hook actually prevents a lot sextloads from being created in the first place. I also had to add isel patterns because DAG combine blindly combines sign_extend+truncate to a smaller sign_extend which defeats what legalization was trying to do. Differential Revision: https://reviews.llvm.org/D42407 llvm-svn: 323301
* [Dominators] Introduce DomTree verification levelsJakub Kuderski2018-01-241-25/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, there are 2 ways to verify a DomTree: * `DT.verify()` -- runs full tree verification and checks all the properties and gives a reason why the tree is incorrect. This is run by when EXPENSIVE_CHECKS are enabled or when `-verify-dom-info` flag is set. * `DT.verifyDominatorTree()` -- constructs a fresh tree and compares it against the old one. This does not check any other tree properties (DFS number, levels), nor ensures that the construction algorithm is correct. Used by some passes inside assertions. This patch introduces DomTree verification levels, that try to close the gape between the two ways of checking trees by introducing 3 verification levels: - Full -- checks all properties, but can be slow (O(N^3)). Used when manually requested (e.g. `assert(DT.verify())`) or when `-verify-dom-info` is set. - Basic -- checks all properties except the sibling property, and compares the current tree with a freshly constructed one instead. This should catch almost all errors, but does not guarantee that the construction algorithm is correct. Used when EXPENSIVE checks are enabled. - Fast -- checks only basic properties (reachablility, dfs numbers, levels, roots), and compares with a fresh tree. This is meant to replace the legacy `DT.verifyDominatorTree()` and in my tests doesn't cause any noticeable performance impact even in the most pessimistic examples. When used to verify dom tree wrapper pass analysis on sqlite3, the 3 new levels make `opt -O3` take the following amount of time on my machine: - no verification: 8.3s - `DT.verify(VerificationLevel::Fast)`: 10.1s - `DT.verify(VerificationLevel::Basic)`: 44.8s - `DT.verify(VerificationLevel::Full)`: 1m 46.2s (and the previous `DT.verifyDominatorTree()` is within the noise of the Fast level) This patch makes `DT.verifyDominatorTree()` pick between the 3 verification levels depending on EXPENSIVE_CHECKS and `-verify-dom-info`. Reviewers: dberlin, brzycki, davide, grosser, dmgreen Reviewed By: dberlin, brzycki Subscribers: MatzeB, llvm-commits Differential Revision: https://reviews.llvm.org/D42337 llvm-svn: 323298
* Don't assume a null GV is local for ELF and MachO.Rafael Espindola2018-01-241-7/+14
| | | | | | | | | | | This is already a simplification, and should help with avoiding a plt reference when calling an intrinsic with -fno-plt. With this change we return false for null GVs, so the caller only needs to check the new metadata to decide if it should use foo@plt or *foo@got. llvm-svn: 323297
* Remove set but unused variable IsUndef.Eric Christopher2018-01-241-2/+1
| | | | llvm-svn: 323295
* X86: Update isVectorShiftByScalarCheap with cases covered by AVX512BWZvi Rackover2018-01-241-0/+4
| | | | | | | | | | | | | | | | Summary: AVX512BW adds support for variable shift amount for 16-bit element vectors. Reviewers: craig.topper, RKSimon, spatel Reviewed By: RKSimon Subscribers: rengolin, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D42437 llvm-svn: 323292
* [GISel]: Remove redundant copies at the end of ISelAditya Nandakumar2018-01-241-0/+32
| | | | | | | | | https://reviews.llvm.org/D42402 A lot of these copies are useless (copies b/w VRegs having the same regclass) and should be cleaned up. llvm-svn: 323291
* [WebAssembly] Add minor helper functions to WasmObjectFileSam Clegg2018-01-241-9/+18
| | | | | | | | Also, fix crash when exporting an imported function. Differential Revision: https://reviews.llvm.org/D42454 llvm-svn: 323290
* AArch64: Cyclone: Remove SlowMisaligned128Store tuning flagMatthias Braun2018-01-241-1/+0
| | | | | | | | | | | | | | | Remove FeatureSlowMisaligned128Store from cyclone flags. This flag causes splitting of 16 byte wide stores into 2 stored of 8 bytes. This was useful on older apple CPUs which were slow for 16byte stores that were not aligned on 16byte. As the compiler often cannot predict the actual alignment, the splitting was choosen. This has been a topic for a lot of debate as the splitting also decreases performance for some benchmarks. Measuring the effects on newer apple chips (rdar://35525421) shows that it harms more cases than it helps. So it is time to retire this workaround. llvm-svn: 323289
* [TblGen] Inline an (almost) trivial accessor. No functionality change.Benjamin Kramer2018-01-231-4/+0
| | | | llvm-svn: 323276
* BlockExtractor: Remove unused variable. NFC.Volkan Keles2018-01-231-1/+1
| | | | llvm-svn: 323271
* [PPC] Avoid incorrect fp-i128-fp lowering.Tim Shen2018-01-231-2/+4
| | | | | | | | | | | | | | | Summary: Fix an issue that's similar to what D41411 fixed: float(__int128(float_var)) shouldn't be optimized to xscvdpsxds + xscvsxdsp, as they mean (float)(int64_t)float_var. Reviewers: jtony, hfinkel, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D42400 llvm-svn: 323270
* [llvm-extract] Support extracting basic blocksVolkan Keles2018-01-234-153/+176
| | | | | | | | | | | | | | | | Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266
* [X86] Merge some regular expressions in Zen scheduler model and remove 2 ↵Craig Topper2018-01-231-14/+2
| | | | | | | | unused classes. I don't know if the unused classes were intended to be used and that the VEX version is really different than the legacy SSE version. Agner's tables don't show any differences. I'm just cleaning up assuming the current behavior is correct. llvm-svn: 323263
* [X86] Remove 'Int_' from instregexs in Zen scheduler model.Craig Topper2018-01-231-12/+12
| | | | | | No instructions have Int_ at the beginning. It's always at the end now. So it should be picked up as a prefix match llvm-svn: 323262
* [X86] Move 'Int_' to the end of the name of the VCOMISS/VUCOMISS and ↵Craig Topper2018-01-233-40/+40
| | | | | | | | instructions to get them picked up by the scheduler model regexs. All other intrinsic instructions put the _Int on the end. This make these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up. llvm-svn: 323261
* [X86][AVX] LowerBUILD_VECTORAsVariablePermute - add support for VPERMILPV to ↵Simon Pilgrim2018-01-231-49/+72
| | | | | | | | | | | | | | v2i64/v2f64 Minor refactor to make it possible for LowerBUILD_VECTORAsVariablePermute to be used with a wider variety of shuffles op and types. I'd have liked to add v4i32/v4f32 support as well but we don't see v4i32 index extractions at the moment (which is why I created D42308) After this I intend to begin adding scaling support for PSHUFB (v8i16, v4i32, v2i64)) and VPERMPS (v4f64, v4i64). Differential Revision: https://reviews.llvm.org/D42431 llvm-svn: 323260
* [safestack] Inline safestack pointer access when possible.Evgeniy Stepanov2018-01-231-1/+50
| | | | | | | | | | | | | | | Summary: This adds an -mllvm flag that forces the use of a runtime function call to get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android. The call may be inlined. Reviewers: pcc Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37405 llvm-svn: 323259
* Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2018-01-231-1/+1
| | | | llvm-svn: 323258
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵Alexey Bataev2018-01-231-343/+121
| | | | | | | | as shuffle." This reverts commit r323246 because of the broken buildbots. llvm-svn: 323252
* [Hexagon] Add patterns for sext_inreg of HVX vector typesKrzysztof Parzyszek2018-01-231-0/+19
| | | | llvm-svn: 323250
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.Alexey Bataev2018-01-231-121/+343
| | | | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323246
* [Hexagon] Implement hasLoadFromStackSlot and hasStoreToStackSlotKrzysztof Parzyszek2018-01-232-0/+50
| | | | | | | | If the instruction is a bundle, check the instructions inside of it. Patch by Suyog Sarda. llvm-svn: 323240
* Introduce errorToBool() helper and use it.Nico Weber2018-01-231-16/+4
| | | | | | | | | errorToBool() converts an Error to a bool and puts the Error in a checked state. No behavior change. https://reviews.llvm.org/D42422 llvm-svn: 323238
* [WebAssembly] Remove "name" section of object wasm object filesSam Clegg2018-01-231-33/+0
| | | | | | | | | | | | | LLD is unaffected, no changes needed there. LLD continues to write out a name section, using the symbol names. Fixes: https://github.com/WebAssembly/tool-conventions/issues/37 Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42425 llvm-svn: 323234
* [Hexagon] Fix unused variable warning in release buildKrzysztof Parzyszek2018-01-231-0/+1
| | | | llvm-svn: 323233
* [Hexagon] Implement basic vector operations on vectors vNi1Krzysztof Parzyszek2018-01-238-260/+970
| | | | | | | | | | | In addition to that, make sure that there are no boolean vector types that are associated with multiple register classes. Specifically, remove v32i1 and v64i1 from integer register classes. These types will correspond to results of vector comparisons, and as such should belong to the vector predicate class. Having them in scalar registers as well makes legalization ambiguous. llvm-svn: 323229
* [X86][SSE] LowerBUILD_VECTORAsVariablePermute - extract subvector from ↵Simon Pilgrim2018-01-231-6/+10
| | | | | | oversized index vectors llvm-svn: 323223
* [WebAssembly] Add mem.* intrinsics.Dan Gohman2018-01-231-0/+9
| | | | | | | | | | | | The grow_memory and current_memory instructions are expected to be officially renamed to mem.grow and mem.size. Introduce new intrinsics with the new names. These new names aren't yet official, so for now, use them at your own risk. Also, take this opportunity to add arguments for the currently unused immediate field in those instructions. llvm-svn: 323222
* [WebAssembly] Switch to *-wasm as the default target triple.Dan Gohman2018-01-231-2/+4
| | | | | | | | This makes wasm32-unknown-unknown-wasm the default, which supports the .o file writer and the new linking ABI. To enable s2wasm-compatible output, use the wasm32-unknown-unknown-elf triple. llvm-svn: 323220
* Verifier: fix bug treating debug info issue as non-debug info issueYaxun Liu2018-01-231-2/+2
| | | | | | | | | | | | | | | Normally when llvm-as sees only debug info errors in LLVM assembly, it simply drops the debug info and outputs a valid LLVM bitcode and returns 0. There is a bug in LLVM verifier which incorrectly treats a debug info error as non-debug info error, which causes llvm-as returns 1 even though llvm-as can drop the invalid debug info and outputs a valid LLVM bitcode. This patch fixes that. Differential Revision: https://reviews.llvm.org/D42391 llvm-svn: 323216
* CodeGen: Fix assertion in ScheduleDAGMILive::scheduleMI due to llvm.dbg.valueYaxun Liu2018-01-231-0/+1
| | | | | | | | | | Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value. This issues causes amdgcn target to assert when compiling some user codes with -g. Differential Revision: https://reviews.llvm.org/D42394 llvm-svn: 323214
* [X86] Rewrite vXi1 element insertion by using a vXi1 scalar_to_vector and ↵Craig Topper2018-01-231-67/+5
| | | | | | | | | | inserting into a vXi1 vector. The existing code was already doing something very similar to subvector insertion so this allows us to remove the nearly duplicate code. This patch is a little larger than it should be due to differences between the DQI handling between the two today. llvm-svn: 323212
* [X86][SSE] LowerBUILD_VECTORAsVariablePermute - ensure that the source ↵Simon Pilgrim2018-01-231-1/+3
| | | | | | | | vector is not larger than the destination We might be able to support this in the future with VPERMV3, OR(PSHUFB, PSHUFB) etc. llvm-svn: 323210
* Use EVT::changeVectorElementTypeToInteger() to convert index type to integerSimon Pilgrim2018-01-231-4/+4
| | | | llvm-svn: 323207
* [X86][SSE] LowerBUILD_VECTORAsVariablePermute - ensure that the index vector ↵Simon Pilgrim2018-01-231-0/+4
| | | | | | has the correct number of elements llvm-svn: 323206
* AArch64: get type from correct result when forming BFXTim Northover2018-01-231-1/+1
| | | | | | | Some nodes produce multiple values so when obtaining the type of an ISD::OR we need to make sure we ask for the correct one. Hopefully that's all of them. llvm-svn: 323205
* AArch64: get type from correct result when forming BFI/BFMTim Northover2018-01-231-1/+1
| | | | | | | Some nodes produce multiple values so when obtaining the type of an ISD::OR we need to make sure we ask for the correct one. llvm-svn: 323202
* [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the ↵Craig Topper2018-01-232-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | default of promoting to v32i8. Summary: For the most part its better to keep v32i1 as a mask type of a narrower width than trying to promote it to a ymm register. I had to add some overrides to the methods that get the types for the calling convention so that we still use v32i8 for argument/return purposes. There are still some regressions in here. I definitely saw some around shuffles. I think we probably should move vXi1 shuffle from lowering to a DAG combine where I think the extend and truncate we have to emit would be better combined. I think we also need a DAG combine to remove trunc from (extract_vector_elt (trunc)) Overall this removes something like 13000 CHECK lines from lit tests. Reviewers: zvi, RKSimon, delena, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42031 llvm-svn: 323201
* [X86] Add missing MOVSX/MOVZX instructions to load folding tables.Craig Topper2018-01-231-0/+3
| | | | | | | | I'm not sure there's any way to generate these folding cases especially the movzx ones since even the register form is never emitted by codegen. I'm just adding them to remove the difference with the autogenerated version of the folding table. llvm-svn: 323200
* [CGP] Fix the GV handling in complex addressing modeSerguei Katkov2018-01-231-15/+21
| | | | | | | | | | | | | | | If in complex addressing mode the difference is in GV then base reg should not be installed because we plan to use base reg as a merge point of different GVs. This is a fix for PR35980. Reviewers: reames, john.brawn, santosh Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42230 llvm-svn: 323192
* [X86][SSE] LowerBUILD_VECTORAsVariablePermute - fix PSHUFB source/index ↵Simon Pilgrim2018-01-231-2/+3
| | | | | | | | | | operand ordering As detailed in rL317463, PSHUFB (like most variable shuffle instructions) uses Op[0] for the source vector and Op[1] for the shuffle index vector, VPERMV works in reverse which is probably where the confusion comes from. Differential Revision: https://reviews.llvm.org/D42380 llvm-svn: 323190
* [Analysis] Disable exp/exp2/pow finite lib calls on Android with -ffast-math.MinSeong Kim2018-01-231-0/+9
| | | | | | | | | | | | | | | | | | | Summary: Since r322087, glibc's finite lib calls are generated when possible. However, glibc is not supported on Android. Therefore this change enables llvm to finely distinguish between linux and Android for unsupported library calls. The change also include some regression tests. Reviewers: srhines, pirama Reviewed By: srhines Subscribers: kongyi, chh, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42288 llvm-svn: 323187
* [mips] Properly select abs and sqrt instructionsStefan Maksimovic2018-01-233-23/+24
| | | | | | | | | | | | | - Alter abs for micromips to have both AFGR64 and FGR64 variants, same as sqrt - Remove sqrt and abs from MicroMips32r6InstrInfo.td, use micromips FGR64 variants - Restrict non-micromips abs/sqrt with NotInMicroMips predicate Differential revision: https://reviews.llvm.org/D41439 llvm-svn: 323184
* This change add's optimization remark in LoopVersioning LICM pass.Ashutosh Nema2018-01-232-2/+59
| | | | | | | | | | | | | Summary: This patch is adding remark messages to the LoopVersioning LICM pass, which will be useful for optimization remark emitter (ORE) infrastructure. Patch by: Deepak Porwal Reviewers: anemet, ashutosh.nema, eastig Subscribers: eastig, vivekvpandya, fhahn, llvm-commits llvm-svn: 323183
* [InstSimplify] (X << Y) % X -> 0Anton Bikineev2018-01-231-0/+7
| | | | llvm-svn: 323182
* [X86] Don't reorder (srl (and X, C1), C2) if (and X, C1) can be matched as a ↵Craig Topper2018-01-231-0/+8
| | | | | | | | | | | | | | | | | | | movzx Summary: If we can match as a zero extend there's no need to flip the order to get an encoding benefit. As movzx is 3 bytes with independent source/dest registers. The shortest 'and' we could make is also 3 bytes unless we get lucky in the register allocator and its on AL/AX/EAX which have a 2 byte encoding. This patch was more impressive before r322957 went in. It removed some of the same Ands that got deleted by that patch. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42313 llvm-svn: 323175
* [X86] Remove 'NOREX' comment from the printing of _NOREX instructions.Craig Topper2018-01-232-7/+7
| | | | | | Some of the NOREX instructions are used in 32-bit mode making this printing confusing. It also doesn't provide a lot of value since you can see the h-register being used by the instruction. llvm-svn: 323174
* [X86] Various vXi1 insertion improvements.Craig Topper2018-01-233-17/+27
| | | | | | Add missing patterns for inserting v1i1 into a zero vector. Use insert_subvector to zero upper bits before inserting an element into a vXi1 vector. Replace kshift based isel pattern with insert_subvector based pattern now that code that caused the pattern has been fixed to emit insert_subvector. llvm-svn: 323173
* NewPM: Add an extension point for the start of the pipeline.David Blaikie2018-01-231-1/+8
| | | | | | | | | | This applies to most pipelines except the LTO and ThinLTO backend actions - it is for use at the beginning of the overall pipeline. This extension point will be used to add the GCOV pass when enabled in Clang. llvm-svn: 323166
* [WebAssembly] Store function index rather than table index in TABLE_INDEX ↵Sam Clegg2018-01-231-59/+53
| | | | | | | | | | | | | | | | | | | | | relocations Relocations of type R_WEBASSEMBLY_TABLE_INDEX represent places where the table index for a given function is needed. While the value stored in this location is a table index, the index in the relocation entry itself is a function index (the index of the function which is to be called indirectly). This is how is was spec'd originally but the LLVM implementation didn't do this. This makes things a little simpler in the linker since the table in the input file can essentially be ignored that the output table can be created purely based on these relocations. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42080 llvm-svn: 323165
OpenPOWER on IntegriCloud