summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Prevent folding loads with 64-bit ANDs with immediates that fit in ↵Craig Topper2018-04-101-1/+12
| | | | | | | | | | | | 32-bits. Prefer to use the 32-bit AND with immediate instead. Primarily I'm doing this to ensure that immediates created by shrinkAndImmediate will always get absorbed into the AND. But I do believe this would be a reduction in the number of uops that need to execute. Ideally we should shrink the 'and' and the 'load' during DAG combine to re-enable the fold. Fixes PR37063. llvm-svn: 329667
* Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time.Michael Zolotukhin2018-04-101-18/+13
| | | | | | This reverts r329661. Bots are still unhappy. llvm-svn: 329666
* Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading.""Michael Zolotukhin2018-04-101-13/+18
| | | | | | This reapplies commit r329644. llvm-svn: 329661
* [SSAUpdaterBulk] Handle CFG with unreachable from entry blocks.Michael Zolotukhin2018-04-101-1/+1
| | | | llvm-svn: 329660
* [DebugInfo][COFF] Fix reading variable-length encoded recordsAlexandre Ganea2018-04-101-2/+2
| | | | | | | | | | | | While reading Codeview records which contain variable-length encoded integers, such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS, the record's size would be improperly calculated in cases where the value was indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on the next record, which would/might crash later on. Differential Revision: https://reviews.llvm.org/D45104 llvm-svn: 329659
* [x86] Introduce a pass to begin more systematically fixing PR36028 and ↵Chandler Carruth2018-04-108-117/+755
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 llvm-svn: 329657
* ShadowCallStack/x86_64: Ignore pseudo-machine instructionsVlad Tsyrklevich2018-04-101-1/+2
| | | | llvm-svn: 329656
* Object: Don't mark alias unconditionally definedVitaly Buka2018-04-101-1/+4
| | | | | | | | | | | | | | Summary: Can't remove EmitAssignment override as llvm/test/Object/X86/nm-bitcodeweak.test expects this behavior. Reviewers: pcc, espindola Subscribers: mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44596 llvm-svn: 329651
* Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."Michael Zolotukhin2018-04-101-18/+13
| | | | | | This reverts commit r329644. llvm-svn: 329650
* Fix for the buildbot failure. Now-unused private field TTI deleted.Hideki Saito2018-04-101-6/+2
| | | | llvm-svn: 329649
* Fix line endings (CR/LF -> LF) introduced by rL329613Alexandre Ganea2018-04-104-2218/+2218
| | | | | reviewer: zturner llvm-svn: 329646
* [NFC][LV] Move InterleaveInfo from Legal to CostModelHideki Saito2018-04-091-57/+56
| | | | | | | | | | | | | | | | | | | | | | Summary: Another clean up, following D43208. Interleaved memory access analysis/optimization has nothing to do with vectorization legality. It doesn't really belong there. On the other hand, cost model certainly has to know about it. In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change just does that, by moving the interleaved access analysis/decision out of Legal, and run it just before CostModel object is created. After this, I can move LoopVectorizationLegality and Hints/Requirements classes into it's own header file, making it shareable within Transform tree. I have the patch already but I don't want to mix with this change. Eventual goal is to move to Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis. Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson Reviewed By: rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45072 llvm-svn: 329645
* [PR16756] Use SSAUpdaterBulk in JumpThreading.Michael Zolotukhin2018-04-091-13/+18
| | | | | | | | | | | | | | | | | Summary: SSAUpdater is a bottleneck in JumpThreading, and this patch improves the situation by using SSAUpdaterBulk instead. Compile time impact: no noticable changes on CTMark, a big improvement on the test from PR16756. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329644
* [PR16756] Add SSAUpdaterBulk.Michael Zolotukhin2018-04-092-0/+174
| | | | | | | | | | | | | | | | Summary: SSAUpdater is a bottleneck in a number of passes, and one of the reasons is that it performs a lot of unnecessary computations (DT/IDF) over and over again. This patch adds a new SSAUpdaterBulk that uses existing DT and avoids recomputing IDF when possible. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329643
* [MemorySSA] remove cruft; NFC.George Burgess IV2018-04-091-23/+1
| | | | | | | | | | | | The caching walker used to hold its own caches, which made its `reset()` function meaningful. Since caching has been moved out of it, there's no reason to continue to have these cache-related methods. Similarly, the EXPENSIVE_CHECKS block that's getting removed used to rerun the query with caching disabled. Since that's how we always do queries now, it's redundant. llvm-svn: 329638
* [MemorySSA] Remove redundant assert; NFCGeorge Burgess IV2018-04-091-3/+0
| | | | | | | The `if (!Def && !Use) return nullptr;` right above this assert sort of defeats the purpose. llvm-svn: 329632
* [globalisel][legalizerinfo] Add support for the Lower action in ↵Daniel Sanders2018-04-093-10/+14
| | | | | | | | | | | | | | | getActionDefinitionsBuilder() and use it in AArch64. Lower is slightly odd. It often doesn't change the type but the lowerings do use the new type to decide what code to create. Treat it like a mutation but provide convenience functions that re-use the existing type. Re-uses the existing tests: test/CodeGen/AArch64/GlobalISel/legalize-rem.mir test/CodeGen/AArch64/GlobalISel//legalize-mul.mir test/CodeGen/AArch64/GlobalISel//legalize-cmpxchg-with-success.mir llvm-svn: 329623
* Fix printing of stack id in MachineFrameInfoMatt Arsenault2018-04-091-1/+1
| | | | | | | uint8_t is printed as a char, so it needs to be casted to do the right thing. llvm-svn: 329622
* [MemorySSAUpdater] Mark Phi users of a node being moved as non-optimizeZhaoshi Zheng2018-04-091-0/+17
| | | | | | | | | | Fix PR36484, as suggested: <quote> during moves, mark the direct users of the erased things that were phis as "not to be optimized" <quote> llvm-svn: 329621
* AMDGPU: Remove max_scratch_backing_memory_byte_size from kernel headerKonstantin Zhuravlyov2018-04-094-10/+10
| | | | | | | | | | | 1. Remove max_scratch_backing_memory_byte_size from kernel header 2. Make it a reserved field 3. Ignore it while parsing assembly for backwards compatibility 4. Bump up minor version of kernel header Differential Revision: https://reviews.llvm.org/D45452 llvm-svn: 329620
* [X86] Don't use Lower512IntUnary to split bitcasts with v32i16/v64i8 types ↵Craig Topper2018-04-091-7/+22
| | | | | | | | | | | | | | on targets without AVX512BW. LowerIntUnary as its name says has an assert for integer types. But for the bitcast case one side might be an FP type. Rather than making sure the function really works for fp types and renaming it. Just do really basic splitting directly. The LowerIntUnary has the advantage that it can peek through BUILD_VECTOR because every other call is during Lowering. But these calls are during legalization and will be followed by a DAG combine round. Revert some change to LowerVectorIntUnary that were originally made just to make these two calls work even in pure integer cases. This was found purely by compiling the avx512f-builtins.c test from clang so I've copied over the offending function from that. llvm-svn: 329616
* [Debuginfo][COFF] Minimal serialization support for precompiled types recordsAlexandre Ganea2018-04-097-2168/+2226
| | | | | | | | | | | | | This change adds support for the LF_PRECOMP and LF_ENDPRECOMP records required to read/write Microsoft precompiled types .objs. See https://en.wikipedia.org/wiki/Precompiled_header#Microsoft_Visual_C_and_C++ This also adds handling for the .debug$P section, which is actually a .debug$T section in disguise, found only in precompiled .objs. Differential Revision: https://reviews.llvm.org/D45283 llvm-svn: 329613
* AArch64: Allow offsets to be folded into addresses with ELF.Peter Collingbourne2018-04-092-17/+24
| | | | | | | | | | | | | | | This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. It reduces the size of Chromium for Android's .text section by 46KB, or 56KB with ThinLTO (which exposes more opportunities to use a direct access rather than a GOT access). Because the addend range is limited in COFF and Mach-O, this is enabled for ELF only. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329611
* Revert "AMDGPU: enable 128-bit for local addr space under an option"Alex Shlyapnikov2018-04-095-17/+12
| | | | | | | | | | | | | | This reverts commit r329591. It breaks various bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516 http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374 http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251 ... llvm-svn: 329610
* [WebAssembly] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-091-10/+10
| | | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: sunfish, RKSimon Reviewed By: sunfish Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D44873 llvm-svn: 329607
* Fix type mismatch between MachineMemOperand constructor and accessors. NFCDaniel Sanders2018-04-091-1/+1
| | | | | | | | This allows MachineMemOperand::getSize()'s result to be fed directly into MachineMemOperand::MachineMemOperand() without a narrowing type conversion warning. llvm-svn: 329602
* [demangler] Support for fold expressions.Erik Pilkington2018-04-091-3/+126
| | | | llvm-svn: 329601
* [demangler] Support for <data-member-prefix>.Erik Pilkington2018-04-091-0/+9
| | | | llvm-svn: 329600
* [demangler] Support for partially substituted sizeof....Erik Pilkington2018-04-091-1/+24
| | | | llvm-svn: 329599
* [GISel] Refactor MachineIRBuilder to allow transformations whileAditya Nandakumar2018-04-091-257/+250
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | building. https://reviews.llvm.org/D45067 This change attempts to do two things: 1) It separates out the state that is stored in the MachineIRBuilder(InsertionPt, MF, MRI, InsertFunction etc) into a separate object called MachineIRBuilderState. 2) Add the ability to constant fold operations while building instructions (optionally). MachineIRBuilder is now refactored into a MachineIRBuilderBase which contains lots of non foldable build methods and their implementation. Instructions which can be constant folded/transformed are now in a class called FoldableInstructionBuilder which uses CRTP to use the implementation of the derived class for buildBinaryOps. Additionally buildInstr in the derived class can be used to implement other kinds of transformations. Also because of separation of state, given a MachineIRBuilder in an API, if one wishes to use another MachineIRBuilder, a new one can be constructed from the state locally. For eg, void doFoo(MachineIRBuilder &B) { MyCustomBuilder CustomB(B.getState()); // Use CustomB for building. } reviewed by : aemerson llvm-svn: 329596
* [X86] Revert the SLM part of r328914.Craig Topper2018-04-091-1/+3
| | | | | | While it appears to be correct information based on Intel's optimization manual and Agner's data, it causes perf regressions on a couple of the benchmarks in our internal list. llvm-svn: 329593
* AMDGPU: enable 128-bit for local addr space under an optionMarek Olsak2018-04-095-12/+17
| | | | | | | | | | | | | | | | | Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329591
* AMDGPU: Initialize GlobalISel passesTom Stellard2018-04-091-0/+1
| | | | | | | | | | | | | | Summary: This fixes AMDGPU GlobalISel test failures when enabling the AMDGPU target without any other targets that use GlobalISel. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D45353 llvm-svn: 329588
* Support generic expansion of ordered vector reduction (PR36732)Simon Pilgrim2018-04-092-6/+41
| | | | | | | | | | Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order. This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1] Differential Revision: https://reviews.llvm.org/D45366 llvm-svn: 329585
* [MachineLICM] Re-enable hoisting of constant storesZaara Syeda2018-04-091-2/+9
| | | | | | | | | | | | This patch fixes an issue exposed on the SystemZ build bots when committing https://reviews.llvm.org/rL327856. The hoisting was temporarily disabled with an option. This patch now re-enables hoisting and checks that we only hoist a store instruction when all its operands are either constant caller preserved registers or immediates. Differential Revision: https://reviews.llvm.org/D45286 llvm-svn: 329577
* [CodeGen/AccelTable] Don't emit zero-CU name indexesPavel Labath2018-04-092-0/+6
| | | | | | | | | | | | | | | | | Summary: If an input DICompileUnit is completely empty (e.g., the result of running "clang -g" on an empty file), we don't bother emitting an empty DWARF CU. When we do that, we must make sure we don't also emit a DWARF v5 name index, as DWARF specifies that each index must reference at least one compilation unit. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45435 llvm-svn: 329575
* [MergeICmp] Update debug msg.NFCXin Tong2018-04-091-1/+1
| | | | llvm-svn: 329572
* [X86][MMX] Fix missing itinerary for PALIGNRSimon Pilgrim2018-04-091-4/+4
| | | | llvm-svn: 329568
* [X86][MMX] Fix missing itinerary for MOVQ2DQ instruction formatSimon Pilgrim2018-04-091-1/+1
| | | | llvm-svn: 329567
* [X86][MMX] Fix missing itinerary for CVTPI2PSSimon Pilgrim2018-04-091-4/+4
| | | | llvm-svn: 329565
* [MergeICmp] Split blocks that do other work.Xin Tong2018-04-091-19/+118
| | | | | | | | | | | | | | | Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44443 llvm-svn: 329564
* [AMDGPU][MC][GFX9] Added instructions s_mul_hi_*32, s_lshl*_add_u32Dmitry Preobrazhensky2018-04-091-0/+21
| | | | | | | | | | | See bugs 36841: https://bugs.llvm.org/show_bug.cgi?id=36841 36842: https://bugs.llvm.org/show_bug.cgi?id=36842 Differential Revision: https://reviews.llvm.org/D45251 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329562
* [X86][MMX] Fix flipped reg/mem typo in MMX_MISC_FUNC_ITINSSimon Pilgrim2018-04-091-1/+1
| | | | | | The RR/RM itineraries were the wrong way around llvm-svn: 329561
* [X86][SSE] Fix f32 mul/div itinerary groups typoSimon Pilgrim2018-04-091-4/+4
| | | | | | The RM folded itineraries were incorrectly using the f64 version. llvm-svn: 329556
* [CodeGen/AccelTable]: Don't emit accelerator entries for functions with no namesPavel Labath2018-04-091-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We were emitting accelerator entries for functions with no name, which is contrary to the DWARF v5 spec: "All other (i.e., *not* DW_TAG_namespace) debugging information entries without a DW_AT_name attribute are excluded." Besides that, a name table entry with an empty string as a key is fairly useless. We can sometimes end up with functions which have a DW_AT_linkage_name but no DW_AT_name. One such example is the global-constructor-initialization functions, which C++ compilers synthesize for each compilation unit with global constructors. A very strict reading of the DWARF v5 spec would suggest that we should not even emit the accelerator entry for the linkage name in this case, but I don't think we should go that far. I found this when running the dwarf verifier over llvm codebase compiled with DWARF v5 accelerator tables. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: vleschuk, clayborg, echristo, probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D45367 llvm-svn: 329552
* [DAGCombine] Improve ReduceLoad for SRLSam Parker2018-04-091-4/+34
| | | | | | | | | | | | | Recommitting r329283, third time lucky... If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329551
* [X86] Merge some of the autoupgrade handling for masked intrinsics that just ↵Craig Topper2018-04-091-170/+149
| | | | | | | | need to upgrade to an unmasked version plus a select. NFCI These are were previously grouped in small groups of similarish intrinsics. But all the intrinsics have the same number of arguments and the same order. So we can move them all into a larger group for handling. llvm-svn: 329549
* [IRCE] Relax restriction on collected range checksMax Kazantsev2018-04-091-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In IRCE, we have a very old legacy check that works when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition remained there since the times when IRCE was only able to handle signed latch comparison. As the optimization evolved, it now learned how to intersect signed or unsigned ranges, and this logic has no reliance on the fact that the right border of each range should be positive. The old implementation of this non-negativity check was also naive enough and just looked into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this check did not allow to deal with the most simple case that looks like follows: int size; // not known non-negative int length; //known non-negative; i = 0; if (size != 0) { do { range_check(i < size); range_check(i < length); ++i; } while (i < size) } In this case, even if from some dominating conditions IRCE could parse loop structure, it could only remove the range check against `length` and simply ignored the check against `size`. In this patch we remove this obsolete check. It will allow IRCE to pick comparison against `size` as a potential range check and then let Range Intersection logic decide whether it is OK to eliminate it or not. Differential Revision: https://reviews.llvm.org/D45362 Reviewed By: samparker llvm-svn: 329547
* [NFC] fix trivial typos in comments and error messageHiroshi Inoue2018-04-097-7/+7
| | | | | | "is is" -> "is", "are are" -> "are" llvm-svn: 329546
* Remove MachineLoopInfo dependency from AsmPrinter.Michael Zolotukhin2018-04-091-7/+24
| | | | | | | | | | | | | | | | | | | Summary: Currently MachineLoopInfo is used in only two places: 1) for computing IsBasicBlockInsideInnermostLoop field of MCCodePaddingContext, and it is never used. 2) in emitBasicBlockLoopComments, which is called only if `isVerbose()` is true. Despite that, we currently have a dependency on MachineLoopInfo, which makes pass manager to compute it and MachineDominator Tree. This patch removes the use (1) and makes the use (2) lazy, thus avoiding some redundant recomputations. Reviewers: opaparo, gadi.haber, rafael, craig.topper, zvi Subscribers: rengolin, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44812 llvm-svn: 329542
OpenPOWER on IntegriCloud