path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* AMDGPU/GlobalISel: Cleanup constant legalityMatt Arsenault2018-03-171-8/+5
| | | | llvm-svn: 327774
* AMDGPU/GlobalISel: Basic G_GEP legalityMatt Arsenault2018-03-171-4/+18
| | | | llvm-svn: 327773
* AMDGPU/GlobalISel: Basic legality for load/storeMatt Arsenault2018-03-171-14/+39
| | | | llvm-svn: 327772
* [X86] Added support for nocf_check attribute for indirect Branch TrackingOren Ben Simhon2018-03-1722-54/+167
| | | | | | | | | | | | | | | X86 supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET). IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp. The `nocf_check` attribute has two roles in the context of X86 IBT technology: 1. Appertains to a function - do not add ENDBR instruction at the beginning of the function. 2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction. This patch implements `nocf_check` context for Indirect Branch Tracking. It also auto generates `nocf_check` prefixes before indirect branches to jump tables that are guarded by range checks. Differential Revision: https://reviews.llvm.org/D41879 llvm-svn: 327767
* [SystemZ] computeKnownBitsForTargetNode() / ComputeNumSignBitsForTargetNode()Jonas Paulsson2018-03-172-19/+295
| | | | | | | | | | | Improve/implement these methods to improve DAG combining. This mainly concerns intrinsics. Some constant operands to SystemZISD nodes have been marked Opaque to avoid transforming back and forth between generic and target nodes infinitely. Review: Ulrich Weigand llvm-svn: 327765
* [SelectionDAG] Handle big endian target BITCAST in computeKnownBits()Jonas Paulsson2018-03-171-6/+5
| | | | | | | | | | | | | | | The BITCAST handling in computeKnownBits() previously only worked for little endian. This patch reverses the iteration over elements for a big endian target which allows this to work in this case also. SystemZ test case. Review: Eli Friedman https://reviews.llvm.org/D44249 llvm-svn: 327764
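The element-order reversal described above can be illustrated on plain integer values rather than on known-bits masks. This is only a sketch under stated assumptions: `concatElements` is a hypothetical helper, the elements are fixed at 8 bits, and none of it is taken from the SelectionDAG code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: concatenate 8-bit elements into one wide integer,
// the way a bitcast from a byte vector to a scalar lays them out.
uint64_t concatElements(const std::vector<uint8_t> &Elts, bool IsLittleEndian) {
  assert(Elts.size() <= 8 && "result must fit in 64 bits");
  const std::size_t N = Elts.size();
  uint64_t Result = 0;
  for (std::size_t I = 0; I < N; ++I) {
    // Little endian: element 0 occupies the least significant bits.
    // Big endian: element 0 occupies the most significant bits, so the
    // position used for the shift is reversed.
    const std::size_t Pos = IsLittleEndian ? I : N - 1 - I;
    Result |= static_cast<uint64_t>(Elts[I]) << (8 * Pos);
  }
  return Result;
}
```

The same index reversal would apply when per-element known-zero and known-one masks are merged instead of concrete values.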
* [GlobalsAA] Fix a pretty terrible bug that has been in GlobalsAA forChandler Carruth2018-03-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | a long time. The key thing is that we need to create value handles for every function that we create a `FunctionInfo` object around. Without this, when that function is deleted we can end up creating a new function that collides with its address and look up a stale AA result. With that AA result we can in turn miscompile code in ways that break. This is seriously one of the most absurd miscompiles I've seen. It only reproduced for us recently and only when building a very large server with both ThinLTO and PGO. A *HUGE* shout out to Wei Mi who tracked all of this down and came up with this patch. I'm just landing it because I happened to still be at a computer. He or I can work on crafting a test case to hit this (now that we know what to target) but it'll take a while, and we've been chasing this for a long time and need it fixed Right Now. llvm-svn: 327761
* [MachineOutliner] Make KILLs invisibleJessica Paquette2018-03-162-0/+10
| | | | | | | | | At the point the outliner runs, KILLs don't impact anything, but they're still considered unique instructions. This commit makes them invisible like DebugValues so that they can still be outlined without impacting outlining decisions. llvm-svn: 327760
* Quiet unused variable warnings. NFC.David L Kreitzer2018-03-161-0/+3
| | | | | | Differential revision: https://reviews.llvm.org/D44583 llvm-svn: 327745
* [X86] Pass SelectionDAG into X86ISelAddressMode::dump and on to SDNode::dump.Craig Topper2018-03-161-4/+4
| | | | | | This prevents a crash in SelectionDAGDumper with -debug when trying to print mem operands if one of the registers in the addressing mode comes from a load. llvm-svn: 327744
* [Hexagon] Avoid bank conflicts in post-RA schedulerKrzysztof Parzyszek2018-03-162-4/+14
| | | | | | | | | | Avoid scheduling two loads in such a way that they would end up in the same packet. If there is a load in a packet, try to schedule a non-load next. Patch by Brendon Cahoon. llvm-svn: 327742
* [IR] Avoid the need to prefix MS C++ symbols with '\01'Reid Kleckner2018-03-161-2/+10
| | | | | | | | | | | | | | | | | | | | Now the Windows mangling modes ('w' and 'x') do not do any mangling for symbols starting with '?'. This means that clang can stop adding the hideous '\01' leading escape. This means LLVM debug logs are less likely to contain ASCII escape characters and it will be easier to copy and paste MS symbol names from IR. Finally. For non-Windows platforms, names starting with '?' still get IR mangling, so once clang stops escaping MS C++ names, we will get extra '_' prefixing on MachO. That's fine, since it is currently impossible to construct a triple that uses the MS C++ ABI in clang and emits macho object files. Differential Revision: https://reviews.llvm.org/D7775 llvm-svn: 327734
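A much simplified model of the prefixing rule described above might look like the following. It is a sketch, not the LLVM Mangler: `applyPrefix` is a hypothetical name, the mode letter 'o' is assumed to stand for Mach-O, and calling-convention decorations are ignored entirely.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of the global-prefix decision only.
// Mode letters: 'w' and 'x' are the Windows modes named in the commit
// message; 'o' is assumed here to denote Mach-O.
std::string applyPrefix(const std::string &Name, char Mode) {
  const bool IsWindows = Mode == 'w' || Mode == 'x';
  // In the Windows modes, a leading '?' marks an MS C++ symbol: leave it alone.
  if (IsWindows && !Name.empty() && Name[0] == '?')
    return Name;
  // Modes that prepend an underscore to ordinary names in this model.
  if (Mode == 'x' || Mode == 'o')
    return "_" + Name;
  return Name;
}
```

Under this model a '?'-prefixed name still receives the extra '_' in the Mach-O mode, which is the behavior the message says is acceptable.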
* Revert r327721 "This patch fixes the invalid usage of OptSize in Machine ↵Reid Kleckner2018-03-161-3/+3
| | | | | | | | | Combiner." It causes asserts when compiling Chromium on Win32 with optimizations. We compile many things with -Os. llvm-svn: 327733
* [X86] Merge ADDSUB/SUBADD detection into single methods that can detect ↵Craig Topper2018-03-161-101/+74
| | | | | | | | | | either and indicate what they found. Previously, we called the same functions twice with a bool flag determining whether we should look for ADDSUB or SUBADD. It would be more efficient to run the code once and detect either pattern with a flag to tell which type it found. Differential Revision: https://reviews.llvm.org/D44540 llvm-svn: 327730
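The idea of running the detection once and reporting which pattern was found can be sketched on a per-lane list of operations. The types and the `classify` function below are hypothetical and operate on a toy representation, not on SelectionDAG nodes. The sketch assumes ADDSUB means subtract in even lanes and add in odd lanes, with SUBADD as the opposite.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class LaneOp { Add, Sub };
enum class Pattern { None, AddSub, SubAdd };

// Hypothetical single-pass classifier: lane 0 decides which pattern is a
// candidate, and every other lane must alternate accordingly.
Pattern classify(const std::vector<LaneOp> &Lanes) {
  if (Lanes.size() < 2 || Lanes.size() % 2 != 0)
    return Pattern::None;
  const LaneOp Even = Lanes[0];
  const LaneOp Odd = (Even == LaneOp::Add) ? LaneOp::Sub : LaneOp::Add;
  for (std::size_t I = 0; I < Lanes.size(); ++I) {
    const LaneOp Expected = (I % 2 == 0) ? Even : Odd;
    if (Lanes[I] != Expected)
      return Pattern::None;
  }
  // Subtract in even lanes is ADDSUB; add in even lanes is SUBADD.
  return Even == LaneOp::Sub ? Pattern::AddSub : Pattern::SubAdd;
}
```

Returning an enum replaces the earlier approach of calling the same matcher twice with a boolean selecting the pattern to look for.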
* [CorrelatedValuePropagation] Use ↵Craig Topper2018-03-161-3/+3
| | | | | | SelectInst::getCondition/getTrueValue/getFalseValue instead of getOperand for readability. NFC llvm-svn: 327728
* [AMDGPU] Supported ds_write_b128 generation.Farhana Aleen2018-03-164-6/+16
| | | | | | | | | | | | | | Summary: This is a follow-on patch of https://reviews.llvm.org/D44210 Author: FarhanaAleen Reviewed By: msearles Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44319 llvm-svn: 327726
* [X86] Post process the DAG after isel to remove vector moves that were added ↵Craig Topper2018-03-162-61/+65
| | | | | | | | | | to zero upper bits. We previously avoided inserting these moves during isel in a few cases which is implemented using a whitelist of opcodes. But it's too difficult to generate a perfect list of opcodes to whitelist. Especially with AVX512F without AVX512VL using 512 bit vectors to implement some 128/256 bit operations. Since isel is done bottom up, we'd have to check the VT and opcode and subtarget in order to determine whether an EXTRACT_SUBREG would be generated for some operations. So instead of doing that, this patch adds a post processing step that detects when the moves are unnecessary after isel. At that point any EXTRACT_SUBREGs would have already been created and appear in the DAG. So then we just need to ensure the input to the move isn't one. Differential Revision: https://reviews.llvm.org/D44289 llvm-svn: 327724
* [AMDGPU][MC][GFX8][GFX9][DISASSEMBLER] Added "_e32" suffix to 32-bit VINTRP ↵Dmitry Preobrazhensky2018-03-164-6/+23
| | | | | | | | | | | opcodes See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751 Differential Revision: https://reviews.llvm.org/D44529 Reviewers: artem.tamazov, arsenm llvm-svn: 327723
* [LICM/mustexec] Extend first iteration must execute logic to fcmpsPhilip Reames2018-03-161-10/+9
| | | | | | | | This builds on the work from https://reviews.llvm.org/D44287. It turned out supporting fcmp was much easier than I realized, so let's do that now. As an aside, our -O3 handling of floating point IVs leaves a lot to be desired. We do convert the float IV to an integer IV, but do so late enough that many other optimizations are missed (e.g. we don't vectorize). Differential Revision: https://reviews.llvm.org/D44542 llvm-svn: 327722
* This patch fixes the invalid usage of OptSize in Machine Combiner.Andrew V. Tischenko2018-03-161-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D43813 llvm-svn: 327721
* [AMDGPU][MC] Corrected default values for unused SDWA operandsDmitry Preobrazhensky2018-03-162-10/+10
| | | | | | | | | See bug 36355: https://bugs.llvm.org/show_bug.cgi?id=36355 Differential Revision: https://reviews.llvm.org/D44481 Reviewers: artem.tamazov, arsenm llvm-svn: 327720
* [SystemZ] Make AnyRegBitRegClass unallocatable.Jonas Paulsson2018-03-161-1/+1
| | | | | | | | AnyReg is just for the assembler and it is better to have it as not allocatable in order to simplify (make more intuitive) the RegPressureSets. Review: Ulrich Weigand llvm-svn: 327715
* [JumpThreading] Track unreachable BBs to avoid processingBrian M. Rzycki2018-03-161-47/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | JumpThreading iterates over F until the IR quiesces. Transforming unreachable BBs increases compile time and it is also possible to never stabilize causing JumpThreading to hang. An older attempt at fixing this problem was D3991 where removeUnreachableBlocks(F) was called before JumpThreading began. This has a few drawbacks: * expensive - the routine attempts to fix up the IR to identify additional BBs that can be removed along with unreachable BBs. * aggressive - does not identify and preserve the shape of the IR. At a minimum it does not preserve loop hierarchies. * invasive - altering reachable blocks it may disrupt IR shapes that could have otherwise been JumpThreaded. This patch avoids removeUnreachableBlocks(F) and instead tracks unreachable BBs in a SmallPtrSet using DominatorTree to validate the initial state of all BBs. We then rely on subsequent passes to identify and remove these unreachable blocks from F. Reviewers: dberlin, sebpop, kuhar, dinesh.d Reviewed by: sebpop, kuhar Subscribers: hiraditya, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D44177 llvm-svn: 327713
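The bookkeeping described above (record which blocks are unreachable instead of deleting them) can be sketched on a toy control-flow graph. Note the substitution: the patch validates the initial state with DominatorTree, whereas this sketch uses a plain worklist traversal from the entry block. `findUnreachable` and the adjacency-list representation are hypothetical, and `std::set` stands in for SmallPtrSet.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Blocks are numbered 0..N-1, block 0 is the entry, and Succs[B] lists the
// successors of block B. Returns the blocks that cannot be reached from entry.
std::set<unsigned> findUnreachable(const std::vector<std::vector<unsigned>> &Succs) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<unsigned> Worklist;
  if (!Succs.empty()) {
    Seen[0] = true;
    Worklist.push_back(0);
  }
  while (!Worklist.empty()) {
    const unsigned B = Worklist.back();
    Worklist.pop_back();
    for (unsigned S : Succs[B]) {
      assert(S < Succs.size() && "successor index out of range");
      if (!Seen[S]) {
        Seen[S] = true;
        Worklist.push_back(S);
      }
    }
  }
  std::set<unsigned> Unreachable;
  for (unsigned B = 0; B < Succs.size(); ++B)
    if (!Seen[B])
      Unreachable.insert(B);
  return Unreachable;
}
```

A transformation loop would then skip any block found in the returned set, which avoids repeatedly rewriting code such as an unreachable cycle that might otherwise never stabilize.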
* [Hexagon] Fix zero-extending non-HVX bool vectorsKrzysztof Parzyszek2018-03-162-12/+27
| | | | llvm-svn: 327712
* [ARM] Convert more invalid NEON immediate loadsMikhail Maltsev2018-03-162-106/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently the LLVM MC assembler is able to convert e.g. vmov.i32 d0, #0xabababab (which is technically invalid) into a valid instruction vmov.i8 d0, #0xab. This patch adds support for vmov.i64 and for cases with resulting load types other than i8, e.g.: vmov.i32 d0, #0xab00ab00 -> vmov.i16 d0, #0xab00. Reviewers: olista01, rengolin Reviewed By: rengolin Subscribers: rengolin, javed.absar, kristof.beyls, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D44467 llvm-svn: 327709
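The two conversions quoted in the message both amount to finding the narrowest element whose repetition reproduces the 32-bit immediate. A minimal sketch of that search follows. `narrowestSplat` is a hypothetical name, and the sketch covers only the splat test, not whether the narrowed immediate is encodable or the vmov.i64 case.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <utility>

// Returns {element width in bits, element value} for the narrowest element
// (8, 16, or 32 bits) whose repetition equals the 32-bit immediate.
std::pair<unsigned, uint32_t> narrowestSplat(uint32_t Imm) {
  for (unsigned Width : {8u, 16u}) {
    const uint32_t Mask = (1u << Width) - 1;
    const uint32_t Elt = Imm & Mask;
    // Rebuild a 32-bit value by repeating the candidate element.
    uint32_t Repeated = 0;
    for (unsigned Shift = 0; Shift < 32; Shift += Width)
      Repeated |= Elt << Shift;
    if (Repeated == Imm)
      return {Width, Elt};
  }
  return {32u, Imm};
}
```

With this helper, 0xabababab narrows to an 8-bit element 0xab and 0xab00ab00 narrows to a 16-bit element 0xab00, matching the two examples in the message.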
* [X86][Btver2] Add correct mul/imul schedule costsSimon Pilgrim2018-03-161-1/+14
| | | | | | Integer multiply is performed on the JMul function unit and i64 requires double pumping llvm-svn: 327707
* [X86][Btver2] Add correct lzcnt/tzcnt/popcnt schedule costsSimon Pilgrim2018-03-161-0/+23
| | | | | | Don't use WriteIMul defaults llvm-svn: 327706
* [ARM] Fix a check in vmov/vmvn immediate parsingMikhail Maltsev2018-03-161-20/+13
| | | | | | | | | | | | | | | | | | | | Summary: Currently the check is incorrect and the following invalid instruction is accepted and incorrectly assembled: vmov.i32 d2, #0x00a500a6 This patch fixes the issue. Reviewers: olista01, rengolin Reviewed By: rengolin Subscribers: SjoerdMeijer, javed.absar, rogfer01, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44460 llvm-svn: 327704
* [AArch64] Implement getArithmeticReductionCostMatthew Simpson2018-03-162-0/+31
| | | | | | | | | | This patch provides an implementation of getArithmeticReductionCost for AArch64. We can specialize the cost of add reductions since they are computed using the 'addv' instruction. Differential Revision: https://reviews.llvm.org/D44490 llvm-svn: 327702
* DWARFVerifier: Enhance validation of .debug_names hash tablesPavel Labath2018-03-161-15/+116
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds more checks to the .debug_names validator. Specifically, they check for: - buckets claiming to be non-empty but pointing to mismatched hashes (most consumers would interpret this as an empty bucket, but it is questionable whether the generator meant that) - hashes that are not reachable from any bucket - names with incorrect hashes Together, these checks ensure that any name in the index can be reached through the hash table using the regular lookup algorithm. We also warn if we encounter a name index without a hash table. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44433 llvm-svn: 327699
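The first two checks listed above can be sketched on a toy model of the table. The sketch assumes the usual DWARF v5 layout: each bucket holds a 1-based index into the hash array, 0 marks an empty bucket, and a hash belongs to bucket `hash % bucket_count`. `NameIndex` and `countBucketProblems` are hypothetical names, and the third check (recomputing each name's hash) is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of the hash-table part of a name index.
struct NameIndex {
  std::vector<uint32_t> Buckets; // 1-based index into Hashes, 0 means empty
  std::vector<uint32_t> Hashes;
};

// Counts mismatched buckets plus hashes that no bucket leads to.
unsigned countBucketProblems(const NameIndex &NI) {
  const uint32_t BucketCount = static_cast<uint32_t>(NI.Buckets.size());
  if (BucketCount == 0)
    return 0;
  unsigned Problems = 0;
  std::vector<bool> Reached(NI.Hashes.size(), false);
  for (uint32_t B = 0; B < BucketCount; ++B) {
    const uint32_t Idx = NI.Buckets[B];
    if (Idx == 0)
      continue; // empty bucket
    // A non-empty bucket must point at a hash that maps back to it.
    if (Idx > NI.Hashes.size() || NI.Hashes[Idx - 1] % BucketCount != B) {
      ++Problems;
      continue;
    }
    // Walk the contiguous run of hashes that belong to this bucket.
    for (std::size_t I = Idx - 1;
         I < NI.Hashes.size() && NI.Hashes[I] % BucketCount == B; ++I)
      Reached[I] = true;
  }
  for (bool R : Reached)
    if (!R)
      ++Problems;
  return Problems;
}
```

A table passing both checks has the property the message describes: every hash can be found by the regular bucket lookup.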
* [ARM] FP16 codegen support for VSELSjoerd Meijer2018-03-162-2/+3
| | | | | | | | | This implements lowering of SELECT_CC for f16s, which enables codegen of VSEL with f16 types. Differential Revision: https://reviews.llvm.org/D44518 llvm-svn: 327695
* [NFC] Void variables used for asserts onlyMax Kazantsev2018-03-161-0/+2
| | | | llvm-svn: 327693
* [X86][Btver2] Add support for multiple pipelines stages for x86 scalar ↵Simon Pilgrim2018-03-151-22/+11
| | | | | | | | schedules. NFCI. This allows us to use JWriteResIntPair for complex schedule classes (like WriteIDiv) as well as single pipe instructions. llvm-svn: 327686
* [SelectionDAG][ARM][X86] Teach PromoteIntRes_SETCC to do a better job ↵Craig Topper2018-03-152-24/+15
| | | | | | | | | | | | | | picking the result type for the setcc. Previously if getSetccResultType returned an illegal type we just fell back to using the default promoted type. This appears to have been to handle the case where for vectors getSetccResultType returns the input type, but the input type itself isn't legal and will need to be promoted. Without the legality check we would never reach a legal type. But just picking the promoted type to be the setcc type can create strange setccs where the result type is 128 bits and the operand type is 256 bits, for example if the result type was promoted to v8i16 from v8i1, but the input type was promoted from v8i23 to v8i32. We currently handle this with custom lowering code in X86. This legality check also caused us to reject the getSetccResultType when the input type needed to be widened or split, even though that result wouldn't have caused legalization to get stuck. This patch tries to fix this by detecting when the getSetccResultType result needs to be promoted. If its input type also needs to be promoted we'll try to ask for a new setcc result type based on its eventual promoted value. Otherwise we fall back to the default type to promote to. Any other illegal values we might get back from the initial call to getSetccResultType we just keep and allow to be re-legalized later via splitting or widening or scalarizing. llvm-svn: 327683
* [X86][Btver2] Fix ymm div/sqrt to use fmul unitSimon Pilgrim2018-03-151-12/+12
| | | | | | | | YMM FDiv/FSqrt are dispatched on pipe JFPU1 but should be performed on the JFPM unit - that is where most of the cycles are spent. This matches the pipes for WriteFSqrt/WriteFDiv definitions. llvm-svn: 327682
* Use standard `print(dbgs())` pattern to implement DebugLoc::dumpSean Silva2018-03-151-13/+1
| | | | | | The open-coded implementation had a bug. It didn't print filenames. llvm-svn: 327681
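The pattern named in the subject line, where dump() forwards to print() instead of duplicating its logic, can be sketched as follows. `SourceLoc` is a hypothetical stand-in for DebugLoc, `std::cerr` stands in for dbgs(), and the output format shown is illustrative rather than the format DebugLoc actually prints.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

struct SourceLoc {
  std::string File;
  unsigned Line = 0;
  unsigned Col = 0;

  // The single place that knows how to render a location.
  void print(std::ostream &OS) const {
    OS << File << ':' << Line << ':' << Col;
  }

  // dump() only chooses the stream, so it cannot drift out of sync with
  // print() (for example by forgetting the filename).
  void dump() const {
    print(std::cerr);
    std::cerr << '\n';
  }
};

// Convenience for tests: render through the same print() method.
std::string toString(const SourceLoc &Loc) {
  std::ostringstream OS;
  Loc.print(OS);
  return OS.str();
}
```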
* [PDB] Fix a bug where we were serializing hash tables incorrectly.Zachary Turner2018-03-151-3/+5
| | | | | | | | | | | There was some code that tried to calculate the number of 4-byte words required to hold N bits, but it was instead computing the number of bytes required to hold N bits. This was leading to extraneous data being output into the hash table, which would cause certain operations in DIA (the Microsoft PDB reader) to fail. llvm-svn: 327675
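The difference between the two quantities is easy to see in a small sketch. The helper names `bytesForBits` and `wordsForBits` are hypothetical and are not the names used in the PDB code.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold NumBits bits: round up to a multiple of 8.
constexpr uint32_t bytesForBits(uint32_t NumBits) { return (NumBits + 7) / 8; }

// 4-byte words needed to hold NumBits bits: round up to a multiple of 32.
constexpr uint32_t wordsForBits(uint32_t NumBits) { return (NumBits + 31) / 32; }

// 33 bits need 5 bytes but only 2 words, so emitting the byte count where a
// word count is expected writes extra data.
static_assert(bytesForBits(33) == 5, "33 bits occupy 5 bytes");
static_assert(wordsForBits(33) == 2, "33 bits occupy 2 words");
```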
* [WebAssembly] Add DebugLoc information to WebAssembly block and loop.Derek Schuff2018-03-152-8/+21
| | | | | | | Patch by Yury Delendik Differential Revision: https://reviews.llvm.org/D44448 llvm-svn: 327673
* [NVPTX] TblGen-ized lowering of WMMA intrinsics.Artem Belevich2018-03-155-620/+155
| | | | | | | | NFC. Differential Revision: https://reviews.llvm.org/D43151 llvm-svn: 327672
* [LoopUnroll] Peel off iterations if it makes conditions true/false.Florian Hahn2018-03-152-5/+90
| | | | | | | | | | | | | | | If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 llvm-svn: 327671
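A hand-written before/after pair shows the effect of peeling for a condition of the form IndVar < constant. This is only an illustration at the source level: the function names are hypothetical, the constant is fixed at 2, and the real transformation operates on LLVM IR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop: the condition I < 2 is evaluated on every iteration.
int sumOriginal(const std::vector<int> &V) {
  int Sum = 0;
  for (std::size_t I = 0; I < V.size(); ++I) {
    if (I < 2)
      Sum += 10 * V[I];
    else
      Sum += V[I];
  }
  return Sum;
}

// After peeling two iterations: in each peeled copy the condition is known
// to be true, and in the remaining loop it is known to be false, so the
// check disappears from both.
int sumPeeled(const std::vector<int> &V) {
  int Sum = 0;
  std::size_t I = 0;
  if (I < V.size()) { // peeled iteration 0
    Sum += 10 * V[I];
    ++I;
  }
  if (I < V.size()) { // peeled iteration 1
    Sum += 10 * V[I];
    ++I;
  }
  for (; I < V.size(); ++I)
    Sum += V[I];
  return Sum;
}
```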
* Re-land r327620 "[CodeView] Initial support for emitting S_BLOCK32 symbols ↵Reid Kleckner2018-03-153-7/+179
| | | | | | | | | for lexical scopes" This is safe to land now that we don't copy FunctionInfo when rehashing the DenseMap. llvm-svn: 327670
* [codeview] Fix sense of the assertion about hashtable insertionReid Kleckner2018-03-151-1/+1
| | | | llvm-svn: 327669
* [codeview] Delete FunctionInfo copy ctor and move out of DenseMapReid Kleckner2018-03-152-5/+11
| | | | | | | | | | | | | We were unnecessarily copying a bunch of these FunctionInfo objects around when rehashing the DenseMap. Furthermore, r327620 introduced pointers referring to objects owned by FunctionInfo, and the default copy ctor did the wrong thing in this case, leading to use-after-free when the DenseMap gets rehashed. I will rebase r327620 on this next and recommit it. llvm-svn: 327665
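The ownership change described above can be sketched with standard containers. The assumptions are significant: `std::vector` stands in for DenseMap (both relocate their elements when they grow), this `FunctionInfo` is a two-field toy rather than the CodeView struct, and `pointersSurviveGrowth` is a hypothetical test helper.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct FunctionInfo {
  FunctionInfo() : Current(&Root) {}
  // A default copy would leave Current pointing into the source object,
  // so copying is disabled outright.
  FunctionInfo(const FunctionInfo &) = delete;
  FunctionInfo &operator=(const FunctionInfo &) = delete;

  int Root = 0;
  int *Current; // refers to a member of this same object
};

// The table stores owning pointers, so growth moves only the pointers and
// each FunctionInfo stays at a stable address.
bool pointersSurviveGrowth(unsigned Count) {
  std::vector<std::unique_ptr<FunctionInfo>> Table;
  Table.push_back(std::make_unique<FunctionInfo>());
  FunctionInfo *First = Table.front().get();
  for (unsigned I = 0; I < Count; ++I)
    Table.push_back(std::make_unique<FunctionInfo>());
  return Table.front().get() == First && First->Current == &First->Root;
}
```

Storing the objects by value instead would relocate them on growth and, with a default copy constructor, leave internal pointers dangling, which is the use-after-free the message describes.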
* [LICM] Ignore exits provably not taken on first iteration when computing ↵Philip Reames2018-03-151-1/+60
| | | | | | | | | | | | | | must execute It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patch extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest *is* known to be taken on the first iteration. This case comes up in two major ways: * If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration. * Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated. The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate. Differential Revision: https://reviews.llvm.org/D44287 llvm-svn: 327664
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-151-2/+2
| | | | | | Fix typo. llvm-svn: 327663
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-151-0/+7
| | | | | | Add special case for rotate right. llvm-svn: 327662
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-152-12/+59
| | | | | | Increase the number of cheap as move cases of register reset. llvm-svn: 327661
* [X86] Make sure we use FSUB instruction as the reference for operand order ↵Craig Topper2018-03-151-4/+12
| | | | | | | | in isAddSubOrSubAdd when recognizing subadd The FADD part of the addsub/subadd pattern can have its operands commuted, but when checking for fsubadd we were using the fadd as reference and commuting the fsub node. llvm-svn: 327660
* Remove empty fileDavid Blaikie2018-03-151-13/+0
| | | | | | | I should've deleted this in r320768 but accidentally just deleted its contents instead. llvm-svn: 327658
* Revert r327620 "[CodeView] Initial support for emitting S_BLOCK32 symbols ↵Reid Kleckner2018-03-152-175/+7
| | | | | | | | | | for lexical scopes" It is causing crashes when compiling Chrome in debug mode. I'll try to debug it in a second. llvm-svn: 327657