summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [SDA] Bug fix: Use IPD outside the loop as divergence boundNicolai Haehnle2019-04-181-9/+19
| | | | | | | | | | | | | | | | | | Summary: The immediate post dominator of the loop header may be part of the divergent loop. Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: mmasten, arsenm, jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59042 llvm-svn: 358681
* Fix a bug in SCEV's isSafeToExpand around speculation safetyPhilip Reames2019-04-181-1/+19
| | | | | | | | | | | | | | | | | isSafeToExpand was making a common, but dangerously wrong, mistake in assuming that if any instruction within a basic block executes, that all instructions within that block must execute. This can be trivially shown to be false by considering the following small example: bb: add x, y <-- InsertionPoint call @throws() udiv x, y <-- SCEV* S br ... It's clearly not legal to expand S above the throwing call, but the previous logic would do so since S dominates (but not properlyDominates) the block containing the InsertionPoint. Since iterating instructions w/in a block is expensive, this change special cases two cases: 1) S is an operand of InsertionPoint, and 2) InsertionPoint is the terminator of it's block. These two together are enough to keep all current optimizations triggering while fixing the latent correctness issue. As best I can tell, this is a silent bug in current ToT. Given that, there's no tests with this change. It was noticed in an upcoming optimization change (D60093), and was reviewed as part of that. That change will include the test which caused me to notice the issue. I'm submitting this seperately so that anyone bisecting a problem gets a clear explanation. llvm-svn: 358680
* MinidumpYAML: Fix ambiguity between std::make_unique and llvm::make_uniqueBenjamin Kramer2019-04-181-1/+1
| | | | llvm-svn: 358673
* MinidumpYAML: Add support for ModuleList streamPavel Labath2019-04-181-12/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds support for yaml (de)serialization of the minidump ModuleList stream. It's a fairly straight forward-application of the existing patterns to the ModuleList structures defined in previous patches. One thing, which may be interesting to call out explicitly is the addition of "new" allocation functions to the helper BlobAllocator class. The reason for this was, that there was an emerging pattern of a need to allocate space for entities, which do not have a suitable lifetime for use with the existing allocation functions. A typical example of that was the "size" of various lists, which is only available as a temporary returned by the .size() method of some container. For these cases, one can use the new set of allocation functions, which will take a temporary object, and store it in an allocator-managed buffer until it is written to disk. Reviewers: amccarth, jhenderson, clayborg, zturner Subscribers: lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60405 llvm-svn: 358672
* [X86][SSE] Lower ICMP EQ(AND(X,C),C) -> SRA(SHL(X,LOG2(C)),BW-1) iff C is ↵Simon Pilgrim2019-04-181-37/+21
| | | | | | | | | | | | | | power-of-2. This replaces the MOVMSK combine introduced at D52121/rL342326 (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) with the more general icmp lowering so it can pick up more cases through bitcasts - notably vXi8 cases which use vXi16 shifts+masks, this patch can remove the mask and use pcmpgtb(0,x) for the sra. Differential Revision: https://reviews.llvm.org/D60625 llvm-svn: 358651
* [NewPM] Add Option handling for LoopVectorizeSerguei Katkov2019-04-182-1/+26
| | | | | | | | | | | This patch enables passing options to LoopVectorizePass via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60681 llvm-svn: 358647
* [PowerPC] Fix wrong ElemSIze when calling isConsecutiveLS()Kang Zhang2019-04-181-1/+1
| | | | | | | | | | | | | | | | | | | | Summary: This issue from the bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177 When the two operands for BUILD_VECTOR are same, we will get assert error. llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&): Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) && "The loads cannot be both consecutive and reverse consecutive."' failed. This error caused by the wrong ElemSIze when calling isConsecutiveLS(). We should use `getScalarType().getStoreSize();` to get the ElemSize instread of `getScalarSizeInBits() / 8`. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60811 llvm-svn: 358644
* Elaborate why we have an option on by default for enabling chr.Eric Christopher2019-04-182-0/+4
| | | | llvm-svn: 358641
* [AMDGPU] Avoid DAG combining assert with fneg(fadd(A,0))Tim Renouf2019-04-181-0/+10
| | | | | | | | | | | fneg combining attempts to turn it into fadd(fneg(A), fneg(0)), but creating the new fadd folds to just fneg(A). When A has multiple uses, this confuses it and you get an assert. Fixed. Differential Revision: https://reviews.llvm.org/D60633 Change-Id: I0ddc9b7286abe78edc0cd8d734fdeb05ff09821c llvm-svn: 358640
* Fix a typo in comments. [NFC]Ali Tamur2019-04-181-1/+1
| | | | llvm-svn: 358639
* [GISel]:IRTranslator: Prefer a buidInstr form that allows CSE of cast ↵Aditya Nandakumar2019-04-181-1/+1
| | | | | | | | | | instructions https://reviews.llvm.org/D60844 Use the style of buildInstr that allows CSEing. llvm-svn: 358637
* Fix bad compare function over FusionCandidate.Richard Trieu2019-04-181-6/+8
| | | | | | | Reverse the checking of the domiance order so that when a self compare happens, it returns false. This makes compare function have strict weak ordering. llvm-svn: 358636
* Revert Implement sys::fs::copy_file using the macOS copyfile(3) API to ↵Adrian Prantl2019-04-182-53/+0
| | | | | | | | | support APFS clones. This reverts r358628 (git commit 91a06bee788262a294527b815354f380d99dfa9b) while investigating a crash reproducer bot failure. llvm-svn: 358634
* Implement sys::fs::copy_file using the macOS copyfile(3) APIAdrian Prantl2019-04-182-0/+53
| | | | | | | | | | | | | | | to support APFS clones. This patch adds a Darwin-specific implementation of llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to support APFS copy-on-write clones, which should be faster and much more space efficient. https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/ToolsandAPIs/ToolsandAPIs.html Differential Revision: https://reviews.llvm.org/D60802 llvm-svn: 358628
* Fix formatting. NFCAkira Hatanaka2019-04-171-90/+88
| | | | llvm-svn: 358623
* [x86] try to widen 'shl' as part of LEA formationSanjay Patel2019-04-171-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | The test file has pairs of tests that are logically equivalent: https://rise4fun.com/Alive/2zQ %t4 = and i8 %t1, 8 %t5 = zext i8 %t4 to i16 %sh = shl i16 %t5, 2 %t6 = add i16 %sh, %t0 => %t4 = and i8 %t1, 8 %sh2 = shl i8 %t4, 2 %z5 = zext i8 %sh2 to i16 %t6 = add i16 %z5, %t0 ...so if we can fold the shift op into LEA in the 1st pattern, then we should be able to do the same in the 2nd pattern (unnecessary 'movzbl' is a separate bug I think). We don't want to do this any sooner though because that would conflict with generic transforms that try to narrow the width of the shift. Differential Revision: https://reviews.llvm.org/D60789 llvm-svn: 358622
* Test commit by Denis BakhvalovDenis Bakhvalov2019-04-171-1/+1
| | | | | Change-Id: I4d85123a157d957434902fb14ba50926b2d56212 llvm-svn: 358619
* [AsmPrinter] hoist %a output template to base class for ARM+Aarch64Nick Desaulniers2019-04-173-18/+11
| | | | | | | | | | | | | | | | | | | | | Summary: X86 is quite complicated; so I intend to leave it as is. ARM+Aarch64 do basically the same thing (Aarch64 did not correctly handle immediates, ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses %a with an immediate) for a flag that should be target independent anyways. Reviewers: echristo, peter.smith Reviewed By: echristo Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60841 llvm-svn: 358618
* Add a getSizeInBits() accessor to MachineMemOperand. NFC.Amara Emerson2019-04-172-8/+8
| | | | | | | | Cleans up a bunch of places where we do getSize() * 8. Differential Revision: https://reviews.llvm.org/D60799 llvm-svn: 358617
* [GlobalISel] Add legalization support for non-power-2 loads and storesAmara Emerson2019-04-172-11/+99
| | | | | | | | | | Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613
* Add basic loop fusion pass.Kit Barton2019-04-175-0/+1219
| | | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607
* [AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFCNick Desaulniers2019-04-174-12/+7
| | | | | | | | | | | | | | | | | | | Summary: None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine. Reviewers: peter.smith, echristo Reviewed By: echristo Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60803 llvm-svn: 358603
* [ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbolsSteven Wu2019-04-172-51/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool. ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. Now ThinLTO using the legacy LTO API will requires data layout in Module. "internalize" thinlto action in llvm-lto is updated to run both "promote" and "internalize" with the same configuration as ThinLTOCodeGenerator. The old "promote" + "internalize" option does not produce the same output as ThinLTOCodeGenerator. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, kromanova, dexonsmith Reviewed By: tejohnson Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60421 llvm-svn: 358601
* [InstCombine] Factor out unreachable inst idiom creation [NFC]Philip Reames2019-04-173-13/+15
| | | | | | | | In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later. This just factors out that idiom creation so we don't duplicate the same mostly undocument idiom creation in multiple places. llvm-svn: 358600
* [LVI][CVP] Constrain values in with.overflow branchesNikita Popov2019-04-171-0/+27
| | | | | | | | | | | | If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1) then we can constrain the value of %x inside the branch based on makeGuaranteedNoWrapRegion(). We do this by extending the edge-value handling in LVI. This allows CVP to then fold comparisons against %x, as illustrated in the tests. Differential Revision: https://reviews.llvm.org/D60650 llvm-svn: 358597
* [AMDGPU][MC] Corrected handling of "-" before expressionsDmitry Preobrazhensky2019-04-171-38/+58
| | | | | | | | | | See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60622 llvm-svn: 358596
* AMDGPU: Force skip over SMRD, VMEM and s_waitcnt instructionsRhys Perry2019-04-171-0/+4
| | | | | | | | | | | | | | | | Summary: This fixes a large Dawn of War 3 performance regression with RADV from Mesa 19.0 to master which was caused by creating less code in some branches. Reviewers: arsen, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60824 llvm-svn: 358592
* [LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.Florian Hahn2019-04-171-2/+13
| | | | | | | | | | | | | | | | | | | | Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586
* [DAGCombine] Add SimplifyDemandedBits helper that handles demanded elts mask ↵Simon Pilgrim2019-04-171-4/+13
| | | | | | | | as well The other SimplifyDemandedBits helpers become wrappers to this new demanded elts variant. llvm-svn: 358585
* [Support] Add LEB128 support to BinaryStreamReader/Writer.Lang Hames2019-04-172-1/+45
| | | | | | | | | | | | | | | | | | | Summary: This patch adds support for ULEB128 and SLEB128 encoding and decoding to BinaryStreamWriter and BinaryStreamReader respectively. Support for ULEB128/SLEB128 will be used for eh-frame parsing in the JITLink library currently under development (see https://reviews.llvm.org/D58704). Reviewers: zturner, dblaikie Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60810 llvm-svn: 358584
* [ScheduleDAGRRList] Recompute topological ordering on demand.Florian Hahn2019-04-172-24/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently there is a single point in ScheduleDAGRRList, where we actually query the topological order (besides init code). Currently we are recomputing the order after adding a node (which does not have predecessors) and then we add predecessors edge-by-edge. We can avoid adding edges one-by-one after we added a new node. In that case, we can just rebuild the order from scratch after adding the edges to the DAG and avoid all the updates to the ordering. Also, we can delay updating the DAG until we query the DAG, if we keep a list of added edges. Depending on the number of updates, we can either apply them when needed or recompute the order from scratch. This brings down the geomean compile time for of CTMark with -O1 down 0.3% on X86, with no regressions. Reviewers: MatzeB, atrick, efriedma, niravd, paquette Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60125 llvm-svn: 358583
* [AMDGPU][MC] Corrected parsing of registersDmitry Preobrazhensky2019-04-171-27/+126
| | | | | | | | | | See bug 41280: https://bugs.llvm.org/show_bug.cgi?id=41280 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60621 llvm-svn: 358581
* [AMDGPU] Flag new raw/struct atomic ops as source of divergenceTim Renouf2019-04-171-0/+22
| | | | | | | Differential Revision: https://reviews.llvm.org/D60731 Change-Id: I821d93dec8b9cdd247b8172d92fb5e15340a9e7d llvm-svn: 358579
* [LLVM-C] Add DIFile Field AccesssorsRobert Widmann2019-04-171-0/+25
| | | | | | | | | | | | | | | | | | | Summary: Add accessors for the file, directory, source file name (curiously, an `Optional` value?), of a DIFile. This is intended to replace the LLVMValueRef-based accessors used in D52239 Reviewers: whitequark, jberdine, deadalnix Reviewed By: whitequark, jberdine Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60489 llvm-svn: 358577
* [CostModel][X86] Add bool anyof/allof reduction costsSimon Pilgrim2019-04-171-0/+42
| | | | | | | | On pre-AVX512 targets we can use MOVMSK to extract reduced boolean results. This is properly optimized, annoyingly AVX512 isn't and produces code that is almost as bad as the (unchanged) costs suggest...... Differential Revision: https://reviews.llvm.org/D60403 llvm-svn: 358574
* [DWARF] llvm::Error -> Error. NFCFangrui Song2019-04-171-4/+5
| | | | | | The unqualified name is more common and is used in the file as well. llvm-svn: 358567
* Change some llvm::{lower,upper}_bound to llvm::bsearch. NFCFangrui Song2019-04-174-17/+10
| | | | llvm-svn: 358564
* [CVP] processOverflowIntrinsic(): don't crash if constant-holding happenedRoman Lebedev2019-04-171-4/+7
| | | | | | | As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559
* [DWARF] Pass ReferenceToDIEOffsets elements by referenceFangrui Song2019-04-171-3/+3
| | | | llvm-svn: 358558
* [X86] In CopyToFromAsymmetricReg, use VR128 instead of FR32 instructions for ↵Craig Topper2019-04-171-12/+12
| | | | | | | | | | | | | GR32<->XMM register copies. We have two versions of some instructions, VR128 versions and FR32 versions that are marked as CodeGenOnly. This change switches to using the VR128 versions for these copies. It's after register allocation so the class size no longer matters. This matches how GR64 works. llvm-svn: 358555
* Revert "Add basic loop fusion pass." Per request.Eric Christopher2019-04-175-1216/+0
| | | | | | This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-175-0/+1216
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Remove the run-slp-after-loop-vectorization option.Eric Christopher2019-04-171-12/+3
| | | | | | | It's been on by default for 4 years and cleans up the pass hierarchy. llvm-svn: 358548
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-175-1216/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Add basic loop fusion pass.Kit Barton2019-04-175-0/+1216
| | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543
* [LLVM-C] Add Accessors For Global Variable Metadata PropertiesRobert Widmann2019-04-161-0/+25
| | | | | | | | | | | | | | | | Summary: Metadata for a global variable is really a (GlobalVariable, Expression) tuple. Allow access to these, then allow retrieving the file, scope, and line for a DIVariable, whether global or local. This should be the last of the accessors required for uniform access to location and file information metadata. Reviewers: jberdine, whitequark, deadalnix Reviewed By: jberdine, whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60725 llvm-svn: 358532
* Fix a typo in comments. [NFC]Ali Tamur2019-04-161-1/+1
| | | | llvm-svn: 358531
* [NVPTXAsmPrinter] clean up dead code. NFCNick Desaulniers2019-04-162-45/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The printOperand function takes a default parameter, for which there are zero call sites that explicitly pass such a parameter. As such, there is no case to support. This means that the method printVecModifiedImmediate is purly dead code, and can be removed. The eventual goal for some of these AsmPrinter refactoring is to have printOperand be a virtual method; making it easier to print operands from the base class for more generic Asm printing. It will help if all printOperand methods have the same function signature (ie. no Modifier argument when not needed). Reviewers: echristo, tra Reviewed By: echristo Subscribers: jholewinski, hiraditya, llvm-commits, craig.topper, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60727 llvm-svn: 358527
* [TargetLowering] Rename preferShiftsToClearExtremeBits and ↵Simon Pilgrim2019-04-167-13/+12
| | | | | | | | | | | | shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair llvm-svn: 358526
* [EarlyCSE] detect equivalence of selects with inverse conditions and ↵Sanjay Patel2019-04-161-2/+59
| | | | | | | | | | | | | | | | | | commuted operands (PR41101) This is 1 of the problems discussed in the post-commit thread for: rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html and filed as: https://bugs.llvm.org/show_bug.cgi?id=41101 Instcombine tries to canonicalize some of these cases (and there's room for improvement there independently of this patch), but it can't always do that because of extra uses. So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar to how we detect commuted compares and commuted min/max/abs. Differential Revision: https://reviews.llvm.org/D60723 llvm-svn: 358523
OpenPOWER on IntegriCloud