path: root/llvm
Commit message | Author | Age | Files | Lines
* [ARM] Replace some C++ selection code with TableGen patterns. NFC. | Eli Friedman | 2017-03-14 | 5 | -64/+33
  Differential Revision: https://reviews.llvm.org/D30794
  llvm-svn: 297768
* [Support] Make the SystemZ bot happy by using make_error_code. | Juergen Ributzka | 2017-03-14 | 1 | -1/+2
  This should fix the last issue on the SystemZ bot related to the broken symlink test.
  llvm-svn: 297767
* [DAG] vector div/rem with any zero element in divisor is undef | Sanjay Patel | 2017-03-14 | 5 | -33/+35
  This is the backend counterpart to:
  https://reviews.llvm.org/rL297390
  https://reviews.llvm.org/rL297409
  and follow-up to:
  https://reviews.llvm.org/rL297384
  It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those someday.
  Differential Revision: https://reviews.llvm.org/D30826
  llvm-svn: 297762
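The rule named in the [DAG] commit title above can be illustrated with a small standalone sketch. This is not LLVM's FoldConstantArithmetic or FoldConstantVectorArithmetic; `foldConstantVectorUDiv` is a hypothetical helper, unsigned division is chosen only to keep the example simple, and `std::nullopt` (C++17) stands in for an undef result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not an LLVM API): folds an element-wise unsigned
// division of two constant vectors. std::nullopt models "undef".
std::optional<std::vector<uint32_t>>
foldConstantVectorUDiv(const std::vector<uint32_t> &lhs,
                       const std::vector<uint32_t> &rhs) {
  assert(lhs.size() == rhs.size() && "operands must have equal length");

  // A single zero element in the divisor makes the whole result undef.
  for (uint32_t d : rhs)
    if (d == 0)
      return std::nullopt;

  // Otherwise fold each lane independently.
  std::vector<uint32_t> out;
  out.reserve(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    out.push_back(lhs[i] / rhs[i]);
  return out;
}
```

The zero check runs over the whole divisor before any lane is folded, so a zero in any position yields the undef marker rather than a partially folded vector.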
* SamplePGO ThinLTO ICP fix for local functions. | Dehao Chen | 2017-03-14 | 9 | -6/+158
  Summary:
  In SamplePGO, if the profile is collected from a non-LTO binary and used to drive ThinLTO, indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places where the mismatch can happen:
  1. thin-link prepends SourceFileName to the front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme, and therefore the indirect call target profile data contains a hash of the OriginalName.
  2. The backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName.
  This patch makes a best-effort attempt to find the GUID from the original local function name (in the profile), and uses that in ICP promotion and in the SamplePGO matching that happens in the backend after importing/inlining:
  1. In thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in the indirect call target profile (represented by OriginalName), it knows which GUID to import.
  2. In the backend compiler, if the sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find a match for OriginalFunctionName.
  3. In the backend compiler, we build a symbol table entry for OriginalFunctionName that points to the same symbol as PromotedFunctionName, so that ICP can find the correct target to promote.
  Reviewers: mehdi_amini, tejohnson
  Reviewed By: tejohnson
  Subscribers: llvm-commits, Prazek
  Differential Revision: https://reviews.llvm.org/D30754
  llvm-svn: 297757
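The fallback lookup described in the SamplePGO commit above (try the promoted name first, then the original name) can be sketched as follows. This is an illustration only: `stripPromotionSuffix` and `findProfile` are hypothetical names, the profile is modelled as a plain map from function name to a sample count, and the only detail taken from the commit message is that promoted names end in `.llvm.` followed by a module hash.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical helper: recover the original local name from a promoted name
// of the form "<name>.llvm.<hash>". Names without the marker pass through.
std::string stripPromotionSuffix(const std::string &name) {
  const std::string marker = ".llvm.";
  const std::string::size_type pos = name.rfind(marker);
  return pos == std::string::npos ? name : name.substr(0, pos);
}

// Hypothetical lookup: try the name as given, then fall back to the
// original (unpromoted) name. Returns nullptr when neither is present.
const uint64_t *
findProfile(const std::unordered_map<std::string, uint64_t> &profile,
            const std::string &name) {
  auto it = profile.find(name);
  if (it == profile.end())
    it = profile.find(stripPromotionSuffix(name));
  return it == profile.end() ? nullptr : &it->second;
}
```

The sketch uses `rfind` so that only the last `.llvm.` marker is treated as the promotion suffix; whether the real implementation makes the same choice is not stated in the commit message.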
* [InstCombine] improve readability; NFCI | Sanjay Patel | 2017-03-14 | 1 | -29/+23
  llvm-svn: 297755
* [InstCombine] consolidate rem tests and update checks; NFC | Sanjay Patel | 2017-03-14 | 5 | -138/+143
  llvm-svn: 297747
* [InstCombine] regenerate checks; NFC | Sanjay Patel | 2017-03-14 | 1 | -177/+225
  llvm-svn: 297746
* [Hexagon] Fix a condition in HexagonEarlyIfConv.cpp | Krzysztof Parzyszek | 2017-03-14 | 1 | -1/+1
  This fixes llvm.org/PR32265.
  llvm-svn: 297745
* Fix typo in comment | Artyom Skrobov | 2017-03-14 | 1 | -1/+1
  llvm-svn: 297742
* [X86] Add extra BITREVERSE tests | Simon Pilgrim | 2017-03-14 | 1 | -131/+442
  Test on 32-bit and 64-bit targets.
  Add bitreverse tests for i64, i32 and i16.
  llvm-svn: 297741
* [LV] Refactor cross-iteration phi's back-patching; NFC | Gil Rapaport | 2017-03-14 | 1 | -232/+244
  This patch refactors the PHIsToFix loop as follows:
  - The loop itself now resides in its own method.
  - The new method iterates on the scalar loop's header; the PHIsToFix map, formerly propagated as an output parameter and filled during phi widening, is removed.
  - The code handling reductions is moved into its own method, similar to the existing fixFirstOrderRecurrence().
  Differential Revision: https://reviews.llvm.org/D30755
  llvm-svn: 297740
* [ARM] Diagnose ARM MOVT without :lower16: or :upper16: expression | Oliver Stannard | 2017-03-14 | 2 | -0/+4
  This instruction was missing from the list of opcodes that we check, so we were hitting an llvm_unreachable in ARMMCCodeEmitter.cpp for the ARM MOVT instruction, rather than the diagnostic that is emitted for the other MOVW/MOVT instructions.
  Differential revision: https://reviews.llvm.org/D30936
  llvm-svn: 297739
* De-duplicate the two implementations of ARMBaseInstrInfo::isProfitableToIfCvt() [NFC] | Artyom Skrobov | 2017-03-14 | 1 | -13/+5
  Reviewers: congh, rengolin
  Subscribers: aemerson, llvm-commits
  Differential Revision: https://reviews.llvm.org/D30934
  llvm-svn: 297738
* [LV] Refactor Cost Model's selectVectorizationFactor(); NFC | Ayal Zaks | 2017-03-14 | 1 | -73/+132
  Refactoring Cost Model's selectVectorizationFactor() so that it handles only the selection of the best VF from a pre-computed range of candidate VF's, extracting early-exit criteria and the computation of a MaxVF upper-bound to other methods, all driven by a newly introduced LoopVectorizationPlanner.
  Differential Revision: https://reviews.llvm.org/D30653
  llvm-svn: 297737
* [X86][MMX] Update FIXME comment. NFCI. | Simon Pilgrim | 2017-03-14 | 1 | -1/+1
  llvm-svn: 297736
* Make PredIteratorCache size() logically const. Do not require copying predecessors to get size. | Daniel Berlin | 2017-03-14 | 1 | -6/+8
  Summary:
  Every single benchmark I can run, on large and small CFGs, fully connected, etc., across 3 different platforms (x86, ARM, and PPC) says that the current pred iterator cache is a losing proposition. I can't find a case where it's faster than just walking preds, and in some cases, it's 5-10% slower. This is due to copying the preds. It also degrades into copying the entire CFG.
  The one operation that is occasionally faster is the cached size. This makes that operation faster by not relying on having the copies available. I'm not even sure that is faster enough to be worth it. I, again, have trouble finding cases where this takes long enough in a pass to be worth caching compared to a million other things they could cache or improve.
  My suggestion: We next remove the get() interface. We do stronger benchmarking of size(). We probably end up killing this entire cache.
  Reviewers: chandlerc
  Subscribers: aemerson, llvm-commits, trentxintong
  Differential Revision: https://reviews.llvm.org/D30873
  llvm-svn: 297733
* Test commit. | James Henderson | 2017-03-14 | 1 | -1/+1
  llvm-svn: 297731
* [CodeGen] Fix -Wreorder warning. | Benjamin Kramer | 2017-03-14 | 1 | -3/+3
  llvm-svn: 297729
* Fix typos in ADCE comments | Tobias Grosser | 2017-03-14 | 1 | -7/+7
  llvm-svn: 297726
* [ValueTracking] Out of range shifts might be undef | Oliver Stannard | 2017-03-14 | 2 | -0/+38
  If it is possible for the RHS of a shift operation to be greater than or equal to the bit-width, then the result might be undef, and we can't report any known bits.
  In some cases, this was allowing a transformation in instcombine which widened an undef value from i1 to i32, increasing the range of values that a function could return.
  Differential revision: https://reviews.llvm.org/D30781
  llvm-svn: 297724
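The [ValueTracking] rule above can be shown with a simplified model. The sketch below is not LLVM's known-bits code: the `KnownBits` struct, the fixed 32-bit width, and the description of the shift amount as a `[minShift, maxShift]` range are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a known-bits record: a set bit in `zero` means the
// bit is known to be 0, a set bit in `one` means it is known to be 1.
struct KnownBits {
  uint32_t zero;
  uint32_t one;
};

// Hypothetical helper: known bits of (lhs << amt) for a 32-bit value, where
// amt is only known to lie in [minShift, maxShift].
KnownBits knownBitsForShl(KnownBits lhs, uint32_t minShift, uint32_t maxShift) {
  constexpr uint32_t width = 32;
  assert(minShift <= maxShift && "invalid shift range");

  // The amount may reach the bit width: the result might be undef, so no
  // bits can be reported as known.
  if (maxShift >= width)
    return KnownBits{0, 0};

  // The low minShift bits are shifted-in zeros in every case.
  const uint32_t lowZeros = (1u << minShift) - 1;

  // Variable but in-range amount: only the shifted-in zeros are known.
  if (minShift != maxShift)
    return KnownBits{lowZeros, 0};

  // Exact amount: shift the known masks along with the value.
  return KnownBits{(lhs.zero << minShift) | lowZeros, lhs.one << minShift};
}
```

The first early return is the case the commit message describes; the remaining branches are only there to make the contrast with in-range shifts visible.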
* [ARM] Move SMULW[B|T] isel to DAG Combine | Sam Parker | 2017-03-14 | 8 | -150/+180
  Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandedBits. Added a couple of extra tests.
  Differential Revision: https://reviews.llvm.org/D30708
  llvm-svn: 297716
* Disable Callee Saved Registers | Oren Ben Simhon | 2017-03-14 | 19 | -89/+258
  Each calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
  Some CCs use an additional condition: if the register is used for passing/returning arguments, the caller needs to save it, even if it is part of the Callee Saved Registers (CSR) list.
  The current LLVM implementation doesn't support this. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee.
  The solution is to dynamically allocate the CSR lists (only for these CCs). The lists will be updated with the actual registers that should be saved by the callee.
  Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI), which is a property of the Machine Function and managed by it (and has the same life span).
  The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments.
  The patch will also help implement the future no_caller_saved_registers attribute intended for the interrupt handler CC.
  Differential Revision: https://reviews.llvm.org/D28566
  llvm-svn: 297715
* [AVX-512] Use iPTR instead of i64 in patterns for extract_subvector/insert_subvector index. | Craig Topper | 2017-03-14 | 2 | -33/+12
  llvm-svn: 297707
* [AVX-512] Add test cases that demonstrate some patterns that don't work correctly in 32-bit mode. NFC | Craig Topper | 2017-03-14 | 1 | -51/+159
  llvm-svn: 297706
* [TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved | Jonas Paulsson | 2017-03-14 | 13 | -89/+145
  getIntrinsicInstrCost() used to only compute scalarization cost based on types. This patch improves this so that the actual arguments are checked when they are available, in order to handle only unique non-constant operands.
  Test updates:
  Analysis/CostModel/X86/arith-fp.ll
  Transforms/LoopVectorize/AArch64/interleaved_cost.ll
  Transforms/LoopVectorize/ARM/interleaved_cost.ll
  The improvement in getOperandsScalarizationOverhead() to differentiate on constants made it necessary to update the interleaved_cost.ll tests even though they do not relate to intrinsics.
  Review: Hal Finkel
  https://reviews.llvm.org/D29540
  llvm-svn: 297705
* [AVX-512] Pre-emptively fix more places in fastisel where we might copy a VK1 register into a AH/BH/CH/DH register. | Craig Topper | 2017-03-14 | 1 | -9/+28
  llvm-svn: 297704
* Add missing condprop-xfail.ll that contains the remaining xfail'd tests | Daniel Berlin | 2017-03-14 | 1 | -0/+123
  llvm-svn: 297699
* Recommitting Craig Topper's patch now that r296476 has been recommitted. | Nirav Dave | 2017-03-14 | 2 | -255/+83
  When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results, not just the result that was used to reach the chain node.
  This recovers a test case that was severely broken by r296476, by making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.
  llvm-svn: 297698
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. | Nirav Dave | 2017-03-14 | 75 | -2553/+2542
  Recommitting with compile time improvements.
  Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.
  * Simplify Consecutive Merge Store Candidate Search
  Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as it separates non-interfering loads/stores from the store-merging logic.
  When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited.
  This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear to require more expensive constant generation).
  Some minor peephole optimizations to deal with improved SubDAG shapes (listed below).
  Additional Minor Changes:
  1. Finishes removing unused AliasLoad code.
  2. Unifies the chain aggregation in the merged stores across code paths.
  3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
  4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests.
  5. Remove Chain dependencies of Memory operations on CopyFromReg nodes as these are captured by data dependence.
  6. Forward load-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values.
  7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll).
  8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added.
  This finishes the change Matt Arsenault started in r246307 and jyknight's original patch.
  Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting:
  CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well.
  Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
  llvm-svn: 297695
* [libFuzzer] Reorder includes in test | Vitaly Buka | 2017-03-13 | 1 | -2/+2
  llvm-svn: 297692
* [libFuzzer] Fix compilation of CustomCrossOverAndMutateTest on Windows | Vitaly Buka | 2017-03-13 | 1 | -1/+2
  llvm-svn: 297690
* Add the beginning of PDB diffing support. | Zachary Turner | 2017-03-13 | 9 | -131/+526
  For now this only diffs the stream directory and the MSF Superblock. Future patches will drill down into individual streams to find out where the differences lie.
  Differential Revision: https://reviews.llvm.org/D30908
  llvm-svn: 297689
* Revert "Debug Info: Add basic support for external types references." | Adrian Prantl | 2017-03-13 | 8 | -89/+3
  This reverts commit r242302. External type refs of this form were never used by any LLVM frontend so this is effectively dead code. (They were introduced to support clang module debug info, but in the end we came up with a better design that doesn't use this feature at all.)
  rdar://problem/25897929
  Differential Revision: https://reviews.llvm.org/D30917
  llvm-svn: 297684
* NewGVN: We pass rle-nonlocal, we just perform the replacement in a way that keeps the old name instead of the new one | Daniel Berlin | 2017-03-13 | 1 | -8/+22
  llvm-svn: 297683
* [Thumb1] combine ADDC/SUBC with a negative immediate | Artyom Skrobov | 2017-03-13 | 3 | -14/+26
  Summary:
  This simple optimization has been split out of https://reviews.llvm.org/D30400
  Reviewers: efriedma, jmolloy
  Subscribers: llvm-commits, rengolin
  Differential Revision: https://reviews.llvm.org/D30829
  llvm-svn: 297682
* Make FileOutputBuffer fail early if you pass a directory. | Rui Ueyama | 2017-03-13 | 1 | -0/+2
  Previously, it created a temporary directory and then failed when FileOutputBuffer tried to rename that file to the destination file (which is actually a directory name).
  Differential Revision: https://reviews.llvm.org/D30912
  llvm-svn: 297679
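The "fail early" idea in the FileOutputBuffer commit above amounts to checking the destination before any temporary file is created. The sketch below is not the FileOutputBuffer implementation: `checkOutputPath` is a hypothetical name, and the example assumes a POSIX system because it calls `stat` directly.

```cpp
#include <string>
#include <sys/stat.h>
#include <system_error>

// Hypothetical pre-flight check (POSIX only): report an error up front when
// the requested output path names an existing directory.
std::error_code checkOutputPath(const std::string &path) {
  struct stat st;
  if (::stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode))
    return std::make_error_code(std::errc::is_a_directory);
  // A missing path or a regular file is fine for this check.
  return std::error_code();
}
```

A caller would run this check first and only then create and later rename the temporary output file, so the failure surfaces before any work is done.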
* [AVX-512] Fix another case where we are copying from a mask register using AH/BH/CH/DH with fastisel. | Craig Topper | 2017-03-13 | 2 | -1/+69
  Fixes PR32256. Still planning to do an audit for other possible cases.
  llvm-svn: 297678
* Fix llvm-symbolizer to navigate both DW_AT_abstract_origin and DW_AT_specification in a single chain | David Blaikie | 2017-03-13 | 3 | -42/+28
  In a recent refactoring (r291959) this regressed to only following one or the other, not both, in a single chain.
  llvm-svn: 297676
* Remove unused lambda capture | David Blaikie | 2017-03-13 | 1 | -1/+1
  llvm-svn: 297675
* Fix sign compare warning in unit test by using an explicit unsigned literal suffix | David Blaikie | 2017-03-13 | 1 | -1/+1
  llvm-svn: 297674
* [IPRA] Change algorithm for RegUsageInfoCollector. | Marcello Maggioni | 2017-03-13 | 1 | -3/+21
  The previous algorithm for RegUsageInfoCollector had pretty bad performance on architectures with many registers that alias one another heavily, because we potentially iterate, for every register, over all the aliasing registers. This costs even more if the function is small and doesn't define a lot of registers.
  This patch changes the algorithm to one that, while iterating over all the registers, iterates over the aliasing registers only if the register itself is defined. This should be faster based on the assumption that only a subset of the whole LLVM register set is actually defined in the function.
  Differential Revision: https://reviews.llvm.org/D30880
  llvm-svn: 297673
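The loop structure described in the [IPRA] commit above can be sketched with plain containers. This is a toy model, not RegUsageInfoCollector: registers are small integers, `aliases[r]` is an assumed precomputed list of the registers that alias `r`, and `collectClobbered` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: mark every register that is defined in the function, plus every
// register that aliases a defined one.
std::vector<bool>
collectClobbered(const std::vector<std::vector<unsigned>> &aliases,
                 const std::vector<bool> &defined) {
  assert(aliases.size() == defined.size() && "one alias list per register");
  std::vector<bool> clobbered(defined.size(), false);

  for (std::size_t reg = 0; reg < defined.size(); ++reg) {
    // Skip the alias walk entirely for registers the function never defines.
    if (!defined[reg])
      continue;
    clobbered[reg] = true;
    for (unsigned alias : aliases[reg])
      clobbered[alias] = true;
  }
  return clobbered;
}
```

With this shape the inner alias walk runs once per defined register rather than once per register, which is where the expected saving comes from when only a few registers are defined.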
* [Support] Follow-up for "Test directory iterators and recursive directory iterators with broken symlinks." | Juergen Ributzka | 2017-03-13 | 1 | -1/+1
  Fix the test by sorting the result vector.
  llvm-svn: 297672
* GlobalISel: Translate ConstantDataVector | Volkan Keles | 2017-03-13 | 2 | -0/+47
  Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab
  Reviewed By: qcolombet, dsanders, ab
  Subscribers: dberris, rovka, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D30216
  llvm-svn: 297670
* [Support] Test directory iterators and recursive directory iterators with broken symlinks. | Juergen Ributzka | 2017-03-13 | 1 | -0/+78
  This commit adds a unit test to the file system tests to verify the behavior of the directory iterator and recursive directory iterator with broken symlinks.
  This test is Unix only.
  llvm-svn: 297669
* Revert "GlobalISel: move vector extract/insert inside generic opcode region." | Tim Northover | 2017-03-13 | 2 | -14/+3
  I was writing against an earlier branch and Volkan had already fixed this.
  llvm-svn: 297668
* [X86][MMX] Fix folding of shift value loads to cover whole 64-bits | Simon Pilgrim | 2017-03-13 | 4 | -54/+53
  rL230225 made the assumption that only the lower 32 bits of an MMX register load are used as a shift value, when in fact the whole 64 bits are reloaded and treated as an i64 to determine the shift value.
  This patch reverts rL230225 to ensure that the whole 64 bits of memory are folded and ensures that the upper 32 bits are zeroed for cases where the shift value has come from a scalar source.
  Found during fuzz testing.
  Differential Revision: https://reviews.llvm.org/D30833
  llvm-svn: 297667
* GlobalISel: move vector extract/insert inside generic opcode region. | Tim Northover | 2017-03-13 | 2 | -3/+14
  Otherwise they won't be legalized or selected, causing instruction selection to fail horribly.
  llvm-svn: 297666
* Revert r295004 (Add MXCSR) due to errors reported by MachineVerifier | Andrew Kaylor | 2017-03-13 | 4 | -38/+25
  I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced.
  llvm-svn: 297664
* [GlobalISel] Update PRE_ISEL_GENERIC_OPCODE_END marker | Volkan Keles | 2017-03-13 | 1 | -1/+1
  llvm-svn: 297663
* AMDGPU: Re-use TM.getNullPointerValue | Matt Arsenault | 2017-03-13 | 1 | -10/+8
  llvm-svn: 297662