summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* DebugInfo: preparation to implement DW_AT_alignmentVictor Leschuk2016-10-201-3/+5
| | | | | | | | | | | | - Add alignment attribute to DIVariable family - Modify bitcode format to match new DIVariable representation - Update tests to match these changes (also add bitcode upgrade test) - Expect that frontend passes non-zero align value only when it is not default (was forcibly aligned by alignas()/_Alignas()/__atribute__(aligned()) Differential Revision: https://reviews.llvm.org/D25073 llvm-svn: 284678
* [GlobalMerge] Handle non-landingpad EH padsReid Kleckner2016-10-191-14/+10
| | | | | | | | | This code crashed on funclet-style EH instructions such as catchpad, catchswitch, and cleanuppad. Just treat all EH pad instructions equivalently and avoid merging the globals they reference through any use. llvm-svn: 284633
* Merged nested ifs. NFCI.Simon Pilgrim2016-10-191-7/+6
| | | | llvm-svn: 284616
* [DAGCombiner] Add general constant vector support to (shl (add x, c1), c2) ↵Simon Pilgrim2016-10-191-4/+5
| | | | | | | | -> (add (shl x, c2), c1 << c2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284613
* [WinEH] Allow catchpads to reuse the same catch objectReid Kleckner2016-10-191-4/+7
| | | | | | This code used a regular when it should have used a multimap. llvm-svn: 284612
* [DAG] optimize negation of boolSanjay Patel2016-10-191-2/+19
| | | | | | | | | | | | | | | | Use mask and negate for legalization of i1 source type with SIGN_EXTEND_INREG. With the mask, this should be no worse than 2 shifts. The mask can be eliminated in some cases, so that should be better than 2 shifts. This change exposed some missing folds related to negation: https://reviews.llvm.org/rL284239 https://reviews.llvm.org/rL284395 There may be others, so please let me know if you see any regressions. Differential Revision: https://reviews.llvm.org/D25485 llvm-svn: 284611
* [DAGCombiner] Add general constant vector support to (shl (sra x, c1), c1) ↵Simon Pilgrim2016-10-191-7/+6
| | | | | | | | -> (and x, (shl -1, c1)) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284608
* [DAGCombiner] Add general constant vector support to (shl (mul x, c1), c2) ↵Simon Pilgrim2016-10-191-5/+6
| | | | | | | | -> (mul x, c1 << c2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284607
* Revert r284604. A.K.A. "TMP"Tim Northover2016-10-191-33/+0
| | | | | | Committed by mistake. llvm-svn: 284606
* TMPTim Northover2016-10-191-0/+33
| | | | llvm-svn: 284604
* GlobalISel: support translating volatile loads and stores.Tim Northover2016-10-191-14/+17
| | | | llvm-svn: 284603
* [DAGCombiner] Just call isConstOrConstSplat directly. NFCI.Simon Pilgrim2016-10-191-8/+4
| | | | | | This will get the same ConstantSDNode scalar or vector splat value as the current separate dyn_cast<ConstantSDNode> / isVector() approach. llvm-svn: 284578
* [DAGCombine] Generalize distributeTruncateThroughAnd to work with any ↵Simon Pilgrim2016-10-191-13/+9
| | | | | | non-opaque constant or constant vector llvm-svn: 284574
* Revert r284545 again as the regression in ppc still exists. There is bug in ↵Dehao Chen2016-10-191-18/+0
| | | | | | | | MBPI exposed by th patch. Also update the section.ll to fix non-x86 failure. llvm-svn: 284563
* Using branch probability to guide critical edge splitting.Dehao Chen2016-10-181-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass. The performance impact on speccpu2006 on Intel sandybridge machines: spec/2006/fp/C++/444.namd 25.3 +0.26% spec/2006/fp/C++/447.dealII 45.96 -0.10% spec/2006/fp/C++/450.soplex 41.97 +1.49% spec/2006/fp/C++/453.povray 36.83 -0.96% spec/2006/fp/C/433.milc 23.81 +0.32% spec/2006/fp/C/470.lbm 41.17 +0.34% spec/2006/fp/C/482.sphinx3 48.13 +0.69% spec/2006/int/C++/471.omnetpp 22.45 +3.25% spec/2006/int/C++/473.astar 21.35 -2.06% spec/2006/int/C++/483.xalancbmk 36.02 -2.39% spec/2006/int/C/400.perlbench 33.7 -0.17% spec/2006/int/C/401.bzip2 22.9 +0.52% spec/2006/int/C/403.gcc 32.42 -0.54% spec/2006/int/C/429.mcf 39.59 +0.19% spec/2006/int/C/445.gobmk 26.98 -0.00% spec/2006/int/C/456.hmmer 24.52 -0.18% spec/2006/int/C/458.sjeng 28.26 +0.02% spec/2006/int/C/462.libquantum 55.44 +3.74% spec/2006/int/C/464.h264ref 46.67 -0.39% geometric mean +0.20% Manually checked 473 and 471 to verify the diff is in the noise range. Reviewers: rengolin, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24818 llvm-svn: 284545
* revert r284541.Dehao Chen2016-10-181-17/+0
| | | | llvm-svn: 284544
* Using branch probability to guide critical edge splitting.Dehao Chen2016-10-181-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass. The performance impact on speccpu2006 on Intel sandybridge machines: spec/2006/fp/C++/444.namd 25.3 +0.26% spec/2006/fp/C++/447.dealII 45.96 -0.10% spec/2006/fp/C++/450.soplex 41.97 +1.49% spec/2006/fp/C++/453.povray 36.83 -0.96% spec/2006/fp/C/433.milc 23.81 +0.32% spec/2006/fp/C/470.lbm 41.17 +0.34% spec/2006/fp/C/482.sphinx3 48.13 +0.69% spec/2006/int/C++/471.omnetpp 22.45 +3.25% spec/2006/int/C++/473.astar 21.35 -2.06% spec/2006/int/C++/483.xalancbmk 36.02 -2.39% spec/2006/int/C/400.perlbench 33.7 -0.17% spec/2006/int/C/401.bzip2 22.9 +0.52% spec/2006/int/C/403.gcc 32.42 -0.54% spec/2006/int/C/429.mcf 39.59 +0.19% spec/2006/int/C/445.gobmk 26.98 -0.00% spec/2006/int/C/456.hmmer 24.52 -0.18% spec/2006/int/C/458.sjeng 28.26 +0.02% spec/2006/int/C/462.libquantum 55.44 +3.74% spec/2006/int/C/464.h264ref 46.67 -0.39% geometric mean +0.20% Manually checked 473 and 471 to verify the diff is in the noise range. Reviewers: rengolin, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24818 llvm-svn: 284541
* Use profile info to set function section prefix to group hot/cold functions.Dehao Chen2016-10-182-4/+26
| | | | | | | | | | | | | | | | Summary: The original implementation is in r261607, which was reverted in r269726 to accomendate the ProfileSummaryInfo analysis pass. The new implementation: 1. add a new metadata for function section prefix 2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function 3. output the section prefix set by CGP Reviewers: davidxl, eraman Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24989 llvm-svn: 284533
* GlobalISel: translate the @llvm.objectsize intrinsic.Tim Northover2016-10-181-0/+7
| | | | llvm-svn: 284527
* GlobalISel: translate memcpy intrinsics.Tim Northover2016-10-181-0/+22
| | | | llvm-svn: 284525
* [InterleavedAccessPass] Remove global variable.Benjamin Kramer2016-10-181-6/+9
| | | | | | | This is a threading hazard and rightfully complained about by tsan. No functionality change. llvm-svn: 284515
* revert r284495: [Target] remove TargetRecip classSanjay Patel2016-10-182-217/+26
| | | | | | There's something wrong with the StringRef usage while parsing the attribute string. llvm-svn: 284513
* [Target] remove TargetRecip class; move reciprocal estimate isel ↵Sanjay Patel2016-10-182-26/+217
| | | | | | | | | | | | | | | | | | | | | | | | | | functionality to TargetLowering This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 llvm-svn: 284495
* [DAGCombiner] Add splatted vector support to (udiv x, (shl pow2, y)) -> x ↵Simon Pilgrim2016-10-181-2/+3
| | | | | | >>u (log2(pow2)+y) llvm-svn: 284491
* DebugInfo: change alignment type from uint64_t to uint32_t to save space.Victor Leschuk2016-10-181-2/+3
| | | | | | | | | In futher patches we shall have alignment field added to DIVariable family and switching from uint64_t to uint32_t will save 4 bytes per variable. Differential Revision: https://reviews.llvm.org/D25620 llvm-svn: 284482
* Strip trailing whitespace (NFCI)Simon Pilgrim2016-10-181-1/+1
| | | | llvm-svn: 284478
* [XRay] Support for for tail calls for ARM no-ThumbDean Michael Berris2016-10-181-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds simplified support for tail calls on ARM with XRay instrumentation. Known issue: compiled with generic flags: `-O3 -g -fxray-instrument -Wall -std=c++14 -ffunction-sections -fdata-sections` (this list doesn't include my specific flags like --target=armv7-linux-gnueabihf etc.), the following program #include <cstdio> #include <cassert> #include <xray/xray_interface.h> [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fC() { std::printf("In fC()\n"); } [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fB() { std::printf("In fB()\n"); fC(); } [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fA() { std::printf("In fA()\n"); fB(); } // Avoid infinite recursion in case the logging function is instrumented (so calls logging // function again). [[clang::xray_never_instrument]] void simplyPrint(int32_t functionId, XRayEntryType xret) { printf("XRay: functionId=%d type=%d.\n", int(functionId), int(xret)); } int main(int argc, char* argv[]) { __xray_set_handler(simplyPrint); printf("Patching...\n"); __xray_patch(); fA(); printf("Unpatching...\n"); __xray_unpatch(); fA(); return 0; } gives the following output: Patching... XRay: functionId=3 type=0. In fA() XRay: functionId=3 type=1. XRay: functionId=2 type=0. In fB() XRay: functionId=2 type=1. XRay: functionId=1 type=0. XRay: functionId=1 type=1. In fC() Unpatching... In fA() In fB() In fC() So for function fC() the exit sled seems to be called too much before function exit: before printing In fC(). Debugging shows that the above happens because printf from fC is also called as a tail call. So first the exit sled of fC is executed, and only then printf is jumped into. So it seems we can't do anything about this with the current approach (i.e. within the simplification described in https://reviews.llvm.org/D23988 ). Differential Revision: https://reviews.llvm.org/D25030 llvm-svn: 284456
* Fix differences in codegen between Linux and Windows toolchainsMandeep Singh Grang2016-10-182-4/+7
| | | | | | | | | | | | | | | | | Summary: There are differences in codegen between Linux and Windows due to: 1. Using std::sort which uses quicksort which is a non-stable sort. 2. Iterating over Set data structure where the iteration order is non deterministic. Reviewers: arsenm, grosbach, junbuml, zinob, MatzeB Subscribers: MatzeB, wdng Differential Revision: https://reviews.llvm.org/D25695 llvm-svn: 284441
* [DAG] use isConstOrConstSplat in ComputeNumSignBits to optimize SRASanjay Patel2016-10-171-1/+1
| | | | | | | | | | | | | | The scalar version of this pattern was noted in: https://reviews.llvm.org/D25485 and fixed with: https://reviews.llvm.org/rL284395 More refactoring of the constant/splat helpers is needed and will happen in follow-up patches. Differential Revision: https://reviews.llvm.org/D25685 llvm-svn: 284424
* [DAG] make isConstOrConstSplat and isConstOrConstSplatFP more accessible; NFCSanjay Patel2016-10-172-38/+34
| | | | | | | | | | | | As noted in: https://reviews.llvm.org/D25685 This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch. In a minor attempt to keep some structure, we're pulling the FP helper over along with its integer sibling, but clearly we can and should do more refactoring of the similar helper functions in DAGCombiner and SelectionDAG to simplify and not duplicate functionality. llvm-svn: 284421
* Test commit.Michael LeMay2016-10-171-1/+1
| | | | llvm-svn: 284411
* [DAG] optimize away an arithmetic-right-shift of a 0 or -1 valueSanjay Patel2016-10-171-0/+4
| | | | | | | | | This came up as part of: https://reviews.llvm.org/D25485 Note that the vector case is missed because ComputeNumSignBits() is deficient for vectors. llvm-svn: 284395
* [SDAG] Use ABI type alignment for constant pools when optimizing for sizeJames Molloy2016-10-171-1/+3
| | | | | | | | SelectionDAG::getConstantPool will automatically determine an appropriate alignment if one is not specified. It does this by querying the type's preferred alignment. This can end up creating quite a lot of padding when the preferred alignment for vectors is 128. In optimize-for-size mode, it makes sense to instead query the ABI type alignment which is often smaller and causes less padding. llvm-svn: 284381
* [CodeGenPrepare] When moving a zext near to its associated load, do not ↵Andrea Di Biagio2016-10-171-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | retain the original debug location. CodeGenPrepare knows how to move a zext of a load into the same basic block where the load lives. The goal is to help ISel match a zero-extending load instead of two separated instructions. CGP attempts to move a zext computation even if it lives in a basic block that does not post-dominate the load's basic block. That means, the hoisted zext may be speculated. Preserving the zext location would hurt the debugging experience and the quality of sample pgo. With this patch, when moving a zext near to its associated load, CGP no longer propagates the zext's debug location. Instead, CGP conservatively reuses the same debug location for the load and the zext. An alternative approach would be to assign an artificial line-0 location to the zext. However we don't want to over-use the 'line-0' for this particular case because it would have a size cost in the line-table section for no additional benefit. Differential Revision: https://reviews.llvm.org/D25611 llvm-svn: 284377
* [MachineMemOperand] Move synchronization scope and atomic orderings from ↵Konstantin Zhuravlyov2016-10-156-74/+57
| | | | | | | | SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG. Differential Revision: https://reviews.llvm.org/D24577 llvm-svn: 284312
* GlobalISel: rename legalizer components to match others.Tim Northover2016-10-147-80/+78
| | | | | | | | | | The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287
* [DAG] avoid creating illegal node when transforming negated shifted sign bitSanjay Patel2016-10-141-2/+3
| | | | | | | | Eli noted this potential bug in the post-commit thread for: https://reviews.llvm.org/rL284239 ...but I'm not sure how to trigger it, so there's no test case yet. llvm-svn: 284268
* TargetLowering: Add SimplifyDemandedBits() helper to TargetLoweringOptTom Stellard2016-10-141-2/+55
| | | | | | | | | | | | | | | | | | | | | | | | Summary: The main purpose of this new helper is to enable simplifying operations that have multiple uses. SimplifyDemandedBits does not handle multiple uses currently, and this new function makes it possible to optimize: and v1, v0, 0xffffff mul24 v2, v1, v1 ; Multiply ignoring high 8-bits. To: mul24 v2, v0, v0 Where before this would not be optimized, because v1 has multiple uses. Reviewers: bogner, arsenm Subscribers: nhaehnle, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24964 llvm-svn: 284266
* Add a pass to optimize patterns of vectorized interleaved memory accesses forDavid L Kreitzer2016-10-141-0/+5
| | | | | | | | | | | | | X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen Differential revision: http://reviews.llvm.org/D24681 llvm-svn: 284260
* [safestack] Use non-thread-local unsafe stack pointer for Contiki OSDavid L Kreitzer2016-10-142-50/+34
| | | | | | | | Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D19852 llvm-svn: 284254
* Revert "In preparation for removing getNameWithPrefix off ofEric Christopher2016-10-141-8/+1
| | | | | | | | | TargetMachine," as it's causing sanitizer/memory issues until I can track down this set. This reverts commit r284203 llvm-svn: 284252
* [DAG] add folds for negated shifted sign bitSanjay Patel2016-10-141-0/+13
| | | | | | | | | The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 llvm-svn: 284239
* Fix use-after-freesNicolai Haehnle2016-10-141-2/+2
| | | | | | Extracted from D25313, as suggested by Justin Bogner. llvm-svn: 284220
* [DAGCombiner] Teach createBuildVecShuffle to handle cases where input ↵Craig Topper2016-10-141-5/+9
| | | | | | | | vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code. llvm-svn: 284204
* In preparation for removing getNameWithPrefix off of TargetMachine,Eric Christopher2016-10-141-1/+8
| | | | | | | sink the current behavior into the callers and sink TargetMachine::getNameWithPrefix into TargetMachine::getSymbol. llvm-svn: 284203
* Tidy the calls to getCurrentSection().first -> getCurrentSectionOnly to helpEric Christopher2016-10-141-1/+1
| | | | | | readability a bit. llvm-svn: 284202
* [DAG] hoist DL(N) and fix formatting; NFCSanjay Patel2016-10-131-24/+31
| | | | llvm-svn: 284170
* LegalizeDAG: Implement PROMOTE for ISD::BITREVERSETom Stellard2016-10-131-1/+2
| | | | | | | | | | | | | | | Summary: This operation is promoted the same way was ISD::BSWAP. This will prevent a regression in test/Target/AMDGOU/bitreverse.ll when i16 support is implemented. Reviewers: bogner, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25202 llvm-svn: 284163
* [safestack] Reapply r283248 after moving X86-targeted SafeStack tests intoDavid L Kreitzer2016-10-131-7/+6
| | | | | | | | | | | | the X86 subdirectory. Original commit message: Requires a valid TargetMachine to be passed to the SafeStack pass. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24896 llvm-svn: 284161
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-10-132-121/+272
| | | | | | | | | UseAA is enabled." This reverts commit r284151 which appears to be triggering a LTO failures on Hexagon llvm-svn: 284157
OpenPOWER on IntegriCloud