summaryrefslogtreecommitdiffstats
path: root/llvm/lib/IR
Commit message (Collapse)AuthorAgeFilesLines
* [NFC] Silence Wparentheses warning in DomTreeUpdater, introduced by 336968Erich Keane2018-07-131-2/+2
| | | | llvm-svn: 337001
* [DomTreeUpdater] Ignore updates when both DT and PDT are nullptrsChijun Sima2018-07-131-14/+29
| | | | | | | | | | | | | | | | | Summary: Previously, when both DT and PDT are nullptrs and the UpdateStrategy is Lazy, DomTreeUpdater still pends updates inside. After this patch, DomTreeUpdater will ignore all updates from(`applyUpdates()/insertEdge*()/deleteEdge*()`) in this case. (call `delBB()` still pends BasicBlock deletion until a flush event according to the doc). The behavior of DomTreeUpdater previously documented won't change after the patch. Reviewers: dmgreen, davide, kuhar, brzycki, grosser Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48974 llvm-svn: 336968
* [ThinLTO] Escape module paths when printingAndrew Ng2018-07-121-2/+3
| | | | | | | | | | | | | | | | | | | | We have located a bug in AssemblyWriter::printModuleSummaryIndex(). This function outputs path strings incorrectly. Backslashes in the strings are not correctly escaped. Consequently, if a path name contains a backslash followed by two hexadecimal characters, the sequence is incorrectly interpreted when the output is read by another component. This mangles the path and results in error. This patch fixes this issue by calling printEscapedString() to output the module paths. Patch by Chris Jackson. Differential Revision: https://reviews.llvm.org/D49090 llvm-svn: 336908
* [X86] Remove and autoupgrade the scalar fma intrinsics with masking.Craig Topper2018-07-121-40/+99
| | | | | | This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches. llvm-svn: 336871
* IR: Skip -print-*-all after -print-*Duncan P. N. Exon Smith2018-07-111-3/+3
| | | | | | | | | | This changes `-print-*` from transformation passes to analysis passes so that `-print-after-all` and `-print-before-all` don't trigger. This avoids some redundant output. Patch by Son Tuan Vu! llvm-svn: 336869
* llvm: Add support for "-fno-delete-null-pointer-checks"Manoj Gupta2018-07-092-4/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Support for this option is needed for building Linux kernel. This is a very frequently requested feature by kernel developers. More details : https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: This Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this implying that null pointer dereferencing is not undefined. This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" in IR (Under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: efriedma, george.burgess.iv Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47895 llvm-svn: 336613
* [Power9] Add __float128 builtins for Round To OddStefan Pintilie2018-07-091-1/+7
| | | | | | | | | | | | GCC has builtins for these round to odd instructions: __float128 __builtin_sqrtf128_round_to_odd (__float128) __float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128) __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128) Differential Revision: https://reviews.llvm.org/D47550 llvm-svn: 336578
* Fix DIExpression::ExprOperand::appendToVectorVedant Kumar2018-07-061-6/+2
| | | | | | | | | | | | appendToVector used the wrong overload of SmallVector::append, resulting in it appending the same element to a vector `getSize()` times. This did not cause a problem when initially committed because appendToVector was only used to append 1-element operands. This changes appendToVector to use the correct overload of append(). Testing: ./unittests/IR/IRTests --gtest_filter='*DIExpressionTest*' llvm-svn: 336466
* Remove a redundant null-check in DIExpression::prepend, NFCVedant Kumar2018-07-061-13/+14
| | | | | | | Code outside of an `if (Expr)` block dereferenced `Expr`, so the null check was redundant. llvm-svn: 336465
* Use Type::isIntOrPtrTy where possible, NFCVedant Kumar2018-07-062-9/+5
| | | | | | | | | | | It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() || T.isPointerTy()`. I used Python's re.sub with this regex to update users: r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)' llvm-svn: 336462
* [IR] Fix inconsistent declaration parameter nameFangrui Song2018-07-062-2/+2
| | | | llvm-svn: 336459
* [Local] replaceAllDbgUsesWith: Update debug values before RAUWVedant Kumar2018-07-061-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The replaceAllDbgUsesWith utility helps passes preserve debug info when replacing one value with another. This improves upon the existing insertReplacementDbgValues API by: - Updating debug intrinsics in-place, while preventing use-before-def of the replacement value. - Falling back to salvageDebugInfo when a replacement can't be made. - Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into common utility code. Along with the API change, this teaches replaceAllDbgUsesWith how to create DIExpressions for three basic integer and pointer conversions: - The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations. - Truncation. Applies when the new value is wider than the old one. - Zero/sign extension. Applies when the new value is narrower than the old one. Testing: - check-llvm, check-clang, a stage2 `-g -O3` build of clang, regression/unit testing. - This resolves a number of mis-sized dbg.value diagnostics from Debugify. Differential Revision: https://reviews.llvm.org/D48676 llvm-svn: 336451
* [Constants] extend getBinOpIdentity(); NFCSanjay Patel2018-07-061-24/+41
| | | | | | | | The enhanced version will be used in D48893 and related patches and an almost identical (fadd is different) version is proposed in D28907, so adding this as a preliminary step. llvm-svn: 336444
* [Constant] add undef element query for vector constants; NFCSanjay Patel2018-07-061-0/+10
| | | | | | | This is likely to be used in D48987 and similar patches, so adding it as an NFC preliminary step. llvm-svn: 336442
* [X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead.Craig Topper2018-07-061-0/+16
| | | | | | | | The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG. llvm-svn: 336416
* [X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or ↵Craig Topper2018-07-061-2/+128
| | | | | | | | | | unmasked 512-bit intrinsics with rounding mode. This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking. This matches how clang uses the intrinsics these days. llvm-svn: 336409
* [X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to ↵Craig Topper2018-07-051-19/+25
| | | | | | | | 'llvm.fma'. Add upgrade tests for all. Still need to remove the AVX512 masked versions. llvm-svn: 336383
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart ↵Craig Topper2018-07-051-52/+33
| | | | | | independent FMA and extractelement/insertelement. llvm-svn: 336315
* [X86] Remove some of the packed FMA3 intrinsics since we no longer use them ↵Craig Topper2018-07-051-40/+32
| | | | | | | | | | in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303
* [Constants] add identity constants for fadd/fmulSanjay Patel2018-07-031-1/+6
| | | | | | | | | | | | | | | | As the test diffs show, the current users of getBinOpIdentity() are InstCombine and Reassociate. SLP vectorizer is a candidate for using this functionality too (D28907). The InstCombine shuffle improvements are part of the planned enhancements noted in D48830. InstCombine actually has several other uses of getBinOpIdentity() via SimplifyUsingDistributiveLaws(), but we don't call that for any FP ops. Fixing that might be another part of removing the custom reassociation in InstCombine that is only done for fadd+fmul. llvm-svn: 336215
* Rename lazy initialization functions to reflect behavior (NFC)Teresa Johnson2018-07-031-12/+12
| | | | | | Suggested in review for D48698. llvm-svn: 336207
* [InstCombine] fold shuffle-with-binop and common valueSanjay Patel2018-07-031-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5 ...though there are several follow-ups noted in the code comments in this patch to complete this transform. It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485. In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand). I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor). Differential Revision: https://reviews.llvm.org/D48830 llvm-svn: 336196
* [IR] Strip trailing whitespace. NFCBjorn Pettersson2018-07-037-46/+46
| | | | llvm-svn: 336194
* [DebugInfo] Corrections for salvageDebugInfoBjorn Pettersson2018-07-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When salvaging a dbg.declare/dbg.addr we should not add DW_OP_stack_value to the DIExpression (see test/Transforms/InstCombine/salvage-dbg-declare.ll). Consider this example %vla = alloca i32, i64 2 call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression()) Instcombine will turn it into %vla1 = alloca [2 x i32] %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla, i64 0, i64 0 call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression()) If the GEP can be eliminated, then the dbg.declare will be salvaged and we should get %vla1 = alloca [2 x i32] call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression()) The problem was that salvageDebugInfo did not recognize dbg.declare as being indirect (%vla1 points to the value, it does not hold the value), so we incorrectly got call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value)) I also made sure that llvm::salvageDebugInfo and DIExpression::prependOpcodes do not add DW_OP_stack_value to the DIExpression in case no new operands are added to the DIExpression. That way we avoid to, unneccessarily, turn a register location expression into an implicit location expression in some situations (see test11 in test/Transforms/LICM/sinking.ll). Reviewers: aprantl, vsk Reviewed By: aprantl, vsk Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D48837 llvm-svn: 336191
* [DebugInfo] Fix PR37395.Shiva Chen2018-07-031-1/+4
| | | | | | | | | | DbgLabelInst has no address as its operands. Differential Revision: https://reviews.llvm.org/D46738 Patch by Hsiangkai Wang. llvm-svn: 336176
* Reappl "[Dominators] Add the DomTreeUpdater class"Jakub Kuderski2018-07-032-0/+513
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]]. This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process. —Prior to the patch— - Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily. - DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated. - Functions receiving DT/DDT need to branch a lot which is currently necessary. - Functions using both DomTree and PostDomTree need to call the update function separately on both trees. - People need to construct an additional DeferredDominance class to use functions only receiving DDT. —After the patch— Patch by Chijun Sima <simachijun@gmail.com>. Reviewers: kuhar, brzycki, dmgreen, grosser, davide Reviewed By: kuhar, brzycki Author: NutshellySima Subscribers: vsk, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48383 llvm-svn: 336163
* [ThinLTO] Fix printing of aliases for distributed backend indexesTeresa Johnson2018-07-031-2/+8
| | | | | | | | | | | | | | | | Summary: When we import an alias (which will import a copy of the aliasee), but aren't going to import the aliasee directly, the distributed backend index will not contain the aliasee summary. Handle this in the summary assembly printer by printing "null" as the aliasee. Reviewers: davidxl, dexonsmith Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits Differential Revision: https://reviews.llvm.org/D48699 llvm-svn: 336160
* [ThinLTO] Fix printing of module paths for distributed backend indexesTeresa Johnson2018-07-021-17/+41
| | | | | | | | | | | | | | | | Summary: In the individual index files emitted for distributed ThinLTO backends, the module path ids are not contiguous. Assign slots to module paths in order to handle this better and also to get contiguous numbering in the summary assembly. Reviewers: davidxl, dexonsmith Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, steven_wu Differential Revision: https://reviews.llvm.org/D48698 llvm-svn: 336148
* Revert "[Dominators] Add the DomTreeUpdater class"Jakub Kuderski2018-07-022-512/+0
| | | | | | | | Temporary revert because of a failing test on some buildbots. This reverts commit r336114. llvm-svn: 336117
* [Dominators] Add the DomTreeUpdater classJakub Kuderski2018-07-022-0/+512
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]]. This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process. —Prior to the patch— - Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily. - DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated. - Functions receiving DT/DDT need to branch a lot which is currently necessary. - Functions using both DomTree and PostDomTree need to call the update function separately on both trees. - People need to construct an additional DeferredDominance class to use functions only receiving DDT. —After the patch— Patch by Chijun Sima <simachijun@gmail.com>. Reviewers: kuhar, brzycki, dmgreen, grosser, davide Reviewed By: kuhar, brzycki Subscribers: vsk, mgorny, llvm-commits Author: NutshellySima Differential Revision: https://reviews.llvm.org/D48383 llvm-svn: 336114
* Implement strip.invariant.groupPiotr Padlewski2018-07-021-1/+2
| | | | | | | | | | | | | | | | Summary: This patch introduce new intrinsic - strip.invariant.group that was described in the RFC: Devirtualization v2 Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073
* [X86] Remove masking from avx512 rotate intrinsics. Use select in IR instead.Craig Topper2018-06-301-0/+64
| | | | llvm-svn: 336035
* [LLVMContext] Detecting leaked instructions with metadataVedant Kumar2018-06-291-0/+8
| | | | | | | | | | When instructions with metadata are accidentally leaked, the result is a difficult-to-find memory corruption in ~LLVMContextImpl that leads to random crashes. Patch by Arvīds Kokins! llvm-svn: 336010
* [X86] Remove masking from the avx512 packed sqrt intrinsics. Use select in ↵Craig Topper2018-06-291-8/+15
| | | | | | | | IR instead. While there improve the coverage of the intrinsic testing and add fast-isel tests. llvm-svn: 335944
* Revert "Add support for generating a call graph profile from Branch ↵Benjamin Kramer2018-06-281-20/+0
| | | | | | | | Frequency Info." This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost. llvm-svn: 335851
* Add support for generating a call graph profile from Branch Frequency Info.Michael J. Spencer2018-06-271-0/+20
| | | | | | | | | | | | | | | | | | | | | === Generating the CG Profile === The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight. After scanning all the functions, it generates an appending module flag containing the data. The format looks like: ``` !llvm.module.flags = !{!0} !0 = !{i32 5, !"CG Profile", !1} !1 = !{!2, !3, !4} ; List of edges !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32 !3 = !{void (i1)* @freq, void ()* @a, i64 11} !4 = !{void (i1)* @freq, void ()* @b, i64 20} ``` Differential Revision: https://reviews.llvm.org/D48105 llvm-svn: 335794
* [X86] Rename the autoupgraded of packed fp compare and fpclass intrinsics ↵Craig Topper2018-06-271-111/+65
| | | | | | | | that don't take a mask as input to exclude '.mask.' from their name. I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask". llvm-svn: 335744
* Rename skipDebugInfo -> skipDebugIntrinsics, NFCVedant Kumar2018-06-261-1/+1
| | | | | | | | | | | | | This addresses post-commit feedback about the name 'skipDebugInfo' being misleading. This name could be interpreted as meaning 'a function that skips instructions with debug locations'. The new name, 'skipDebugIntrinsics', makes it clear that this function only skips debug info intrinsics. Thanks to Adrian Prantl for pointing this out! llvm-svn: 335667
* ConstantFold: Don't fold global address vs. null for addrspace != 0Matt Arsenault2018-06-261-3/+6
| | | | | | | | | | | Not sure why this logic seems to be repeated in 2 different places, one called by the other. On AMDGPU addrspace(3) globals start allocating at 0, so these checks will be incorrect (not that real code actually tries to compare these addresses) llvm-svn: 335649
* [ConstantRange] Add support of mul in makeGuaranteedNoWrapRegion.Tim Shen2018-06-261-0/+58
| | | | | | | | | | | | Summary: This is trying to add support for r334428. Reviewers: sanjoy Subscribers: jlebar, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D48399 llvm-svn: 335646
* Improve ConvertDebugDeclareToDebugValueBjorn Pettersson2018-06-261-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a follow-up to r334830 and r335031. In the valueCoversEntireFragment check we now also handle the situation when there is a variable length array (VLA) involved, and the length of the array has been reduced to a constant. The ConvertDebugDeclareToDebugValue functions that are related to PHI nodes and load instructions now avoid inserting dbg.value intrinsics when the value does not, for certain, cover the variable/fragment that should be described. In r334830 we assumed that the value always covered the entire var/fragment and we had assertions in the code to show that assumption. However, those asserts failed when compiling code with VLAs, so we removed the asserts in r335031. Now when we know that the valueCoversEntireFragment check can fail also for PHI/Load instructions we avoid to insert the faulty dbg.value intrinsic in such situations. Compared to the Store instruction scenario we simply drop the dbg.value here (as the variable does not change its value due to PHI/Load, so an earlier dbg.value describing the variable should still be valid). Reviewers: aprantl, vsk, efriedma Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48547 llvm-svn: 335580
* [X86] Redefine avx512 packed fpclass intrinsics to return a vXi1 mask and ↵Craig Topper2018-06-261-0/+43
| | | | | | | | | | | | implement the mask input argument using an 'and' IR instruction. This recommits r335562 and 335563 as a single commit. The frontend will surround the intrinsic with the appropriate marshalling to/from a scalar type to match the sigature of the builtin that software expects. By exposing the vXi1 type directly in the llvm intrinsic we make it available to optimizers much earlier. This can enable the scalar marshalling code to be optimized away. llvm-svn: 335568
* Revert r335562 and 335563 "[X86] Redefine avx512 packed fpclass intrinsics ↵Craig Topper2018-06-261-43/+0
| | | | | | | | to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction." These were supposed to have been squashed to a single commit. llvm-svn: 335566
* fooCraig Topper2018-06-261-0/+43
| | | | llvm-svn: 335562
* SafepointIRVerifier should ignore dead blocks and dead edgesArtur Pilipenko2018-06-251-28/+189
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not only should SafepointIRVerifier ignore unreachable blocks (as suggested in https://reviews.llvm.org/D47011) but it also has to ignore dead blocks. In @test2 (see the new tests): br i1 true, label %right, label %left left: ... right: ... merge: %val = phi i8 addrspace(1)* [ ..., %left ], [ ..., %right ] use %val both left and right branches are reachable. If they collide then SafepointIRVerifier reports an error. Because of the foldable branch condition GVN finds the left branch dead and removes the phi node entry that merges values from right and left. Then the use comes from the right branch. This results in no collision. So, SafepointIRVerifier ends up in different results depending on either GVN is run or not. To solve this issue this patch adds Dead Block detection to SafepointIRVerifier which can ignore dead blocks while validating IR. The Dead Block detection algorithm is taken from GVN but modified to not split critical edges. That is needed to keep CFG unchanged by SafepointIRVerifier. Patch by Yevgeny Rouban. Reviewed By: anna, apilipenko, DaniilSuchkov Differential Revision: https://reviews.llvm.org/D47441 llvm-svn: 335473
* [IR] Split Intrinsics.inc into enums and implementationsReid Kleckner2018-06-231-7/+7
| | | | | | | | | | | | | | | | | | | Implements PR34259 Intrinsics.h is a very popular header. Most LLVM TUs care about things like dbg_value, but they don't care how they are implemented. After I split these out, IntrinsicImpl.inc is 1.7 MB, so this saves each LLVM TU from scanning 1.7 MB of source that gets pre-processed away. It also means we can modify intrinsic properties without triggering a full rebuild, but that's probably less of a win. I think the next best thing to do would be to split out the target intrinsics into their own header. Very, very few TUs care about target-specific intrinsics. It's very hard to split up the target independent intrinsics like llvm.expect, assume, and dbg.value, though. llvm-svn: 335407
* [IR] Use Instruction::isBinaryOp helper instead of raw enum range tests. NFCI.Simon Pilgrim2018-06-222-6/+3
| | | | llvm-svn: 335335
* Revert r335306 (and r335314) - the Call Graph Profile pass.Chandler Carruth2018-06-221-20/+0
| | | | | | | | | | | This is the first pass in the main pipeline to use the legacy PM's ability to run function analyses "on demand". Unfortunately, it turns out there are bugs in that somewhat-hacky approach. At the very least, it leaks memory and doesn't support -debug-pass=Structure. Unclear if there are larger issues or not, but this should get the sanitizer bots back to green by fixing the memory leaks. llvm-svn: 335320
* [Instrumentation] Add Call Graph Profile passMichael J. Spencer2018-06-211-0/+20
| | | | | | | | | | | | | | | | | | | | This patch adds support for generating a call graph profile from Branch Frequency Info. The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight. After scanning all the functions, it generates an appending module flag containing the data. The format looks like: !llvm.module.flags = !{!0} !0 = !{i32 5, !"CG Profile", !1} !1 = !{!2, !3, !4} ; List of edges !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32 !3 = !{void (i1)* @freq, void ()* @a, i64 11} !4 = !{void (i1)* @freq, void ()* @b, i64 20} Differential Revision: https://reviews.llvm.org/D48105 llvm-svn: 335306
* [X86] Remove masking from 512-bit floating max/min intrinsics. Use select ↵Craig Topper2018-06-211-12/+32
| | | | | | instruction instead. llvm-svn: 335199
OpenPOWER on IntegriCloud