summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [HotColdSplit] Relax requirement that the cold sink block be extractableVedant Kumar2019-01-171-5/+2
| | | | | | | | | Relaxing this requirement creates opportunities to split code dominated by an EH pad. Tested on LNT+externals. llvm-svn: 351483
* [HotColdSplit] Simplify tests by lowering their splitting thresholdsVedant Kumar2019-01-171-4/+9
| | | | | | | | This gets rid of the brittle/mysterious calls to @sink()/@sideeffect() peppered throughout the test cases. They are no longer needed to force splitting to occur. llvm-svn: 351480
* [SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlinkWei Mi2019-01-172-5/+24
| | | | | | | | | | | | | If the sample profile has no inlining hierachy information included, we call the sample profile is flattened. For flattened profile, in ThinLTO postlink phase, SampleProfileLoader's hot function inlining and profile annotation will do nothing, so it is better to save the effort to read in the profile and run the sample profile loader pass. It is helpful for reducing compile time when the flattened profile is huge. Differential Revision: https://reviews.llvm.org/D54819 llvm-svn: 351476
* [InstCombine] Don't sink dynamic allocasReid Kleckner2019-01-171-3/+5
| | | | | | | | | | | | | | | | | | | Summary: InstCombine's sinking algorithm only thinks about memory. It doesn't think about non-memory constraints like stack object lifetime. It can sink dynamic allocas across a stacksave call, which may be used with stackrestore, which can incorrectly reduce the lifetime of the dynamic alloca. Fixes PR40365 Reviewers: hfinkel, efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56872 llvm-svn: 351475
* NFC: Make the copies of the demangler byte-for-byte identicalErik Pilkington2019-01-173-11/+12
| | | | | | | | | | | | | | With this patch, the copies of the files ItaniumDemangle.h, StringView.h, and Utility.h are kept byte-for-byte in sync between libcxxabi and llvm. All differences (namespaces, fallthrough, and unreachable macros) are defined in each copies' DemanglerConfig.h. This patch also adds a script to copy changes from libcxxabi (cp-to-llvm.sh), and a README.txt explaining the situation. Differential revision: https://reviews.llvm.org/D53538 llvm-svn: 351474
* Fix the buildbot failure introduced by r351404Sanjin Sijaric2019-01-171-1/+3
| | | | | | | | | EXPENSIVE_CHECKS buildbots are failing due to r351404. Add x1 as live in to the funclet basic block for SEH funclets, as well as -verify-machineinstrs to the test case that triggered the failure. llvm-svn: 351472
* [WebAssembly] Fixed objdump not parsing function headers.Wouter van Oortmerssen2019-01-176-19/+86
| | | | | | | | | | | | | | | | | | | | | | | Summary: objdump was interpreting the function header containing the locals declaration as instructions. To parse these without injecting target specific code in objdump, MCDisassembler::onSymbolStart was added to be implemented by the WebAssembly implemention. WasmObjectFile now returns a code offset for the "address" of a symbol, rather than the index. This is also more in-line with what other targets do. Also ensured that the AsmParser correctly puts each function in its own segment to enable this test case. Reviewers: sbc100, dschuff Subscribers: jgravelle-google, aheejin, sunfish, rupprecht, llvm-commits Differential Revision: https://reviews.llvm.org/D56684 llvm-svn: 351460
* Revert "[ThinLTO] Add summary entries for index-based WPD"Teresa Johnson2019-01-179-463/+25
| | | | | | | | Mistaken commit of something still under review! This reverts commit r351453. llvm-svn: 351455
* [ThinLTO] Add summary entries for index-based WPDTeresa Johnson2019-01-179-25/+463
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The index-based WPD will be sent as a follow-on. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 351453
* Move demangling function from llvm-objdump to Demangle libraryJames Henderson2019-01-172-0/+31
| | | | | | | | | | | | | | | | This allows it to be used in an upcoming llvm-readobj change. A small change in internal behaviour of the function is to always call the microsoftDemangle function if the string does not have an itanium encoding prefix, rather than only if it starts with '?'. This is harmless because the microsoftDemangle function does the same check already. Reviewed by: grimar, erik.pilkington Differential Revision: https://reviews.llvm.org/D56721 llvm-svn: 351448
* [LoopSimplifyCFG] Form LCSSA when a parent loop becomes a siblingMax Kazantsev2019-01-171-0/+9
| | | | | | | | | | | | | During the transforms in LoopSimplifyCFG, when we remove a dead exiting edge, the parent loop may stop being reachable from the child loop, and therefore they become siblings. If the former child loop had uses of some values from its former parent loop, now such uses will require LCSSA Phis, even if they weren't needed before. So we must form LCSSA for all loops that stopped being ancestors of the current loop in this case. Differential Revision: https://reviews.llvm.org/D56144 Reviewed By: fedor.sergeev llvm-svn: 351434
* [LoopSimplifyCFG] Fix order of deletion of complex dead subloopsMax Kazantsev2019-01-171-2/+3
| | | | | | | | | | | | | | | | | Function `DeleteDeadBlock` requires that all predecessors of a block being deleted have already been deleted, with the exception of a single-block loop. When we use it for removal of dead subloops that contain more than one block, we may not fulfull this requirement and fail an assertion. This patch replaces invocation of `DeleteDeadBlock` with a generalized version `DeleteDeadBlocks` that is able to deal with multiple dead blocks, even if they contain some cycles. Differential Revision: https://reviews.llvm.org/D56121 Reviewed By: fedor.sergeev llvm-svn: 351433
* Allow FP types for atomicrmw xchgMatt Arsenault2019-01-178-9/+101
| | | | llvm-svn: 351427
* [MC] Remove unused variableBenjamin Kramer2019-01-171-1/+0
| | | | llvm-svn: 351426
* Fix capitalization. NFCDiana Picus2019-01-171-6/+6
| | | | llvm-svn: 351425
* [ARM GlobalISel] Allow calls to varargs functionsDiana Picus2019-01-171-3/+4
| | | | | | | | | Allow varargs functions to be called, both in arm and thumb mode. This boils down to choosing the correct calling convention, which we can easily test by making sure arm_aapcscc is used instead of arm_aapcs_vfpcc when the callee is variadic. llvm-svn: 351424
* [RISCV] Add codegen support for RV64AAlex Bradbury2019-01-173-48/+230
| | | | | | | | | | | | | | | | | | | | | In order to support codegen RV64A, this patch: * Introduces masked atomics intrinsics for atomicrmw operations and cmpxchg that use the i64 type. These are ultimately lowered to masked operations using lr.w/sc.w, but we need to use these alternate intrinsics for RV64 because i32 is not legal * Modifies RISCVExpandPseudoInsts.cpp to handle PseudoAtomicLoadNand64 and PseudoCmpXchg64 * Modifies the AtomicExpandPass hooks in RISCVTargetLowering to sext/trunc as needed for RV64 and to select the i64 intrinsic IDs when necessary * Adds appropriate patterns to RISCVInstrInfoA.td * Updates test/CodeGen/RISCV/atomic-*.ll to show RV64A support This ends up being a fairly mechanical change, as the logic for RV32A is effectively reused. Differential Revision: https://reviews.llvm.org/D53233 llvm-svn: 351422
* [ARM64][Windows] Share unwind codes between epiloguesSanjin Sijaric2019-01-171-2/+51
| | | | | | | | | | | | | | There are cases where we have multiple epilogues that have the exact same unwind code sequence. In that case, the epilogues can share the same unwind codes in the .xdata section. This should get us past the assert "SEH unwind data splitting not yet implemented" in many cases. We still need to add support for generating multiple .pdata/.xdata sections for those functions that need to be split into fragments. Differential Revision: https://reviews.llvm.org/D56813 llvm-svn: 351421
* [NFC] Factor out some local varsMax Kazantsev2019-01-171-7/+9
| | | | llvm-svn: 351416
* [WebAssembly] Parse llvm.ident into producers sectionThomas Lively2019-01-174-17/+123
| | | | llvm-svn: 351413
* [MergeFunc] Prevent silent miscompile of vararg functionsVedant Kumar2019-01-171-1/+7
| | | | | | | | | | | | The function merging pass miscompiles identical vararg functions. The forwarding thunk it emits doesn't forward the full variable-length list of arguments. Disable merging for vararg functions for now. I've filed llvm.org/PR40345 to track the issue. rdar://47326238 llvm-svn: 351411
* Revert "[WebAssembly] Parse llvm.ident into producers section"Thomas Lively2019-01-174-123/+17
| | | | | | This reverts commit eccdbba3a02a33e13b5262e92200a33e2ead873d. llvm-svn: 351410
* [FunctionComparator] Consider tail call kindsVedant Kumar2019-01-171-22/+11
| | | | | | | | | | | Essentially, do not treat `call` and `musttail call` as the same thing. As a drive-by, fold CallInst and InvokeInst handling together using the CallSite helper. Differential Revision: https://reviews.llvm.org/D56815 llvm-svn: 351405
* [SEH] [ARM64] Retrieve the frame pointer from SEH funcletsSanjin Sijaric2019-01-171-0/+11
| | | | | | | The Windows ARM64 runtime passes the establisher frame to funclets as the first argument. llvm-svn: 351404
* [WebAssembly] Parse llvm.ident into producers sectionThomas Lively2019-01-164-17/+123
| | | | | | | | | | | | | | Summary: Everything before the word "version" is the tool, and everything after the word "version" is the version. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56742 llvm-svn: 351399
* Fix a mistake in rL351392.Wei Mi2019-01-161-1/+1
| | | | | | PGOInstrGen should be initialized to "" instead of false. llvm-svn: 351397
* [AsmPrinter] Collapse .loc 0 0 directivesJonas Devlieghere2019-01-161-1/+1
| | | | | | | | | | | | | | | | Currently we do not always collapse subsequent .loc 0 0 directives. The reason is that we were checking for a PrevInstLoc which is not set when we emit a line-0 record. We should only check the LastAsmLine, which seems to be created exactly for this purpose. // When we emit a line-0 record, we don't update PrevInstLoc; so look at // the last line number actually emitted, to see if it was line 0. unsigned LastAsmLine = Asm->OutStreamer->getContext().getCurrentDwarfLoc().getLine(); Differential revision: https://reviews.llvm.org/D56767 llvm-svn: 351395
* [PGO] Make pgo related options in opt more consistent.Wei Mi2019-01-161-17/+4
| | | | | | | | | | | | | | | Currently we have pgo options defined in PassManagerBuilder.cpp only for instrument pgo, but not for sample pgo. We also have pgo options defined in NewPMDriver.cpp in opt only for new pass manager and for all kinds of pgo. They have some inconsistency. To make the options more consistent and make tests writing easier, the patch let old pass manager to share the same pgo options with new pass manager in opt, and removes the options in PassManagerBuilder.cpp. Differential Revision: https://reviews.llvm.org/D56749 llvm-svn: 351392
* [WebAssembly] Remove expected failure from known_gcc_test_failures.txt. NFC.Sam Clegg2019-01-161-1/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D56809 llvm-svn: 351388
* [X86] Sink complex MCU CC helper to .cpp file from .h file, NFCReid Kleckner2019-01-162-57/+59
| | | | llvm-svn: 351384
* [X86] Add X86ISD::VSHLV and X86ISD::VSRLV nodes for psllv and psrlvCraig Topper2019-01-166-60/+95
| | | | | | | | | | | | | | Previously we used ISD::SHL and ISD::SRL to represent these in SelectionDAG. ISD::SHL/SRL interpret an out of range shift amount as undefined behavior and will constant fold to undef. While the intrinsics are defined to return 0 for out of range shift amounts. A previous patch added a special node for VPSRAV to produce all sign bits. This was previously believed safe because undefs frequently get turned into 0 either from the constant pool or a desire to not have a false register dependency. But undef is treated specially in some optimizations. For example, its ignored in detection of vector splats. So if the ISD::SHL/SRL can be constant folded and all of the elements with in bounds shift amounts are the same, we might fold it to single element broadcast from the constant pool. This would not put 0s in the elements with out of bounds shift amounts. We do have an existing InstCombine optimization to use shl/lshr when the shift amounts are all constant and in bounds. That should prevent some loss of constant folding from this change. Patch by zhutianyang and Craig Topper Differential Revision: https://reviews.llvm.org/D56695 llvm-svn: 351381
* [X86] Use X86ISD::BLENDV for blendv intrinsics. Replace vselect with blendv ↵Craig Topper2019-01-166-89/+87
| | | | | | | | | | | | just before isel table lookup. Remove vselect isel patterns. This cleans up the duplication we have with both intrinsic isel patterns and vselect isel patterns. This should also allow the intrinsics to get SimplifyDemandedBits support for the condition. I've switched the canonical pattern in isel to use the X86ISD::BLENDV node instead of VSELECT. Since it always seemed weird to move from BLENDV with its relaxed rules on condition bits to VSELECT which has strict rules about all bits of the condition element being the same. Its more correct to go from VSELECT to BLENDV. Differential Revision: https://reviews.llvm.org/D56771 llvm-svn: 351380
* AMDGPU: Adjust the chain for loads writing to the HI part of a register.Changpeng Fang2019-01-161-0/+45
| | | | | | | | | | | | | | Summary: For these loads that write to the HI part of a register, we should chain them to the op that writes to the LO part of the register to maintain the appropriate order. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D56454 llvm-svn: 351379
* [X86] Add a one use check to the setcc inversion code in ↵Craig Topper2019-01-161-2/+2
| | | | | | | | | | combineVSelectWithAllOnesOrZeros If we're going to generate a new inverted setcc, we should make sure we will be able to remove the old setcc. Differential Revision: https://reviews.llvm.org/D56765 llvm-svn: 351378
* [COFF, ARM64] Implement support for SEH extensions __try/__except/__finallyMandeep Singh Grang2019-01-167-9/+90
| | | | | | | | | | | | | | | | | Summary: This patch supports MS SEH extensions __try/__except/__finally. The intrinsics localescape and localrecover are responsible for communicating escaped static allocas from the try block to the handler. We need to preserve frame pointers for SEH. So we create a new function/property HasLocalEscape. Reviewers: rnk, compnerd, mstorsjo, TomTan, efriedma, ssijaric Reviewed By: rnk, efriedma Subscribers: smeenai, jrmuizel, alex, majnemer, ssijaric, ehsan, dmajor, kristina, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53540 llvm-svn: 351370
* [Hexagon] Do not promote terminator instructions in Hexagon loop idiomsKrzysztof Parzyszek2019-01-161-1/+3
| | | | llvm-svn: 351369
* [X86][BtVer2] Update latency of horizontal operations.Andrea Di Biagio2019-01-161-4/+6
| | | | | | | | | | | | | | | | On Jaguar, horizontal adds/subs have local forwarding disable. That means, we pay a compulsory extra cycle of write-back stage, and the value is not available until the end of that stage. This patch changes the latency of horizontal operations by adding an extra cycle. With this patch, latency numbers now match what is reported by perf. I plan to send another patch to also 'fix' the latency of shuffle operations (on Jaguar, local forwarding is disabled for vector shuffles too). Differential Revision: https://reviews.llvm.org/D56777 llvm-svn: 351366
* [X86] getFauxShuffleMask - bail for non-byte aligned shuffle typesSimon Pilgrim2019-01-161-2/+2
| | | | | | | | Remove the existing assertion and just return false for unexpected shuffle value types (<X x i1> mainly....). Found while updating combineX86ShufflesRecursively to run within SimplifyDemandedVectorElts/SimplifyDemandedBits. llvm-svn: 351365
* [DebugInfo] Allow creation of DBG_VALUEs in blocks where the operand is not usedJeremy Morse2019-01-161-5/+6
| | | | | | | | | | | | | dbg.value intrinsics can appear in blocks where their operand is not used, meaning the operand never receives an SDNode, and thus no DBG_VALUE will be created. Get around this by looking to see whether the operand has already been allocated a virtual register. This allows dbg.values of Phi node and Values that are used across basic blocks to successfully be translated into DBG_VALUEs. Differential Revision: https://reviews.llvm.org/D56678 llvm-svn: 351358
* [X86] Add combineX86ShufflesRecursively helper. NFCI.Simon Pilgrim2019-01-161-23/+15
| | | | | | | | | | combineX86ShufflesRecursively is pretty cumbersome with a lot of arguments that only matter later in recursion. This commit adds a wrapper version that only takes the initial root Op to simplify calls that don't need to worry about these. An early, cleanup step towards merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts/SimplifyDemandedBits. llvm-svn: 351352
* AMDGPU: Add llvm.amdgcn.ds.ordered.add & swapMarek Olsak2019-01-1612-11/+119
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52944 llvm-svn: 351351
* [SLP] Fix PR40310: The reduction nodes should stay scalar.Alexey Bataev2019-01-161-1/+2
| | | | | | | | | | | | | | | | | Summary: Sometimes the SLP vectorizer tries to vectorize the horizontal reduction nodes during regular vectorization. This may happen inside of the loops, when there are some vectorizable PHIs. Patch fixes this by checking if the node is the reduction node and thus it must not be vectorized, it must be gathered. Reviewers: RKSimon, spatel, hfinkel, fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56783 llvm-svn: 351349
* [x86] lower shuffle of extracts to AVX2 vperm instructionsSanjay Patel2019-01-161-22/+91
| | | | | | | | | | | | | | | | I was trying to prevent shuffle regressions while matching more horizontal ops and ended up here: shuf (extract X, 0), (extract X, 4), Mask --> extract (shuf X, undef, Mask'), 0 The affected tests were added for: https://bugs.llvm.org/show_bug.cgi?id=34380 This patch won't change the examples in the bug report itself, but we should be able to extend this to catch more types. Differential Revision: https://reviews.llvm.org/D56756 llvm-svn: 351346
* [MSP430] Emit a separate section for every interrupt vectorAnton Korobeynikov2019-01-161-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | This is LLVM part of D56663 Linker scripts shipped by TI require to have every interrupt vector in a separate section with a specific name: SECTIONS { __interrupt_vector_XX : { KEEP (*(__interrupt_vector_XX )) } > VECTXX ... } Follow the requirement emit the section for every vector which contain address of interrupt handler: .section __interrupt_vector_XX,"ax",@progbits .word %isr% Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56664 llvm-svn: 351345
* Assertion in isAllocaPromotable due to extra bitcast goes into lifetime markerGabor Buella2019-01-161-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For the given test SROA detects possible replacement and creates a correct alloca. After that SROA is adding lifetime markers for this new alloca. The function getNewAllocaSlicePtr is trying to deduce the pointer type based on the original alloca, which is split, to use it later in lifetime intrinsic. For the test we ended up with such code (rA is initial alloca [10 x float], which is split, and rA.sroa.0.0 is a new split allocation) ``` %rA.sroa.0.0.rA.sroa_cast = bitcast i32* %rA.sroa.0 to [10 x float]* <----- this one causing the assertion and is an extra bitcast %5 = bitcast [10 x float]* %rA.sroa.0.0.rA.sroa_cast to i8* call void @llvm.lifetime.start.p0i8(i64 4, i8* %5) ``` isAllocaPromotable code assumes that a user of alloca may go into lifetime marker through bitcast but it must be the only one bitcast to i8* type. In the test it's not a i8* type, return false and throw the assertion. As we are creating a pointer, which will be used in lifetime markers only, the proposed fix is to create a bitcast to i8* immediately to avoid extra bitcast creation. The test is a greatly simplified to just reproduce the assertion. Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com> Reviewers: chandlerc, craig.topper Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D55934 llvm-svn: 351325
* [MSan] Apply the ctor creation scheme of TSanPhilip Pfaffe2019-01-161-1/+23
| | | | | | | | | | | | Summary: To avoid adding an extern function to the global ctors list, apply the changes of D56538 also to MSan. Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan Subscribers: hiraditya, bollu, llvm-commits Differential Revision: https://reviews.llvm.org/D56734 llvm-svn: 351322
* [SelectionDAG] Update check in createOperands to reflect max() is a valid value.Florian Hahn2019-01-161-1/+1
| | | | | | | | | | | | | | | The value returned by max() is the last valid value, adjust the comparison accordingly. The code added in D55073 creates TokenFactors with max() operands. Reviewers: aemerson, efriedma, RKSimon, craig.topper Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D56738 llvm-svn: 351318
* [Support] Remove error return value from one overload of fs::make_absolutePavel Labath2019-01-162-18/+18
| | | | | | | | | | | | | | | | | | | | | | Summary: The version of make_absolute which accepted a specific directory to use as the "base" for the computation could never fail, even though it returned a std::error_code. The reason for that seems to be historical -- the CWD flavour (which can fail due to failure to retrieve CWD) was there first, and the new version was implemented by extending that. This removes the error return value from the non-CWD overload and reimplements the CWD version on top of that. This enables us to remove some dead code where people were pessimistically trying to handle the errors returned from this function. Reviewers: zturner, sammccall Subscribers: hiraditya, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D56599 llvm-svn: 351317
* [NewPM][TSan] Reiterate the TSan portPhilip Pfaffe2019-01-165-35/+88
| | | | | | | | | | | | | | | | | | | Summary: Second iteration of D56433 which got reverted in rL350719. The problem in the previous version was that we dropped the thunk calling the tsan init function. The new version keeps the thunk which should appease dyld, but is not actually OK wrt. the current semantics of function passes. Hence, add a helper to insert the functions only on the first time. The helper allows hooking into the insertion to be able to append them to the global ctors list. Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan Subscribers: hiraditya, bollu, llvm-commits Differential Revision: https://reviews.llvm.org/D56538 llvm-svn: 351314
* [DAGCombine] Fix ReduceLoadWidth for shifted offsetsSam Parker2019-01-161-12/+8
| | | | | | | | | | | | ReduceLoadWidth can trigger using a shifted mask is used and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 351310
OpenPOWER on IntegriCloud