summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [RISCV] atomic_store_nn have a different layout to regular storeRoger Ferrer Ibanez2018-08-271-4/+15
| | | | | | | | | | | We cannot directy reuse the patterns of StPat because for some reason the store DAG node and the atomic_store_nn DAG nodes put the ptr and the value in different positions. Currently we attempt to store the address to an address formed by the value. Differential Revision: https://reviews.llvm.org/D51217 llvm-svn: 340722
* Fix this file to have the necessary standard library includes and useChandler Carruth2018-08-271-2/+2
| | | | | | the `std::` namespace. Should fix a number of build bots as well. llvm-svn: 340721
* [X86] Cleanup the LowerMULH code by hoisting some commonalities between the ↵Craig Topper2018-08-271-24/+20
| | | | | | | | | | vXi32 and vXi8 handling. NFCI vXi32 support was recently moved from LowerMUL_LOHI to LowerMULH. This commit shares the getOperand calls, switches both to use common IsSigned flag, and hoists the NumElems/NumElts variable. llvm-svn: 340720
* [MS Demangler] Add virtual destructor.Zachary Turner2018-08-271-0/+1
| | | | | | Silence -Wnon-virtual-dtor. llvm-svn: 340711
* [MS Demangler] Re-write the Microsoft demangler.Zachary Turner2018-08-274-1824/+2181
| | | | | | | | | | | | | | | | | | | | This is a pretty large refactor / re-write of the Microsoft demangler. The previous one was a little hackish because it evolved as I was learning about all the various edge cases, exceptions, etc. It didn't have a proper AST and so there was lots of custom handling of things that should have been much more clean. Taking what was learned from that experience, it's now re-written with a completely redesigned and much more sensible AST. It's probably still not perfect, but at least it's comprehensible now to someone else who wants to come along and make some modifications or read the code. Incidentally, this fixed a couple of bugs, so I've enabled the tests which now pass. llvm-svn: 340710
* [X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.Craig Topper2018-08-261-0/+1
| | | | | | | | | | | | | | Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51267 llvm-svn: 340708
* [X86] Add FeatureCMOV explicitly to all CPUs that support it. Remove ↵Craig Topper2018-08-262-17/+34
| | | | | | | | | | | | | | | | | | | | | | | | | FeatureCMOV implication from Feature64Bit and FeatureSSE1 Summary: Previously most CPUs inherited cmov support through Feature64Bit(or FeatureCMPXCHG16HB implying Feature64Bit) or FeatureSSE1. This has the surprising side effect that -mattr=-cmov causes an assert to fire in 64-bit mode because it clears the Feature64Bit. Or in 32-bit mode, -mattr=-cmov disables any sse/avx features which seems surprising. This patch removes the implication and instead updates hasCMOV in X86Subtarget to check SSE1 or is64Bit in addition to the regular cmov flag. This should keep most things working the way they did before. I don't believe there is a way to specific "-cmov" directly from clang so this should only effect our lower level tools. This does stop -mattr=cx16(cmpxchg16b) from implying cmov is enabled via the 64bit flag as you can see from one of the changed tests. But that was a 32-bit test so I don't know why it enabled cx16 anyway. For the other test I had to add -sse to override the new sse check in hasCMOV. Reviewers: RKSimon, DavidKreitzer, spatel Reviewed By: RKSimon Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D51228 llvm-svn: 340707
* [X86] Add FeatureCMOV to athlon and athlon-tbird cpus.Craig Topper2018-08-261-1/+1
| | | | | | | | | | | | | | Summary: This matches gcc and one cpuid dump I found online. Given that these are considered 7th generation x86 CPU it seems likely they support cmov since cmov was added by Intel in their 6th generation. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51264 llvm-svn: 340706
* [SelectionDAG][x86] turn insertelement into undef with variable index into splatSanjay Patel2018-08-263-3/+18
| | | | | | | | | | | | | | | | | | I noticed this along with the patterns in D51125, but when the index is variable, we don't convert insertelement into a build_vector. For x86, that means these get expanded at legalization time into the loading/spilling code that we see in the tests. I think it's always better to avoid going to memory on these, and we get the optimal 'broadcast' if it's available. I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have regression tests that would be affected (although I did not check what would happen in those cases). In the most basic cases shown here, AArch64 would probably do much better with a splat. Differential Revision: https://reviews.llvm.org/D51186 llvm-svn: 340705
* [ORC] Do not include non-global symbols in getObjectSymbolFlags.Lang Hames2018-08-261-11/+16
| | | | | | | | | | | Private symbols are not visible outside the object file, and so not defined by the object file from ORC's perspective. No test case yet. Ideally this would be a unit test parsing a checked-in binary, but I am not aware of any way to reference the LLVM source root from a unit test. llvm-svn: 340703
* [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.Chandler Carruth2018-08-2632-46/+47
| | | | | | | | | | | | This is a bit awkward in a handful of places where we didn't even have an instruction and now we have to see if we can build one. But on the whole, this seems like a win and at worst a reasonable cost for removing `TerminatorInst`. All of this is part of the removal of `TerminatorInst` from the `Instruction` type hierarchy. llvm-svn: 340701
* Avoid specializing a variadic member template in a way that seems to notChandler Carruth2018-08-261-15/+13
| | | | | | | | | agree with MSVC. There isn't actually a need for specialization here as we can write the code generically and just have a test that will fold away as a constant. llvm-svn: 340700
* [IR] Sink `isExceptional` predicate to `Instruction`, rename it toChandler Carruth2018-08-267-7/+7
| | | | | | | | | | | `isExceptionalTermiantor` and implement it for opcodes as well following the common pattern in `Instruction`. Part of removing `TerminatorInst` from the `Instruction` type hierarchy to make it easier to share logic and interfaces between instructions that are both terminators and not terminators. llvm-svn: 340699
* [IR] Begin removal of TerminatorInst by removing successor manipulation.Chandler Carruth2018-08-2614-58/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The core get and set routines move to the `Instruction` class. These routines are only valid to call on instructions which are terminators. The iterator and *generic* range based access move to `CFG.h` where all the other generic successor and predecessor access lives. While moving the iterator here, simplify it using the iterator utilities LLVM provides and updates coding style as much as reasonable. The APIs remain pointer-heavy when they could better use references, and retain the odd behavior of `operator*` and `operator->` that is common in LLVM iterators. Adjusting this API, if desired, should be a follow-up step. Non-generic range iteration is added for the two instructions where there is an especially easy mechanism and where there was code attempting to use the range accessor from a specific subclass: `indirectbr` and `br`. In both cases, the successors are contiguous operands and can be easily iterated via the operand list. This is the first major patch in removing the `TerminatorInst` type from the IR's instruction type hierarchy. This change was discussed in an RFC here and was pretty clearly positive: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html There will be a series of much more mechanical changes following this one to complete this move. Differential Revision: https://reviews.llvm.org/D47467 llvm-svn: 340698
* [MIPS GlobalISel] Legalize i8 and i16 addPetar Jovanovic2018-08-261-1/+3
| | | | | | | | | | | | Legalize G_ADD for types smaller than i32. LegalizationArtifactCombiner replaces extend instructions with appropriate bitwise instructions. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D51213 llvm-svn: 340697
* [X86] Fix typo in comment, expect->except. NFCCraig Topper2018-08-261-1/+1
| | | | llvm-svn: 340695
* [C-API][DIBuilder] Use NameLen in LLVMDIBuilderCreateParameterVariableRobert Widmann2018-08-251-1/+1
| | | | | | | | | | | | | | Summary: NameLen wasn't being used and caused the parameters in gdb to very long, in my case, crashes in others. Please also perform the correct magical incarnations to have this be applied to the LLVM 7 branch. Reviewers: whitequark, CodaFi Reviewed By: CodaFi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51141 llvm-svn: 340691
* [X86] Replace support for vXi32 SMUL_LOHI/UMUL_LOHI with MULHS/MULHU support ↵Craig Topper2018-08-251-108/+75
| | | | | | | | | | | | | | | | | | | instead. Summary: The only time vector SMUL_LOHI/UMUL_LOHI nodes are created is during division/remainder lowering. If its created before op legalization, generic DAGCombine immediately turns that SMUL_LOHI/UMUL_LOHI into a MULHS/MULHU since only the upper half is used. That node will stick around through vector op legalization and will be turned back into UMUL_LOHI/SMUL_LOHI during op legalization. It will then be custom lowered by the X86 backend. Due to this two step lowering the vector shuffles created by the custom lowering get legalized after their inputs rather than before. This prevents the shuffles from being combined with any build_vector of constants. This patch uses changes vXi32 to use MULHS/MULHU instead. This is what the later DAG combine did anyway. But by skipping the change back to UMUL_LOHI/SMUL_LOHI we lower it before any constant BUILD_VECTORS. This allows the vector_shuffle creation to constant fold with the build_vectors. This accounts for the test changes here. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51254 llvm-svn: 340690
* [SelectionDAG][X86] Reorder the operands the MaskedStoreSDNode to put the ↵Craig Topper2018-08-257-52/+34
| | | | | | | | | | | | | | | | | | | | | value first. Summary: Previously the value being stored is the last operand in SDNode. This causes the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions. This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly. X86 is currently the only in tree target that uses this SDNode. Not sure if there are any users out of tree. Reviewers: RKSimon, delena, hfinkel, eli.friedman Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50402 llvm-svn: 340689
* [X86] Make sure type is a vector before calling VT.getVectorNumElements() in ↵Craig Topper2018-08-251-1/+1
| | | | | | | | combineLoopMAddPattern Fixes PR38700. llvm-svn: 340688
* Fix -Wunused-function warning. NFCI.Simon Pilgrim2018-08-251-9/+0
| | | | llvm-svn: 340687
* Remove superfluous semicolon. NFCI.Simon Pilgrim2018-08-251-1/+1
| | | | llvm-svn: 340686
* [x86] try harder to use broadcast to load a scalar into vector regSanjay Patel2018-08-251-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preliminary step for a preliminary step for D50992. I noticed that x86 often misses chances to load a scalar directly into a vector register. So this patch is just allowing more of those cases to match a broadcast op in lowerBuildVectorAsBroadcast(). The old code comment said it doesn't make sense to use a broadcast when we're loading a single element and everything else is undef, but I think that's the best case in the improved tests in insert-loaded-scalar.ll. We avoid scalar-to-vector-register move and/or less efficient shuffling. Note that there are some existing types that were already producing a broadcast, but that happens semi-accidentally. Ie, it's not happening as part of lowerBuildVectorAsBroadcast(). The build vector gets expanded into load + shuffle, and then shuffle lowering produces the broadcast. Description of the other test diffs: 1. avx-basic.ll - replacing load+shufle is a win. 2. sse3-avx-addsub-2.ll - vmovddup vs. vbroadcastss is neutral 3. sse41.ll - don't care - we convert that intrinsic to generic IR now, so this test is deprecated 4. vector-shuffle-128-v8.ll / vector-shuffle-256-v16.ll - pshufb alternatives with an extra instruction are not obviously bad Differential Revision: https://reviews.llvm.org/D51125 llvm-svn: 340685
* [AMDGPU] Add support for multi-dword s.buffer.load intrinsicTim Renouf2018-08-257-25/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Patch by Marek Olsak and David Stuttard, both of AMD. This adds a new amdgcn intrinsic supporting s.buffer.load, in particular multiple dword variants. These are convenient to use from some front-end implementations. Also modified the existing llvm.SI.load.const intrinsic to common up the underlying implementation. This modification also requires that we can lower to non-uniform loads correctly by splitting larger dword variants into sizes supported by the non-uniform versions of the load. V2: Addressed minor review comments. V3: i1 glc is now i32 cachepolicy for consistency with buffer and tbuffer intrinsics, plus fixed formatting issue. V4: Added glc test. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D51098 Change-Id: I83a6e00681158bb243591a94a51c7baa445f169b llvm-svn: 340684
* [CodeGen] Set FrameSetup/FrameDestroy on BUNDLE instructionsBjorn Pettersson2018-08-251-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If any of the bundled instructions are marked as FrameSetup or FrameDestroy, then that property is set on the BUNDLE instruction as well. As long as the scheduler/packetizer aren't mixing prologue/epilogue instructions (i.e. all the bundled instructions have the same property) then this simply gives the bundle the correct property (so when using a bundle iterator in late passes a bundle will be correctly identified as FrameSetup/FrameDestroy). When for example bundling a mix of FrameSetup instructions with non-FrameSetup instructions it could be discussed if the bundle should have the property or not. The choice here has been to set these properties on the BUNDLE instruction if any of the bundled instructions have the property set. Reviewers: #debug-info, kparzysz Reviewed By: kparzysz Subscribers: vsk, thegameg, llvm-commits Differential Revision: https://reviews.llvm.org/D50637 llvm-svn: 340680
* [LiveDebugVariables] Avoid faulty addDefsFromCopies in computeIntervalsBjorn Pettersson2018-08-251-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When computeIntervals is looking through COPY instruction to extend the location mapping for a debug variable it did not handle subregisters correctly. For example DBG_VALUE debug-use %0.sub_8bit_hi, ... %1:gr16 = COPY %0 was transformed into DBG_VALUE debug-use %0.sub_8bit_hi, ... %1:gr16 = COPY %0 DBG_VALUE debug-use %1, ... So the subregister index was missing in the added DBG_VALUE. As long as the subreg refered to the least significant bits of the superreg, then I guess we could get the correct result in a debugger even when referring to the superreg. But as in the example above when the subreg refers to other parts of the superreg, then debuginfo would be incorrect. I'm not sure exactly how to fix this properly, so this patch just avoids looking through the COPY when there is a subreg involved (for more info, see the FIXME added in the code). Reviewers: rnk, aprantl Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D50788 llvm-svn: 340679
* [MC, RISCV] Fixed StringRef Assertion `Index < Length && "Invalid index!"'Ana Pazos2018-08-251-1/+1
| | | | | | | | | | | | | | | | | | Summary: Handle the case IDVal is an empty string. This bug was uncovered by a LLVM MC Assembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: rnk Reviewed By: rnk Subscribers: rnk, niravd, pcc, peter.smith, asb, grosbach, llvm-commits, bcain, kito-cheng, shiva0217, rogfer01, PkmX Differential Revision: https://reviews.llvm.org/D50808 llvm-svn: 340678
* This patch adds support to LLVM for writing HermitCore ↵Eric Christopher2018-08-251-0/+2
| | | | | | | | | | | (https://hermitcore.org) ELF binaries. HermitCore is a POSIX-compatible kernel for running a single application in an isolated environment to get maximum performance and predictable runtime behavior. It can either be used bare-metal on hardware or a VM (Unikernel) or side by side to an existing Linux system (Multikernel). Due to the latter feature, HermitCore binaries are marked with ELFOSABI_STANDALONE to let the Linux ELF loader distinguish them from regular Unix/Linux binaries and load them using the HermitCore "proxy" tool. Patch by Colin Finck! llvm-svn: 340675
* [RISCV] Fixed Assertion`Kind == Immediate && "Invalid type access!"' failed.Ana Pazos2018-08-241-0/+18
| | | | | | | | | | | | | | | | | | Summary: Missing check for isImm() in some Immediate classes. This bug was uncovered by a LLVM MC Assembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: hiraditya, asb Reviewed By: hiraditya, asb Subscribers: llvm-commits, hiraditya, kito-cheng, shiva0217, rkruppe, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei Differential Revision: https://reviews.llvm.org/D50797 llvm-svn: 340674
* Prevent DILocation::getMergedLocation() from creating invalid metadata.Adrian Prantl2018-08-241-0/+5
| | | | | | | | | | | | | | | | The function's new implementation from r340583 had a bug in it that could cause an invalid scope to be generated when merging two DILocations with no common ancestor scope. This patch detects this situation and picks the scope of the first location. This is not perfect, because the scope is misleading, but on the other hand, this will be a line 0 location. rdar://problem/43687474 Differential Revision: https://reviews.llvm.org/D51238 llvm-svn: 340672
* Allow demangler's node allocator to fail, and bail out of the entireRichard Smith2018-08-241-5/+21
| | | | | | | | | | | | | | | | demangling process when it does. Use this to support a "lookup" query for the mangling canonicalizer that does not create new nodes. This could also be used to implement demangling with a fixed-size temporary storage buffer. Reviewers: erik.pilkington Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51003 llvm-svn: 340670
* [RISCV] Fix std::advance slownessAna Pazos2018-08-241-2/+1
| | | | | | | | | | | | | | | | | | Summary: It seems std::advance template is treating "-MFI.getCalleeSavedInfo().size()" as a large unsigned value", causing slowness. Thanks to Henrik Gustafsson for reporting the issue. Reviewers: asb Reviewed By: asb Subscribers: llvm-commits, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D51148 llvm-svn: 340669
* Add data structure to form equivalence classes of mangled names.Richard Smith2018-08-242-0/+308
| | | | | | | | | | | | | | | Summary: Given a set of equivalent name fragments, this mechanism determines whether two mangled names are equivalent. The intent is to use this for fuzzy matching of profile data against the program after certain refactorings are performed. Reviewers: erik.pilkington, dlj Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D50935 llvm-svn: 340663
* [PGO] add target md5sum in warning message for icallXinliang David Li2018-08-241-1/+2
| | | | | | Differential revision: http://reviews.llvm.org/D51193 llvm-svn: 340657
* DAG: Allow matching fminnum/fmaxnum from vselectMatt Arsenault2018-08-241-8/+27
| | | | llvm-svn: 340655
* Use unique_ptr to hold MCInstrInfoVitaly Buka2018-08-241-1/+2
| | | | llvm-svn: 340654
* Verifier: verify that a DILocation's scope is a DILocalScope.Adrian Prantl2018-08-241-0/+4
| | | | | | | | This fixes an assertion failure(!) in the Verifier. rdar://problem/43687474 llvm-svn: 340653
* [SafeStack] Set debug location for calls to __safestack_pointer_address.Eli Friedman2018-08-241-0/+4
| | | | | | | | | | | | Otherwise, the debug info is incorrect. On its own, this is mostly harmless, but the safe-stack also later inlines the call to __safestack_pointer_address, which leads to debug info with the wrong scope, which eventually causes an assertion failure (and incorrect debug info in release mode). Differential Revision: https://reviews.llvm.org/D51075 llvm-svn: 340651
* CodeGen: Add two more conditions for adding symbols to the ↵Peter Collingbourne2018-08-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | address-significance table. Firstly, require the symbol to be used within the module. If a symbol is unused within a module, then by definition it cannot be address-significant within that module. This condition is useful on all platforms because it could make symbol tables smaller -- without this change, emitting an address-significance table could cause otherwise unused undefined symbols to be added to the object file. But this change is necessary with COFF specifically in order to preserve the property that an unreferenced undefined symbol in an IR module does not result in a link failure. This is already the case for ELF because ELF linkers only reject links with unresolved symbols if there is a relocation to that symbol, but COFF linkers require all undefined symbols to be resolved regardless of relocations. So if a module contains an unreferenced undefined symbol, we need to make sure not to add it to the address-significance table (and thus the symbol table) in case it doesn't end up resolved at link time. Secondly, do not add dllimport symbols to the table. These symbols won't be able to be resolved because their definitions live in another module and are accessed via the IAT, and the address-significance table has no effect on other modules anyway. It wouldn't make sense to add the IAT entry symbol to the address-significance table either because the IAT entry isn't address-significant -- the generated code never takes its address. Differential Revision: https://reviews.llvm.org/D51199 llvm-svn: 340648
* DebugInfo: Fix skipping CUs in DWARFv5 debug_names tableDavid Blaikie2018-08-241-2/+5
| | | | | | | | | | My previoust test case had skipped CUs from one TU out of a two-TU LTO scenario, which meant the CU index wasn't needed (as it was unambiguous which CU a table entry applied to) - expanding the test to use 3 TUs, skipping one (so long as it's not the last one) shows the indexes are miscomputed. Fix that with a little indirection for the index. llvm-svn: 340646
* [PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar ↵Stefan Pintilie2018-08-241-1/+1
| | | | | | | | | | | | | | | registers for P9 This patch will address using the xscpsgndp instruction to copy floating point scalar registers instead of the xxlor (specifically XXLORf) instruction that is currently used. Additionally, this patch of utilizing xscpsgndp will apply to P9, while pre-P9 will still use xxlor. Patch by amyk Differential Revision: https://reviews.llvm.org/D50004 llvm-svn: 340643
* Use unique_ptr.Joel Galenson2018-08-241-1/+2
| | | | llvm-svn: 340642
* [AST] Simplify code minorly using pattern match [NFC]Philip Reames2018-08-241-8/+4
| | | | llvm-svn: 340638
* [AArch64] Reject inline asm with FP registers when FP is disabled.Eli Friedman2018-08-241-0/+9
| | | | | | | | Otherwise, we would crash trying to deal with an illegal input. Differential Revision: https://reviews.llvm.org/D51202 llvm-svn: 340637
* [Support] Allow discarding a FileOutputBuffer without removing the memory ↵Martin Storsjo2018-08-241-0/+6
| | | | | | | | mapping Differential Revision: https://reviews.llvm.org/D51095 llvm-svn: 340634
* [X86] Teach combineLoopMAddPattern to handle cases where there is no loop ↵Craig Topper2018-08-241-21/+33
| | | | | | | | and the add has two multiply inputs Differential Revision: https://reviews.llvm.org/D50868 llvm-svn: 340631
* [DAGCombiner][Mips] Don't combine bitcast+store after LegalOperations when ↵Craig Topper2018-08-241-1/+1
| | | | | | | | | | | | | | the store is volatile, if the resulting store isn't Legal Previously we allowed the store to be Custom. But without knowing for sure that the Custom handling won't split the store, we shouldn't convert a volatile store. We also probably shouldn't be creating a store the requires custom handling after LegalizeOps. This could lead to an infinite loop if the custom handling was to insert a bitcast. Though I guess isStoreBitCastBeneficial could be used to block such a loop. The test changes here are due to the volatile part of this. The stores in the test are all volatile and i32 stores are marked custom, So we are no longer converting them This is related to D50491 where I was trying to allow some bitcasting of volatile loads Differential Revision: https://reviews.llvm.org/D50578 llvm-svn: 340626
* Revert [Inliner] Attribute callsites with inline remarksDavid Bolvansky2018-08-241-51/+8
| | | | llvm-svn: 340619
* [Inliner] Attribute callsites with inline remarksDavid Bolvansky2018-08-241-8/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites. An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//. For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons: remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6) remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites: define void @test1() { call void @bar(i1 true) #0 call void @bar(i1 false) #2 ret void } attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" } .. attributes #2 = { "inline-remark"="(cost=never): recursive" } Patch by: yrouban (Yevgeny Rouban) Reviewers: xbolva00, tejohnson, apilipenko Reviewed By: xbolva00, tejohnson Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D50435 llvm-svn: 340618
* [LICM] Hoist an invariant_start out of loops if there are no stores executed ↵Philip Reames2018-08-241-1/+3
| | | | | | | | | | before it Once the invariant_start is reached, we know that no instruction *after* it can modify the memory. So, if we can prove the location isn't read *between entry into the loop and the execution of the invariant_start*, we can execute the invariant_start before entering the loop. Differential Revision: https://reviews.llvm.org/D51181 llvm-svn: 340617
OpenPOWER on IntegriCloud