summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] No need to pass DataLayout to helper functions if we're ↵Craig Topper2017-07-061-7/+5
| | | | | | passing the InstCombiner object. We can just ask it for the DataLayout. NFC llvm-svn: 307333
* [InstCombine] Remove unused arguments from some helper functions. NFCCraig Topper2017-07-061-7/+5
| | | | llvm-svn: 307332
* [InstCombine] Change a couple helper functions to only take the IRBuilder as ↵Craig Topper2017-07-061-9/+10
| | | | | | an argument and not the whole InstCombiner object. NFC llvm-svn: 307331
* [ConstHoisting] choose to hoist when frequency is the same.Wei Mi2017-07-061-2/+11
| | | | | | | | | | | | The patch is to adjust the strategy of frequency based consthoisting: Previously when the candidate block has the same frequency with the existing blocks containing a const, it will not hoist the const to the candidate block. For that case, now we change the strategy to hoist the const if only existing blocks have more than one block member. This is helpful for reducing code size. Differential Revision: https://reviews.llvm.org/D35084 llvm-svn: 307328
* [NVPTX] Add lowering of i128 params.Michael Kuperstein2017-07-063-11/+32
| | | | | | | | | | | | | | | | | The patch adds support of i128 params lowering. The changes are quite trivial to support i128 as a "special case" of integer type. With this patch, we lower i128 params the same way as aggregates of size 16 bytes: .param .b8 _ [16]. Currently, NVPTX can't deal with the 128 bit integers: * in some cases because of failed assertions like ValVTs.size() == OutVals.size() && "Bad return value decomposition" * in other cases emitting PTX with .i128 or .u128 types (which are not valid [1]) [1] http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#fundamental-types Differential Revision: https://reviews.llvm.org/D34555 Patch by: Denys Zariaiev (denys.zariaiev@gmail.com) llvm-svn: 307326
* [COFF, AArch64] Set the private label prefix to .LMartin Storsjo2017-07-061-0/+2
| | | | | | | | | | This fixes calls to external functions starting with a capital L, fixing errors like this: fatal error: error in backend: assembler label 'LocalFree' can not be undefined Differential Revision: https://reviews.llvm.org/D35079 llvm-svn: 307317
* AMDGPU: Add macro fusion schedule DAG mutationMatt Arsenault2017-07-064-0/+86
| | | | | | Try to increase opportunities to shrink vcc uses. llvm-svn: 307313
* AMDGPU: Minor cleanup of shrinking logicMatt Arsenault2017-07-061-8/+4
| | | | llvm-svn: 307312
* [AMDGPU] Always use rcp + mul with fast mathStanislav Mekhanoshin2017-07-062-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Regardless of relaxation options such as -cl-fast-relaxed-math we are producing rather long code for fdiv via amdgcn_fdiv_fast intrinsic. This intrinsic is used to replace fdiv with 2.5ulp metadata and does not handle denormals, thus believed to be fast. An fdiv instruction can also have fast math flag either by itself or together with fpmath metadata. Clang used with a relaxation flag always produces both metadata and fast flag: %div = fdiv fast float %v, %0, !fpmath !12 !12 = !{float 2.500000e+00} Current implementation ignores fast flag and favors metadata. An instruction with just fast flag would be lowered to a fastest rcp + mul, but that never happen on practice because of described mutual clang and BE behavior. This change allows an "fdiv fast" to be always lowered as rcp + mul. Differential Revision: https://reviews.llvm.org/D34844 llvm-svn: 307308
* [lib/LTO] Add a comment to explain where we set the linkage in the summary.Davide Italiano2017-07-061-0/+5
| | | | | | Pointed out by Teresa! llvm-svn: 307305
* [ValueTracking] Support icmps fed by 'and' and 'or'.Chad Rosier2017-07-061-7/+32
| | | | | | | | | | This patch adds support for handling some forms of ands and ors in ValueTracking's isImpliedCondition API. PR33611 https://reviews.llvm.org/D34901 llvm-svn: 307304
* [LTO] Fix the interaction between linker redefined symbols and ThinLTODavide Italiano2017-07-062-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | This is the same as r304719 but for ThinLTO. The substantial difference is that in this case we don't have whole visibility, just the summary. In the LTO case, when we got the resolution for the input file we could just see if the linker told us whether a symbol was linker redefined (using --wrap or --defsym) and switch the linkage directly for the GV. Here, we have the summary. So, we record that the linkage changed from <whatever it was> to $weakany to prevent IPOs across this symbol boundaries and actually just switch the linkage at FunctionImport time. This patch should also fixes the lld bits (as all the scaffolding for communicating if a symbol is linker redefined should be there & should be the same), but I'll make sure to add some tests there as well. Fixes PR33192. Differential Revision: https://reviews.llvm.org/D35064 llvm-svn: 307303
* [GISel]: Enhance the MachineIRBuilder APIAditya Nandakumar2017-07-061-3/+2
| | | | | | | | | | | | | | | | | | | | Allows the MachineIRBuilder APIs to directly create registers (based on LLT or TargetRegisterClass) as well as accept MachineInstrBuilders and implicitly converts to register(with getOperand(0).getReg()). Eg usage: LLT s32 = LLT::scalar(32); auto C32 = Builder.buildConstant(s32, 32); auto Tmp = Builder.buildInstr(TargetOpcode::G_SUB, s32, C32, OtherReg); auto Tmp2 = Builder.buildInstr(Opcode, DstReg, Builder.buildConstant(s32, 31)); .... Only a few methods added for now. Reviewed by Tim llvm-svn: 307302
* Prototype: Reduce llvm-profdata merge memory usage furtherDavid Blaikie2017-07-062-20/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The InstrProfWriter already stores the name and hash of the record in the nested maps it uses for lookup while merging - this data is duplicated in the value within the maps. Refactor the InstrProfRecord to use a nested struct for the counters themselves so that InstrProfWriter can use this nested struct alone without the name or hash duplicated there. This work is incomplete, but enough to demonstrate the value (around a 50% decrease in memory usage for a large test case (10GB -> 5GB)). Though most of that decrease is probably from removing the SoftInstrProfError as well, but I haven't implemented a replacement for it yet. (it needs to go with the counters, because the operations on the counters - merging, etc, are where the failures are - unlike the name/hash which are totally unused by those counter-related operations and thus easy to split out) Ongoing discussion about removing SoftInstrProfError as a field of the InstrProfRecord is happening on the thread that added it - including the possibility of moving back towards an earlier version of that proposed patch that passed SoftInstrProfError through the various APIs, rather than as a member of InstrProfRecord. Reviewers: davidxl Differential Revision: https://reviews.llvm.org/D34838 llvm-svn: 307298
* [InstCombine] Remove include of DIBuilder.h and Dwarf.h as they don't appear ↵Craig Topper2017-07-061-2/+0
| | | | | | to be necessary. llvm-svn: 307295
* Modify constraints in `llvm::canReplaceOperandWithVariable`Leo Li2017-07-061-2/+8
| | | | | | | | | | | | | | | | | Summary: `Instruction::Switch`: only first operand can be set to a non-constant value. `Instruction::InsertValue` both the first and the second operand can be set to a non-constant value. `Instruction::Alloca` return true for non-static allocation. Reviewers: efriedma Reviewed By: efriedma Subscribers: srhines, pirama, llvm-commits Differential Revision: https://reviews.llvm.org/D34905 llvm-svn: 307294
* [Constants] Replace calls to ConstantInt::equalsInt(0)/equalsInt(1) with ↵Craig Topper2017-07-064-17/+17
| | | | | | isZero and isOne. NFCI llvm-svn: 307293
* [Constants] If we already have a ConstantInt*, prefer to use ↵Craig Topper2017-07-0625-45/+45
| | | | | | | | isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292
* [LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch ↵Anna Thomas2017-07-061-4/+5
| | | | | | | | | | | | | | exit block Currently, we do not support multiple exiting blocks to the latch exit block. However, this bailout wasn't triggered when we had a unique exit block (which is the latch exit), with multiple exiting blocks to that unique exit. Moved the bailout so that it's triggered in both cases and added testcase. llvm-svn: 307291
* [InstCombine] Remove Builder argument from InstCombiner::tryFactorization. NFCCraig Topper2017-07-062-8/+6
| | | | | | Builder is already a member of the InstCombiner class so we can use it with passing it. llvm-svn: 307290
* Fix spelling in comments. NFCI.Simon Pilgrim2017-07-061-3/+3
| | | | llvm-svn: 307288
* Bitcode: Include any strings added to the string table in the module hash.Peter Collingbourne2017-07-061-6/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D35037 llvm-svn: 307286
* [SimplifyCFG] Move a portion of an if statement that should already be ↵Craig Topper2017-07-061-2/+2
| | | | | | | | | | | | | | | | implied to an assert Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there? Reviewers: jmolloy, davide, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35025 llvm-svn: 307276
* [InstCombine] Change helper method to a file local static method. NFCCraig Topper2017-07-062-5/+5
| | | | llvm-svn: 307275
* [InstCombine] Clarify comment to mention other transform that it does. NFCCraig Topper2017-07-061-1/+2
| | | | llvm-svn: 307274
* [InstCombine] Add single use checks to SimplifyBSwap to ensure we are really ↵Craig Topper2017-07-061-4/+9
| | | | | | | | | | | | saving instructions Bswap isn't a simple operation so we need to make sure we are really removing a call to it before doing these simplifications. For the case when both LHS and RHS are bswaps I've allowed it to be moved if either LHS or RHS has a single use since that at least allows us to move it later where it might find another bswap to combine with and it decreases the use count on the other side so maybe the other user can be optimized. Differential Revision: https://reviews.llvm.org/D34974 llvm-svn: 307273
* [InstCombine] Don't create extra ConstantInt objects in foldSelectICmpAnd. NFCICraig Topper2017-07-061-19/+17
| | | | | | Instead just use APInt objects and only create a ConstantInt at the end if we need it for the Offset. llvm-svn: 307270
* [LSR] Narrow search space by filtering non-optimal formulae with the same ↵Wei Mi2017-07-061-0/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ScaledReg and Scale. When the formulae search space is huge, LSR uses a series of heuristic to keep pruning the search space until the number of possible solutions are within certain limit. The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs, which picks the register which is used by the most LSRUses and deletes the other formulae which don't use the register. This is a effective way to prune the search space, but quite often not a good way to keep the best solution. We saw cases before that the heuristic pruned the best formula candidate out of search space. To relieve the problem, we introduce a new heuristic called NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to reduce the search space while keeping the best formula, we want to keep as many formulae with different Scale and ScaledReg as possible. That is because the central idea of LSR is to choose a group of loop induction variables and use those induction variables to represent LSRUses. An induction variable candidate is often represented by the Scale and ScaledReg in a formula. If we have more formulae with different ScaledReg and Scale to choose, we have better opportunity to find the best solution. That is why we believe pruning search space by only keeping the best formula with the same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use two criteria to choose the best formula with the same Scale and ScaledReg. The first criteria is to select the formula using less non shared registers, and the second criteria is to select the formula with less cost got from RateFormula. The patch implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the last resort. Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help on the two improved internal cases we saw. Differential Revision: https://reviews.llvm.org/D34583 llvm-svn: 307269
* [X86][SSE4A] Add support for shuffle combining to INSERTQI.Simon Pilgrim2017-07-061-0/+16
| | | | llvm-svn: 307268
* Doxygen formatting. NFCIJoel Jones2017-07-061-2/+12
| | | | llvm-svn: 307263
* [MachineVerifier] Add check that tied physregs aren't different.Mikael Holmen2017-07-061-0/+8
| | | | | | | | | | | | | | | | Summary: Added MachineVerifier code to check register ties more thoroughly, especially so that physical registers that are tied are the same. This may help e.g. when creating MIR files. Original patch by Jesper Antonsson Reviewers: stoklund, sanjoy, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D34394 llvm-svn: 307259
* [X86][SSE] combineX86ShuffleChain - merge duplicate creations of integer ↵Simon Pilgrim2017-07-061-20/+12
| | | | | | mask types llvm-svn: 307257
* [X86][SSE] combineX86ShuffleChain - merge duplicate 'Zeroable' element masksSimon Pilgrim2017-07-061-20/+12
| | | | llvm-svn: 307255
* [X86][SSE4A] Add support for shuffle combining to EXTRQ.Simon Pilgrim2017-07-061-1/+28
| | | | llvm-svn: 307254
* [X86][SSE4A] Split EXTRQ/INSERTQ shuffle matching from lowering. NFCI.Simon Pilgrim2017-07-061-99/+112
| | | | | | First step toward supporting shuffle combining to EXTRQ/INSERTQ. llvm-svn: 307250
* Revert "Revert "Revert "[IndVars] Canonicalize comparisons between ↵Max Kazantsev2017-07-061-4/+0
| | | | | | | | | non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why SaturatedMultiply test fails. llvm-svn: 307249
* [RegisterCoalescer] Fix for SubRange join unreachableDavid Stuttard2017-07-061-0/+28
| | | | | | | | | | | | | | | | Summary: During remat, some subranges might end up having invalid segments which caused problems for later coalescing. Added in a check to remove segments that are invalidated as part of the remat. See http://llvm.org/PR33524 Subscribers: MatzeB, qcolombet Differential Revision: https://reviews.llvm.org/D34391 llvm-svn: 307247
* [ARM] GlobalISel: Map s32 G_FCMP in reg bank selectDiana Picus2017-07-061-0/+14
| | | | | | Map hard G_FCMP operands to FPR and the result to GPR. llvm-svn: 307245
* Revert "Revert "[IndVars] Canonicalize comparisons between non-negative ↵Max Kazantsev2017-07-061-0/+4
| | | | | | | | | | | | | values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244
* [ARM] GlobalISel: Legalize G_FCMP for s32Diana Picus2017-07-063-0/+164
| | | | | | | | | | | | | | | | | | | | | This covers both hard and soft float. Hard float is easy, since it's just Legal. Soft float is more involved, because there are several different ways to handle it based on the predicate: one and ueq need not only one, but two libcalls to get a result. Furthermore, we have large differences between the values returned by the AEABI and GNU functions. AEABI functions return a nice 1 or 0 representing true and respectively false. GNU functions generally return a value that needs to be compared against 0 (e.g. for ogt, the value returned by the libcall is > 0 for true). We could introduce redundant comparisons for AEABI as well, but they don't seem easy to remove afterwards, so we do different processing based on whether or not the result really needs to be compared against something (and just truncate if it doesn't). llvm-svn: 307243
* [ARM] GlobalISel: Widen s1, s8, s16 G_CONSTANTDiana Picus2017-07-061-0/+2
| | | | | | Get the legalizer to widen small constants. llvm-svn: 307239
* Avoid constructing GlobalExtensions only to find out it is empty.Frederich Munch2017-07-061-4/+14
| | | | | | | | | | | | | | | | Summary: GlobalExtensions is dereferenced twice, once for iteration and then a check if it is empty. As a ManagedStatic this dereference forces it's construction which is unnecessary. Reviewers: efriedma, davide, mehdi_amini Reviewed By: mehdi_amini Subscribers: chapuni, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D33381 llvm-svn: 307229
* Revert "Revert "Revert "Switch external cvtres.exe for llvm's own resource ↵Eric Beckmann2017-07-051-1/+0
| | | | | | | | | | | | | | library.""" This reverts commit ae21ee0b6cacbc1efaf4d42502e71da2f0eb45c3. The initial revert was done in order to prevent ongoing errors on chromium bots such as CrWinClangLLD. However, this was done haphazardly and I didn't realize there were test and compilation failures, so this revert was reverted. Now that those have been fixed, we can revert the revert of the revert. llvm-svn: 307227
* Revert "Revert "Revert "Replace trivial use of external rc.exe by writing ↵Eric Beckmann2017-07-052-10/+17
| | | | | | | | | | | | | | our own .res file.""" This reverts commit 5fecbbbe5049665d86834cf69d8f75db4f392308. The initial revert was done in order to prevent ongoing errors on chromium bots such as CrWinClangLLD. However, this was done haphazardly and I didn't realize there were test and compilation failures, so this revert was reverted. Now that those have been fixed, we can revert the revert of the revert. llvm-svn: 307226
* [IR] Use CmpInst::isFPPredicate/isIntPredicate in a few other places. NFCCraig Topper2017-07-052-7/+6
| | | | llvm-svn: 307224
* [GlobalOpt] Remove unreachable blocks before optimizing a function.Davide Italiano2017-07-051-0/+18
| | | | | | | | | | | | | | | | | LLVM's definition of dominance allows instructions that are cyclic in unreachable blocks, e.g.: %pat = select i1 %condition, @global, i16* %pat because any instruction dominates an instruction in a block that's not reachable from entry. So, remove unreachable blocks from the function, because a) there's no point in analyzing them and b) GlobalOpt should otherwise grow some more complicated logic to break these cycles. Differential Revision: https://reviews.llvm.org/D35028 llvm-svn: 307215
* Fix libcall expansion creating DAG nodes with invalid type post type ↵Vadim Chugunov2017-07-052-12/+25
| | | | | | | | | | | | legalization. If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values. Patch by Jatin Bhateja and Eli Friedman. Differential Revision: https://reviews.llvm.org/D34240 llvm-svn: 307207
* [DependenceAnalysis] Make sure base objects are the same when comparing GEPsBrendon Cahoon2017-07-051-1/+2
| | | | | | | | | | | | | | | The dependence analysis was returning incorrect information when using the GEPs to compute dependences. The analysis uses the GEP indices under certain conditions, but was doing it incorrectly when the base objects of the GEP are aliases, but pointing to different locations in the same array. This patch adds another check for the base objects. If the base pointer SCEVs are not equal, then the dependence analysis should fall back on the path that uses the whole SCEV for the dependence check. This fixes PR33567. Differential Revision: https://reviews.llvm.org/D34702 llvm-svn: 307203
* [InstCombine] Use CmpInst::Predicate with m_Cmp instead of ↵Craig Topper2017-07-051-1/+1
| | | | | | | | ICmpInst::Predicate. NFC There isn't really an ICmpInst version so we're just accessing the CmpInst version through inheritance. llvm-svn: 307199
* [WebAssembly] Fix types for address taken functionsSam Clegg2017-07-055-18/+29
| | | | | | Differential Revision: https://reviews.llvm.org/D34966 llvm-svn: 307198
OpenPOWER on IntegriCloud