path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
...
* [LCG] Add the necessary functionality to the LazyCallGraph to support inlining. (Chandler Carruth, 2016-10-12, 1 file, -1/+164)
  The basic inlining operation makes the following changes to the call graph:
  1) Add edges that were previously transitive edges. This is always trivial and this patch gives the LCG helper methods to make this more convenient.
  2) Remove the inlined edge. We had existing support for this, but it contained bugs that needed to be fixed. Testing in the same pattern as the inliner exposes these bugs very nicely.
  3) Delete a function when it becomes dead because it is internal and all calls have been inlined. The LCG had no support at all for this operation, so this adds that support.
  Two unittests have been added that exercise this specific mutation pattern to the call graph. They were extremely effective in uncovering bugs. Sadly, a large fraction of the code here is just to implement those unit tests, but I think they're paying for themselves. =]
  This was split out of a patch that actually uses the routines to implement inlining in the new pass manager in order to isolate (with unit tests) the logic that was entirely within the LCG.
  Many thanks for the careful review from folks! There will be a few minor follow-up patches based on the comments in the review as well.
  Differential Revision: https://reviews.llvm.org/D24225
  llvm-svn: 283982
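The three call-graph mutations listed in this commit message can be sketched on a toy graph. This is only an illustration under stated assumptions: `ToyCallGraph`, `addEdge`, `hasCallers`, and `inlineCall` are hypothetical names, not part of the LazyCallGraph API, and the sketch ignores SCC bookkeeping entirely.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical toy call graph for illustration; this is NOT LazyCallGraph.
struct ToyCallGraph {
  std::map<std::string, std::set<std::string>> Callees; // caller -> callees
  std::set<std::string> Internal;                       // internal-linkage functions

  void addEdge(const std::string &From, const std::string &To) {
    Callees[From].insert(To);
    Callees[To]; // make sure the callee also has a node
  }

  // True if any function other than F itself still calls F.
  bool hasCallers(const std::string &F) const {
    for (const auto &Entry : Callees)
      if (Entry.first != F && Entry.second.count(F))
        return true;
    return false;
  }

  // Models inlining the only call site of Callee inside Caller.
  void inlineCall(const std::string &Caller, const std::string &Callee) {
    assert(Callees.count(Caller) && Callees[Caller].count(Callee));
    // Remove the inlined edge first, so that a self-recursive callee can
    // re-introduce it through its own callee set below.
    Callees[Caller].erase(Callee);
    // Edges that were transitive (Caller -> Callee -> X) become direct.
    const std::set<std::string> Inherited = Callees[Callee]; // copy on purpose
    for (const auto &Target : Inherited)
      Callees[Caller].insert(Target);
    // Delete the callee if it is internal and nothing else calls it.
    if (Internal.count(Callee) && !hasCallers(Callee)) {
      Callees.erase(Callee);
      Internal.erase(Callee);
    }
  }
};
```

For example, with edges `main -> helper -> leaf` and `helper` marked internal, `inlineCall("main", "helper")` leaves `main -> leaf` and drops the `helper` node.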
* Revert "[libFuzzer] refactoring to speed things up, NFC" (Daniel Jasper, 2016-10-12, 2 files, -20/+35)
  This reverts commit r283946.
  This breaks when built with GCC:
    lib/Fuzzer/FuzzerTracePC.cpp:169:6: error: always_inline function might not be inlinable [-Werror=attributes]
    lib/Fuzzer/FuzzerTracePC.cpp:169:6: error: inlining failed in call to always_inline 'void fuzzer::TracePC::HandleCmp(void*, T, T) [with T = long unsigned int]': target specific option mismatch
    lib/Fuzzer/FuzzerTracePC.cpp:198:65: error: called from here
  llvm-svn: 283979
* [AArch64][InstrustionSelector] Teach the selector about G_BITCAST. (Quentin Colombet, 2016-10-12, 1 file, -59/+2)
  llvm-svn: 283973
* [AArch64][InstructionSelector] Refactor the handling of copies. (Quentin Colombet, 2016-10-12, 1 file, -26/+83)
  Although copies are not specific to preISel, we still have to assign them a proper register class. However, given they are not constrained to anything, we do not have to handle the source register at the copy. It will be properly mapped when reaching the related definition.
  In the process, the handling of G_ANYEXT is slightly modified, as those end up being selected as copies. The difference is that when the register sizes do not match on both sides, we need to insert a SUBREG_TO_REG operation, otherwise the post-RA copy expansion will not be happy!
  llvm-svn: 283972
* [AArch64][MachineLegalizer] Mark more bitcasts as legal. (Quentin Colombet, 2016-10-12, 1 file, -0/+3)
  Those are copies; we do not have to do any legalization action for them.
  llvm-svn: 283970
* Memory-SSA cleanup of clobbers interface, NFC (Sebastian Pop, 2016-10-12, 2 files, -19/+26)
  This implements the cleanup that Danny asked to commit separately from the previous fix to GVN-hoist in https://reviews.llvm.org/D25476#inline-219818
  Tested with ninja check on x86_64-linux.
  llvm-svn: 283967
* GVN-hoist: fix store past load dependence analysis (PR30216, PR30499) (Sebastian Pop, 2016-10-12, 2 files, -81/+88)
  This is a refreshed version of a patch that was reverted: it fixes the problems reported in both PR30216 and PR30499, and contains all the test-cases from both bugs.
  To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited.
  The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias().
  Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress.
  Tested on x86_64-linux with check and a test-suite run.
  Differential Revision: https://reviews.llvm.org/D25476
  llvm-svn: 283965
* [PPCMIPeephole] Fix splat elimination (Tim Shen, 2016-10-12, 1 file, -3/+5)
  Summary: In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation:
    B = Splat A
    C = Splat B
    =>
    C = Splat A
  because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not:
    B = Splat A
    C = Splat B
    =>
    B = Splat A
    C = COPY B
  Fixes PR30663.
  Reviewers: echristo, iteratee, kbarton, nemanjai
  Subscribers: mehdi_amini, llvm-commits
  Differential Revision: https://reviews.llvm.org/D25493
  llvm-svn: 283961
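The safe rewrite described in this message can be sketched on a toy instruction list. Everything here is an assumption made for illustration: `Op`, `Inst`, and `simplifyRedundantSplats` are hypothetical and unrelated to the real MachineInstr API, each name is assumed to be defined once, and the sketch assumes (as the message implies) that splatting an already-splatted value is equivalent to copying it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR; not the PowerPC MachineInstr representation.
enum class Op { Splat, Copy, Other };

struct Inst {
  Op Opcode;
  std::string Def; // name of the value this instruction defines
  std::string Use; // name of the single value it reads
};

// Turns "C = Splat B" into "C = COPY B" when B is itself defined by an
// earlier splat. The instruction defining B is left untouched, because B
// may still have other users between the two splats.
void simplifyRedundantSplats(std::vector<Inst> &Block) {
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Block[I].Opcode != Op::Splat)
      continue;
    for (std::size_t J = 0; J < I; ++J) {
      if (Block[J].Opcode == Op::Splat && Block[J].Def == Block[I].Use) {
        Block[I].Opcode = Op::Copy; // keep the operand, change only the opcode
        break;
      }
    }
  }
}
```

Leaving the first splat in place and downgrading only the second keeps every intermediate user of `B` valid; whether the copy can later be removed is left to other passes, as in the commit.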
* [DAG] Fix crash in build_vector -> vector_shuffle combine (Michael Kuperstein, 2016-10-11, 1 file, -0/+5)
  Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider.
  llvm-svn: 283953
* GlobalISel: support same-size casts on AArch64. (Tim Northover, 2016-10-11, 2 files, -0/+75)
  Mostly Ahmed's work again, I'm just sprucing things up slightly before committing.
  llvm-svn: 283952
* [libFuzzer] refactoring to speed things up, NFC (Kostya Serebryany, 2016-10-11, 2 files, -35/+20)
  llvm-svn: 283946
* Re-land "[Thumb] Save/restore high registers in Thumb1 pro/epilogues" (Reid Kleckner, 2016-10-11, 3 files, -24/+375)
  Reverts r283938 to reinstate r283867 with a fix.
  The original change had an ArrayRef referring to a destroyed temporary initializer list. Use plain C arrays instead.
  llvm-svn: 283942
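The lifetime problem named in this message, and the plain-array remedy, can be sketched with a minimal non-owning view. `RegView`, `sumRegs`, and `sumHighRegs` are hypothetical stand-ins invented for this sketch; the dangling pattern is shown only in a comment because executing it would be undefined behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view, standing in for a type like llvm::ArrayRef.
struct RegView {
  const unsigned *Data;
  std::size_t Size;

  // Binds to a named array, whose storage outlives the view.
  template <std::size_t N>
  RegView(const unsigned (&Arr)[N]) : Data(Arr), Size(N) {}
};

unsigned sumRegs(RegView Regs) {
  unsigned Sum = 0;
  for (std::size_t I = 0; I < Regs.Size; ++I)
    Sum += Regs.Data[I];
  return Sum;
}

unsigned sumHighRegs() {
  // Problematic pattern (not compiled here): a view initialized from a braced
  // list, e.g. `ArrayRef<unsigned> Regs = {8, 9, 10, 11};`, points into a
  // temporary that is destroyed at the end of that statement.
  //
  // Safer pattern: give the data a named array and view that instead.
  static const unsigned HighRegs[] = {8, 9, 10, 11};
  RegView View(HighRegs);
  return sumRegs(View);
}
```

The named array has static storage duration here, so the view never outlives the data it points to.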
* Next set of additional error checks for invalid Mach-O files (Kevin Enderby, 2016-10-11, 1 file, -0/+36)
  These checks cover the load commands that use the MachO::linker_option_command type, which is not used in llvm libObject code but is used in llvm tool code. This includes just the LC_LINKER_OPTION load command.
  llvm-svn: 283939
* Revert "[Thumb] Save/restore high registers in Thumb1 pro/epilogues" (Reid Kleckner, 2016-10-11, 3 files, -369/+24)
  This reverts r283867.
  This appears to be an infinite loop:
    while (HiRegToSave != AllHighRegs.end() && CopyReg != AllCopyRegs.end()) {
      if (HiRegsToSave.count(*HiRegToSave)) {
        ...
        CopyReg = findNextOrderedReg(++CopyReg, CopyRegs, AllCopyRegs.end());
        HiRegToSave = findNextOrderedReg(++HiRegToSave, HiRegsToSave, AllHighRegs.end());
      }
    }
  llvm-svn: 283938
* GlobalISel: support selection of extend operations. (Tim Northover, 2016-10-11, 1 file, -0/+99)
  Patch mostly by Ahmed Bougacha.
  llvm-svn: 283937
* MIRParser: allow types on registers with a RegBank. (Tim Northover, 2016-10-11, 1 file, -1/+2)
  This fixes some GlobalISel regression tests.
  llvm-svn: 283936
* Codegen: Tail-duplicate during placement. (Kyle Butt, 2016-10-11, 3 files, -41/+329)
  The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement.
  In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places.
  This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough.
  Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout.
  Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block.
  Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll
  Differential revision: https://reviews.llvm.org/D18226
  llvm-svn: 283934
* Avoid braced initialization for default member initializers for MSVC 2013 (Reid Kleckner, 2016-10-11, 1 file, -1/+1)
  llvm-svn: 283928
* Silence -Wunused-but-set-variable warning (Arnold Schwaighofer, 2016-10-11, 1 file, -0/+1)
  llvm-svn: 283927
* Re-submit r283823: Define DbiStreamBuilder::addDbgStream to add stream. (Rui Ueyama, 2016-10-11, 1 file, -2/+30)
  The previous commit was failing because we filled empty slots of the debug stream index with kInvalidStreamIndex. It should've been 0.
  llvm-svn: 283925
* [sanitizer-coverage] use private linkage for coverage guards, delete old commented-out code. (Kostya Serebryany, 2016-10-11, 1 file, -12/+4)
  llvm-svn: 283924
* Fix build error on LP64 platforms. (Rui Ueyama, 2016-10-11, 1 file, -1/+2)
  llvm-svn: 283922
* [raw_ostream] Raise some helper functions out of raw_ostream. (Zachary Turner, 2016-10-11, 3 files, -137/+200)
  Low-level functionality to format numbers was embedded in the implementation of raw_ostream. I need to use it through an interface other than the overloaded stream operators, so it needs to be raised to a level where it can be used from either raw_ostream operators or other code.
  llvm-svn: 283921
* [AMDGPU] Refactor waitcnt encoding (Konstantin Zhuravlyov, 2016-10-11, 5 files, -66/+171)
  - Refactor bit packing/unpacking
  - Calculate bit mask given bit shift and bit width
  - Introduce function for decoding bits of waitcnt
  - Introduce function for encoding bits of waitcnt
  - Introduce function for getting waitcnt mask (instead of using bare numbers)
  - Introduce function for getting max waitcnt(s) (instead of using bare numbers)
  Differential Revision: https://reviews.llvm.org/D25298
  llvm-svn: 283919
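The "bit mask from shift and width" idea in the list above can be sketched generically. The helper names and the field position used in the usage note are illustrative assumptions, not the actual waitcnt layout or the functions added by the patch, and the sketch assumes `Width` is smaller than 32.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers; names and signatures are assumptions for this sketch.
// Precondition for all three: Width < 32 and Shift + Width <= 32.

// Mask with Width one-bits starting at bit position Shift.
constexpr std::uint32_t getBitMask(unsigned Shift, unsigned Width) {
  return ((1u << Width) - 1u) << Shift;
}

// Returns Dst with the field [Shift, Shift + Width) replaced by Src.
constexpr std::uint32_t packBits(std::uint32_t Src, std::uint32_t Dst,
                                 unsigned Shift, unsigned Width) {
  return (Dst & ~getBitMask(Shift, Width)) |
         ((Src << Shift) & getBitMask(Shift, Width));
}

// Extracts the field [Shift, Shift + Width) from Src.
constexpr std::uint32_t unpackBits(std::uint32_t Src, unsigned Shift,
                                   unsigned Width) {
  return (Src & getBitMask(Shift, Width)) >> Shift;
}
```

With a made-up 3-bit field at bit 4, `packBits(5, Word, 4, 3)` changes only bits 4 to 6 of `Word`, and `unpackBits` on the result with the same shift and width returns 5. Deriving the mask from the shift and width avoids repeating bare constants at each call site.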
* Allow Switch instruction to have extractProfTotalWeight called as it can terminate a basic block. (NFC) (Dehao Chen, 2016-10-11, 1 file, -1/+2)
  llvm-svn: 283918
* Fix "static initialization order fiasco" for the XCore Target. (Mehdi Amini, 2016-10-11, 6 files, -15/+19)
  I fixed all the other Targets in r283702, and interestingly the sanitizers are only now "sometimes" catching this bug on the only one I missed.
  llvm-svn: 283914
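A common way to avoid the static initialization order fiasco is to replace a namespace-scope global with a function-local static that is constructed on first use. The sketch below shows that general idiom only; `Registry`, `getTheRegistry`, and `Registrar` are hypothetical names, and this is not claimed to be the exact change made in the commit.

```cpp
#include <cassert>
#include <string>

// Hypothetical object that other static initializers want to use.
struct Registry {
  std::string Name;
};

// Function-local static: constructed the first time the function is called,
// so callers never observe it before its constructor has run, regardless of
// the order in which translation units are initialized.
Registry &getTheRegistry() {
  static Registry TheRegistry{"unset"};
  return TheRegistry;
}

// A static object whose constructor depends on the registry. With a plain
// global Registry defined in another translation unit, this constructor
// could run before that global is constructed.
struct Registrar {
  explicit Registrar(const char *Name) { getTheRegistry().Name = Name; }
};

static Registrar AutoRegister("toy-target");
```

The accessor trades a small first-call check for a guaranteed construction order, which is why it is often preferred for registries that are touched from static constructors.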
* [Support] Fix undefined behavior in RandomNumberGenerator. (Zachary Turner, 2016-10-11, 1 file, -4/+4)
  This has existed pretty much forever AFAICT, but the code was never being exercised because nobody was using the class. A user of this class surfaced, and now we're breaking with UB. The code was obviously wrong, so it's fixed here.
  llvm-svn: 283912
* ARMMachineFunctionInfo.cpp: Add an initializer of ARMFunctionInfo::ReturnRegsCount in the explicit ctor. (NAKAMURA Takumi, 2016-10-11, 1 file, -2/+2)
  The missing initializer has caused a crash since r283867.
  llvm-svn: 283909
* Reformat. (NAKAMURA Takumi, 2016-10-11, 1 file, -4/+4)
  llvm-svn: 283908
* [DAG] add fold for masked negated sign-extended bool (Sanjay Patel, 2016-10-11, 1 file, -5/+11)
  This enhances the fold added with: https://reviews.llvm.org/rL283900
  llvm-svn: 283905
* [DAG] add fold for masked negated extended bool (Sanjay Patel, 2016-10-11, 1 file, -2/+15)
  The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction.
  An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext.
  llvm-svn: 283900
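Reading the commit titles literally, the pattern is "negate an extended bool, then mask with 1". The arithmetic behind such a fold can be checked in a few lines. This is a sketch of the identity only, under the assumption that the pattern is `(0 - ext(b)) & 1`; the helper names are hypothetical and nothing here reflects the DAG combiner's code.

```cpp
#include <cassert>
#include <cstdint>

// Zero- and sign-extension of a 1-bit bool to 32 bits (hypothetical helpers).
constexpr std::uint32_t zextBool(bool B) { return B ? 1u : 0u; }
constexpr std::uint32_t sextBool(bool B) { return B ? 0xFFFFFFFFu : 0u; }

// The assumed pattern: negate the extended value, then keep only bit 0.
constexpr std::uint32_t maskedNeg(std::uint32_t V) { return (0u - V) & 1u; }

// For either extension, the masked negation equals the zero-extended bool,
// so the negate and the mask are redundant.
static_assert(maskedNeg(zextBool(false)) == zextBool(false), "zext, false");
static_assert(maskedNeg(zextBool(true)) == zextBool(true), "zext, true");
static_assert(maskedNeg(sextBool(false)) == zextBool(false), "sext, false");
static_assert(maskedNeg(sextBool(true)) == zextBool(true), "sext, true");
```

Negation maps 1 to all-ones and all-ones to 1, and both have bit 0 set, which is why the mask recovers the original bool in either case.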
* Silence unused warning in non-assert builds. (Daniel Jasper, 2016-10-11, 1 file, -0/+1)
  llvm-svn: 283899
* AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11. (Changpeng Fang, 2016-10-11, 4 files, -3/+8)
  Differential Revision: http://reviews.llvm.org/D25454
  Reviewers: tstellarAMD
  llvm-svn: 283893
* [cl] Don't print subcommand help when no subcommands present. (Zachary Turner, 2016-10-11, 1 file, -4/+6)
  Previously we would print
    USAGE: <exe> [subcommand] [options]
  even if no subcommands were present. This changes the output format to only print "[subcommand]" if there is at least one subcommand.
  Fixes llvm.org/pr30598
  Patch by Serge Guelton
  llvm-svn: 283892
* [DAG] simplify logic; NFC (Sanjay Patel, 2016-10-11, 1 file, -8/+6)
  llvm-svn: 283885
* [DAG] hoist DL(N) and fix formatting; NFC (Sanjay Patel, 2016-10-11, 1 file, -25/+32)
  llvm-svn: 283884
* [DAG] fix formatting; NFC (Sanjay Patel, 2016-10-11, 1 file, -72/+68)
  llvm-svn: 283878
* [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm (Igor Laevsky, 2016-10-11, 4 files, -39/+49)
  For each block, check that it doesn't have any uses outside of its innermost loop.
  Differential Revision: https://reviews.llvm.org/D25364
  llvm-svn: 283877
* [Thumb] Save/restore high registers in Thumb1 pro/epilogues (Oliver Stannard, 2016-10-11, 3 files, -24/+368)
  The high registers are not allocatable in Thumb1 functions, but they could still be used by inline assembly, so we need to save and restore the callee-saved high registers (r8-r11) in the prologue and epilogue.
  This is complicated by the fact that the Thumb1 push and pop instructions cannot access these registers. Therefore, we have to move them down into low registers before pushing, and move them back after popping into low registers.
  In most functions, we will have low registers that are also being pushed/popped, which we can use as the temporary registers for saving/restoring the high registers. However, this is not guaranteed, so we may need to push some extra low registers to ensure that the high registers can be saved/restored.
  For correctness, it would be sufficient to use just one low register, but if we have enough low registers available then we only need one push/pop instruction, rather than one per high register.
  We can also use the argument/return registers when they are not live, and the link register when saving (but not restoring), reducing the number of extra registers we need to push. There are still a few extreme edge cases where we need two push/pop instructions, because not enough low registers can be made live in the prologue or epilogue.
  In addition to the regression tests included here, I've also tested this using a script to generate functions which clobber different combinations of registers, have different numbers of argument and return registers (including variadic arguments), allocate different fixed sized objects on the stack, and do or don't use variable sized allocas and the __builtin_return_address intrinsic (all of which affect the available registers in the prologue and epilogue). I ran these functions in a test harness which verifies that all of the callee-saved registers are correctly preserved.
  Differential Revision: https://reviews.llvm.org/D24228
  llvm-svn: 283867
* [ARM] Fix registers clobbered by SjLj EH on soft-float targets (Oliver Stannard, 2016-10-11, 4 files, -2/+15)
  Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as clobbering all registers, including floating-point registers that may not be present on the target. This is technically true, as we could get linked against code that does use the FP registers, but that will not actually work, as the soft-float code cannot save and restore the FP registers. SjLj exception handling can only work correctly if either all or none of the code is built for a target with FP registers. Therefore, we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a soft-float target, it is only going to be linked against other soft-float code, and so only clobbers the general-purpose registers. This allows us to check that no non-savable registers are clobbered when generating the prologue/epilogue.
  Differential Revision: https://reviews.llvm.org/D25180
  llvm-svn: 283866
* [AArch64] Allow label arithmetic with add/sub/cmp (Diana Picus, 2016-10-11, 3 files, -26/+44)
  Allow instructions such as 'cmp w0, #(end - start)' by folding the expression into a constant. For ELF, we fold only if the symbols are in the same section. For MachO, we fold if the expression contains only symbols that are not linker visible.
  Fixes https://llvm.org/bugs/show_bug.cgi?id=18920
  Differential Revision: https://reviews.llvm.org/D23834
  llvm-svn: 283862
* Fix formatting in findRegisterUseOperandIdx. NFC. (Fraser Cormack, 2016-10-11, 1 file, -7/+5)
  llvm-svn: 283860
* Revert "Codegen: Tail-duplicate during placement." (Daniel Jasper, 2016-10-11, 3 files, -330/+41)
  This reverts commit r283842.
  test/CodeGen/X86/tail-dup-repeat.ll causes an llc crash with our internal testing. I'll share a link with you.
  llvm-svn: 283857
* Make RandomNumberGenerator compatible with <random> (Mehdi Amini, 2016-10-11, 1 file, -1/+1)
  LLVM's RandomNumberGenerator wasn't compatible with the random distribution from <random>.
  Fixes PR25105
  Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>
  Differential Revision: https://reviews.llvm.org/D25443
  llvm-svn: 283854
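For context, the standard distributions in `<random>` expect a generator that exposes `result_type`, static `min()` and `max()`, and `operator()`. A minimal wrapper meeting those requirements might look like the sketch below. `ToyRNG` and `rollDie` are hypothetical names; this is not LLVM's RandomNumberGenerator and does not show the one-line change the commit made.

```cpp
#include <cassert>
#include <random>

// Hypothetical wrapper that satisfies the uniform random bit generator
// requirements by forwarding to a standard engine.
class ToyRNG {
  std::mt19937_64 Engine;

public:
  using result_type = std::mt19937_64::result_type;

  explicit ToyRNG(result_type Seed) : Engine(Seed) {}

  // Smallest and largest values operator() can return.
  static constexpr result_type min() { return std::mt19937_64::min(); }
  static constexpr result_type max() { return std::mt19937_64::max(); }

  result_type operator()() { return Engine(); }
};

// Because ToyRNG meets the requirements, it can drive a standard distribution.
int rollDie(ToyRNG &RNG) {
  std::uniform_int_distribution<int> Dist(1, 6);
  return Dist(RNG);
}
```

A generator missing any of these members, or declaring `min()`/`max()` inconsistently with what `operator()` returns, will either fail to compile with the standard distributions or produce skewed results.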
* Tune isHotFunction/isColdFunction (Dehao Chen, 2016-10-11, 1 file, -6/+2)
  Summary: This patch marks a function as hot or cold if the function's entry count is hot or cold, respectively.
  Reviewers: eraman, davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D25048
  llvm-svn: 283852
* Fix warning; NFC (Matthias Braun, 2016-10-11, 1 file, -2/+2)
  llvm-svn: 283851
* MIRParser: generic register operands with types (Matthias Braun, 2016-10-11, 2 files, -2/+3)
  This should fix the fallout of r283848.
  llvm-svn: 283850
* MIRParser: Rewrite register info initialization; mostly NFC (Matthias Braun, 2016-10-11, 4 files, -108/+179)
  This changes MachineRegisterInfo to be initialized after parsing all instructions. This is in preparation for upcoming commits that allow the register class specification on the operand or deduce them from the MCInstrDesc.
  This commit removes the unused feature of having nonsequential register numbers. This was confusing anyway, as the vreg numbers would be different after parsing when you had "holes" in your numbering.
  This patch also introduces the concept of an incomplete virtual register. An incomplete virtual register may be used during .mir parsing to construct MachineOperands without knowing the exact register class (or register bank) yet.
  NFC except for some error messages.
  Differential Revision: https://reviews.llvm.org/D22397
  llvm-svn: 283848
* Codegen: Tail-duplicate during placement. (Kyle Butt, 2016-10-11, 3 files, -41/+330)
  The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement.
  In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places.
  This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough.
  Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout.
  Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block.
  Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll
  Differential revision: https://reviews.llvm.org/D18226
  llvm-svn: 283842
* [libFuzzer] implement value profile for switch, increase the size of the PCs array, make sure we don't overflow it (Kostya Serebryany, 2016-10-11, 3 files, -4/+11)
  llvm-svn: 283841