path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Move helpers into anonymous namespaces. NFC.Benjamin Kramer2016-08-0615-24/+32
| | | | llvm-svn: 277916
* [CodeGen] Fix a -Wdocumentation warningDavid Majnemer2016-08-061-1/+1
| | | | | | | A parameter was documented with the wrong name. No functionality change is intended. llvm-svn: 277915
* [ValueTracking] Teach computeKnownBits about [su]min/maxDavid Majnemer2016-08-061-1/+50
| | | | | | | Reasoning about a select in terms of a min or max allows us to derive a tighter bound on the result. llvm-svn: 277914
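The bound mentioned above can be sketched for the unsigned-min case. This is only an illustration, not the code from r277914: the helper names are hypothetical, and plain leading-zero counts on 32-bit values stand in for LLVM's known-bits masks. The idea is that umin(a, b) is no larger than either operand, so it has at least as many leading zero bits as the operand with the most known leading zeros.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an LLVM API: given lower bounds on the number of
// leading zero bits of two 32-bit operands, return a lower bound on the
// number of leading zero bits of their unsigned minimum.
unsigned knownLeadingZerosOfUMin(unsigned knownLZA, unsigned knownLZB) {
  assert(knownLZA <= 32 && knownLZB <= 32 && "bounds are for 32-bit values");
  // umin(a, b) <= a and umin(a, b) <= b, so the stronger bound applies.
  return std::max(knownLZA, knownLZB);
}

// Counts the actual leading zero bits of a 32-bit value (32 for zero).
unsigned countLeadingZeros32(uint32_t v) {
  unsigned n = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0; mask >>= 1)
    ++n;
  return n;
}
```

The same reasoning applies in mirror image to an unsigned max and leading one bits; the signed cases additionally need the sign bit to be known.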
* [CallGraphSCCPass] Use an ArrayRef instead of a pair of iteratorsDavid Majnemer2016-08-061-1/+1
| | | | | | No functional change is intended. llvm-svn: 277913
* [InstCombine] Don't coerce non-integral pointers to integersSanjoy Das2016-08-061-1/+2
| | | | | | | | | | Reviewers: majnemer Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23231 llvm-svn: 277910
* Revert "(refs/bisect/bad) GVN-hoist: enable by default"Matthias Braun2016-08-061-2/+2
| | | | | | | | | | | GVN-Hoist appears to miscompile llvm-testsuite SingleSource/Benchmarks/Misc/fbench.c at the moment. I filed http://llvm.org/PR28880 This reverts commit r277786. llvm-svn: 277909
* Part 4c: Coroutine Devirtualization: Devirtualize coro.resume and coro.destroy.Gor Nishanov2016-08-066-35/+175
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the 4c patch of the coroutine series. CoroElide pass now checks if PostSplit coro.begin is referenced by coro.subfn.addr intrinsics. If so replace coro.subfn.addrs with an appropriate coroutine subfunction associated with that coro.begin. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization <= we are here 5.Add CGSCC restart trigger + tests. 6.Add coroutine heap elision + tests. 7.Add the rest of the logic (split into more patches) Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23229 llvm-svn: 277908
* Revert r277896.Nico Weber2016-08-061-1/+1
| | | | | | | | | | | | | | | | It breaks ExecutionEngine/OrcLazy/weak-function.ll on most bots. Script: -- ... -- Exit Code: 1 Command Output (stderr): -- Could not find main function. llvm-svn: 277907
* CodeGen: If Convert blocks that would form a diamond when tail-merged.Kyle Butt2016-08-061-66/+345
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. 
%a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 277905
* IfConverter: Split ScanInstructions into 2 functions.Kyle Butt2016-08-061-13/+27
| | | | | | | | | | ScanInstructions is now 2 functions: AnalyzeBranches and ScanInstructions. ScanInstructions also now takes a pair of arguments delimiting the instructions to be scanned. This will be used for forked diamond support to re-scan only a portion of the block. llvm-svn: 277904
* IfConversion: Document countDuplicatedInstructions. NFCKyle Butt2016-08-061-0/+12
| | | | llvm-svn: 277903
* IfConversion: factor out 2 functions to skip debug instrs. NFCKyle Butt2016-08-061-24/+32
| | | | | | Skipping debug instructions occurs repeatedly; factor it out. llvm-svn: 277902
* Revert "[LoopSimplify] Fix updating LCSSA after separating nested loops."Michael Zolotukhin2016-08-061-15/+0
| | | | | | | This reverts commit r277877. Try to appease clang-x64-ninja-win7 buildbot. llvm-svn: 277901
* [ORC] Add (partial) weak symbol support to the CompileOnDemand layer.Lang Hames2016-08-061-1/+1
| | | | | | | | | | | | | | This adds partial support for weak functions to the CompileOnDemandLayer by modifying the addLogicalModule method to check for existing stub definitions before building a new stub for a weak function. This scheme is sufficient to support ODR definitions, but fails for general weak definitions if a strong definition is encountered after the first weak definition. (A more extensive refactor will be required to fully support weak symbols.) This patch does *not* add weak symbol support to RuntimeDyld: I hope to add that in the near future. llvm-svn: 277896
* [IRCE] Remove unused headers; NFCSanjoy Das2016-08-061-7/+0
| | | | llvm-svn: 277892
* [IRCE] Preserve loop-simplify formSanjoy Das2016-08-061-0/+2
| | | | | | | | Fixes PR28764. Right now there is no way to test this, but (as mentioned on the PR) with Michael Zolotukhin's yet-to-be-checked-in LoopSimplify verifier, 8 of the llvm-lit tests for IRCE crash. llvm-svn: 277891
* [InstCombine] refactor ctlz/cttz folds (NFCI)Sanjay Patel2016-08-051-34/+33
| | | | | | | | | | Note that this fold really belongs in InstSimplify. Refactoring here anyway as an intermediate step because there's a planned addition to this function in D23134. Differential Revision: https://reviews.llvm.org/D23223 llvm-svn: 277883
* [MSSA] Use depth first iterator instead of custom version.Daniel Berlin2016-08-051-19/+3
| | | | | | | | | | | | | | | | | Summary: Originally the plan was to use the custom worklist to do some block popping, and because we don't actually need a visited set. The custom one we have here is slightly broken, and it's not worth fixing vs using depth_first_iterator since we aren't going to go the route we originally were. Fixes PR28874 Reviewers: george.burgess.iv Subscribers: llvm-commits, gberry Differential Revision: https://reviews.llvm.org/D23187 llvm-svn: 277880
* CodeView: Remove an unused variableJustin Bogner2016-08-051-1/+0
| | | | | | It was breaking the -Werror build. llvm-svn: 277878
* [LoopSimplify] Fix updating LCSSA after separating nested loops.Michael Zolotukhin2016-08-051-0/+15
| | | | | | | | | This fixes PR28825. The problem was that we only checked whether values from a created inner loop are used in the outer loop, and fixed LCSSA for them. But we failed to fix up LCSSA for values used in exits of the outer loop. llvm-svn: 277877
* Fix non portable include path.Zachary Turner2016-08-051-1/+1
| | | | llvm-svn: 277876
* [MSSA] Match assert vs llvm_unreachable style in verification functions.Daniel Berlin2016-08-051-11/+12
| | | | llvm-svn: 277873
* Rewrite domination verifier to handle local domination as well.Daniel Berlin2016-08-051-38/+19
| | | | | | | | | | | | | | Summary: Rewrite domination verifier to handle local domination as well. This catches a bug Geoff Berry noticed. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23184 llvm-svn: 277872
* [CodeView] Decouple record deserialization from visitor dispatch.Zachary Turner2016-08-0512-207/+299
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, our use case for the visitor has been to take a stream of bytes representing a type stream, deserialize the records in sequence, and do something with them, where "something" is determined by how the user implements a particular set of callbacks on an abstract class. For actually writing PDBs, however, we want to do the reverse. We have some kind of description of the list of records in their in-memory format, and we want to process each one. Perhaps by serializing them to a byte stream, or perhaps by converting them from one description format (Yaml) to another (in-memory representation). This was difficult in the current model because deserialization and invoking the callbacks were tightly coupled. With this patch we change this so that TypeDeserializer is itself an implementation of the particular set of callbacks. This decouples deserialization from the iteration over a list of records and invocation of the callbacks. TypeDeserializer is initialized with another implementation of the callback interface, so that upon deserialization it can pass the deserialized record through to the next set of callbacks. In a sense this is like an implementation of the Decorator design pattern, where the Deserializer is a decorator. This will be useful for writing Pdbs from yaml, where we have a description of the type records in Yaml format. In this case, the visitor implementation would have each visitation callback method implemented in such a way as to extract the proper set of fields from the Yaml, and it could maintain state that builds up a list of these records. Finally at the end we can pass this information through to another set of callbacks which serializes them into a byte stream. Reviewed By: majnemer, ruiu, rnk Differential Revision: https://reviews.llvm.org/D23177 llvm-svn: 277871
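The decoupling described above can be sketched as a small decorator chain. This is a minimal sketch under stated assumptions: every type and function name below is a hypothetical stand-in, not one of the real LLVM CodeView classes. The deserializer implements the same callback interface as everything else and forwards each decoded record to the next stage, so the loop over records knows nothing about deserialization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; these are not the real LLVM CodeView types.
struct RawRecord { uint16_t kind; std::vector<uint8_t> bytes; };
struct ParsedRecord { uint16_t kind; std::size_t payloadSize; };

// The callback interface that every stage implements.
struct RecordCallbacks {
  virtual ~RecordCallbacks() = default;
  virtual void visitRaw(const RawRecord &) {}
  virtual void visitParsed(const ParsedRecord &) {}
};

// Decorator: "deserializes" each raw record (here it only measures the
// payload) and passes the result through to the next set of callbacks.
class Deserializer : public RecordCallbacks {
  RecordCallbacks &next;

public:
  explicit Deserializer(RecordCallbacks &n) : next(n) {}
  void visitRaw(const RawRecord &r) override {
    next.visitParsed(ParsedRecord{r.kind, r.bytes.size()});
  }
};

// A terminal stage that simply collects what it is given.
struct Collector : RecordCallbacks {
  std::vector<ParsedRecord> seen;
  void visitParsed(const ParsedRecord &p) override { seen.push_back(p); }
};

// Iteration is independent of what the callbacks do with each record.
void visitAll(const std::vector<RawRecord> &records, RecordCallbacks &cb) {
  for (const RawRecord &r : records)
    cb.visitRaw(r);
}
```

A caller would build `Collector sink; Deserializer stage(sink);` and pass `stage` to `visitAll`. Swapping `Collector` for a stage that serializes records changes the behavior without touching the loop.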
* AMDGPU/SI: Increase SGPR limit to 96 on Tonga/IcelandMarek Olsak2016-08-051-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the setting of the Vulkan closed source driver. It decreases the max wave count from 10 to 8. 26010 shaders in 14650 tests Totals: VGPRS: 829593 -> 808440 (-2.55 %) Spilled SGPRs: 81878 -> 42226 (-48.43 %) Spilled VGPRs: 367 -> 358 (-2.45 %) Scratch VGPRs: 1764 -> 1748 (-0.91 %) dwords per thread Code Size: 36677864 -> 35923932 (-2.06 %) bytes There is a massive decrease in SGPR spilling in general and -7.4% spilled VGPRs for DiRT Showdown (= SGPRs spilled to scratch?) Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23034 llvm-svn: 277867
* [ARM] Constant Materialize: imms with specific value can be encoded into mov.wWeiming Zhao2016-08-051-1/+3
| | | | | | | | | | | | | | | | | | Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. I'm resubmitting this patch. The test case in the original commit r277610 does not specify a triple, so builds with a different default triple will have different output. This patch fixes the triple as thumb-darwin-apple. Reviewers: john.brawn, jmolloy, bruno Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277865
* [FlattenCFG] Simplify + remove unused variable. NFCI.Davide Italiano2016-08-051-7/+2
| | | | llvm-svn: 277864
* Remove cold callsite heuristic that is not necessary because of cold callee ↵Dehao Chen2016-08-051-7/+5
| | | | | | heuristic. llvm-svn: 277863
* Replace hot-callsite based heuristic to use its own threshold parameter ↵Dehao Chen2016-08-051-6/+17
| | | | | | | | | | | | | | instead of share inline-hint parameter Summary: Hot callsites should have higher threshold than inline hints. This patch uses separate threshold parameter for hot callsites. Reviewers: davidxl, eraman Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D22368 llvm-svn: 277860
* [sanitizers] trace buffer API to use user-allocated buffer.Mike Aizatsky2016-08-053-27/+53
| | | | | | Differential Revision: https://reviews.llvm.org/D23185 llvm-svn: 277859
* WholeProgramDevirt: print remarks with devirtualized method names.Ivan Krasin2016-08-051-2/+18
| | | | | | | | | | | | | | | | | | Summary: Chrome on Linux uses WholeProgramDevirt for speed-ups, and it's important to detect regressions on both sides: the toolchain, if fewer methods get devirtualized after an update, and Chrome, if an innocent-looking change caused many hot methods to become virtual again. The need to track devirtualized methods is not Chrome-specific, but it's probably the only user of the pass at this time. Reviewers: kcc Differential Revision: https://reviews.llvm.org/D23219 llvm-svn: 277856
* [ADCE] Refactoring for new functionality (NFC)David Callahan2016-08-051-46/+84
| | | | | | | | | | | | | | Summary: This is another refactoring to break up the one function into three logical component functions. Another non-functional change before we start adding in features. Reviewers: nadav, mehdi_amini, majnemer Subscribers: twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D23102 llvm-svn: 277855
* [ConstantFolding] Don't create illegal (non-integral) inttoptrsSanjoy Das2016-08-051-3/+4
| | | | | | | | | | Reviewers: majnemer, arsenm Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23182 llvm-svn: 277854
* [AutoFDO] Fix handling of empty profilesDavid Callahan2016-08-051-1/+4
| | | | | | | | | | | | | | | Summary: If a profile has no samples for a function, then the function "entry count" is set to the value 0. Several places in the code test that if the Function::getEntryCount is defined at all. Here we change to treat a 0 entry count the same as undefined. In particular, this fixes a problem in getLayoutSuccessorProbThreshold in MachineBlockPlacement.cpp where we use a different and inferior heuristic for laying out basic blocks. Reviewers: danielcdh, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23082 llvm-svn: 277849
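The rule described above, treating a zero entry count the same as a missing one, can be sketched as follows. The helper name is hypothetical and `std::optional` merely stands in for whatever Function::getEntryCount returns; this is not the patched LLVM code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not an LLVM API: an entry count is treated as usable
// profile data only when it is present and non-zero. A count of 0 means the
// profile had no samples for the function, so it is handled like "undefined".
bool hasUsableEntryCount(std::optional<uint64_t> entryCount) {
  return entryCount.has_value() && *entryCount != 0;
}
```

Callers that previously tested only for the presence of a count would test this predicate instead, so functions without samples fall back to the non-profile heuristics.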
* [SCEV] Don't infinitely recurse on unreachable codeSanjoy Das2016-08-051-1/+3
| | | | llvm-svn: 277848
* Add the first of what will be a long line of additional error checks for ↵Kevin Enderby2016-08-051-0/+5
| | | | | | | | | | | | | invalid Mach-O files. This is where an LC_SEGMENT load command has a fileoff field that extends past the end of the file. Also fix llvm-nm and llvm-size to remove the errorToErrorCode() call so error messages are printed. And needed to update a few test cases now that they do print the error messages just a bit differently. llvm-svn: 277845
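The kind of check described above might look like the sketch below. The struct and function names are hypothetical, and only the fileoff comparison named in the commit message is modeled; the real load command has more fields and the real code reports an error rather than returning a bool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, trimmed-down view of a 32-bit segment load command.
struct SegmentCommandView {
  uint32_t fileoff;  // offset of the segment's data within the file
  uint32_t filesize; // number of bytes the segment occupies in the file
};

// Returns true when the segment's fileoff field points past the end of a
// file of the given size, which marks the file as malformed.
bool fileOffPastEndOfFile(const SegmentCommandView &seg, uint64_t fileSize) {
  return seg.fileoff > fileSize;
}
```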
* Do not assign new discriminator for all intrinsics.Dehao Chen2016-08-051-2/+2
| | | | | | | | | | | | Summary: We do not care about intrinsic calls when assigning discriminators. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23212 llvm-svn: 277843
* GlobalISel: clear pending phis after MachineFunction translatedTim Northover2016-08-051-0/+4
| | | | | | | Test is just reordering the existing functions (it would trigger for any function after one with a phi). llvm-svn: 277841
* [X86][SSE] Add initial support for 2 input target shuffle combining.Simon Pilgrim2016-08-051-52/+68
| | | | | | At the moment only the INSERTPS matching can actually use 2 inputs but the plumbing is now in place. llvm-svn: 277839
* GlobalISel: IRTranslate PHI instructionsTim Northover2016-08-052-0/+33
| | | | llvm-svn: 277835
* [PowerPC] Wrong fast-isel codegen for VSX floating-point loadsUlrich Weigand2016-08-051-12/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled. This can cause invalid registers to be used in certain cases, like: lfd 36, ... instead of using a VSX load instruction. The wrong register number gets silently truncated, causing invalid code to be generated. The first place is PPCFastISel::PPCEmitLoad, which had multiple problems: 1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they are computed from resultReg, which is still zero at this point in many cases. Fixed by changing the helper routines to operate on a register class instead of a register and passing in UseRC. 2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo: bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS; bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD; The second line needs to use isVSFRC (like PPCEmitStore does). 3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong. Fixed by copying the code handling the same issue in PPCEmitStore. The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class. Fixed by hardcoding a F8RC class even on systems supporting VSX. Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630 Differential Revision: https://reviews.llvm.org/D22632 llvm-svn: 277823
* [SystemZ] Add missing classes and instructionsZhan Jun Liau2016-08-052-0/+104
| | | | | | | | | | | | | | | | Summary: Add instruction formats E, RSI, SSd, SSE, and SSF. Added BRXH, BRXLE, PR, MVCK, STRAG, and ECTG instructions to test out those formats. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23179 llvm-svn: 277822
* [SimplifyCFG] Make range reduction code deterministic.Benjamin Kramer2016-08-051-2/+3
| | | | | | | | | | | This generated IR based on the order of evaluation, which is different between GCC and Clang. With that in mind you get bootstrap miscompares if you compare a Clang built with GCC-built Clang vs. Clang built with Clang-built Clang. Diagnosing that made my head hurt. This also reverts commit r277337, which "fixed" the test case. llvm-svn: 277820
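The class of bug described above comes from C++ leaving the evaluation order of function arguments unspecified, so two compilers may run side-effecting argument expressions in different orders. The sketch below is a generic illustration with hypothetical names, not the SimplifyCFG code: it fixes the order by evaluating each operand in its own statement.

```cpp
#include <cassert>
#include <vector>

// Has a visible side effect: records the order in which it was called.
int emitValue(std::vector<int> &log, int tag) {
  log.push_back(tag);
  return tag;
}

int combine(int a, int b) { return a * 10 + b; }

// Order-dependent form (avoid): in
//   combine(emitValue(log, 1), emitValue(log, 2));
// the two calls may run in either order, so `log` can differ by compiler.

// Deterministic form: each call is a separate full expression, so the first
// is sequenced before the second on every conforming compiler.
std::vector<int> emitInFixedOrder() {
  std::vector<int> log;
  int first = emitValue(log, 1);
  int second = emitValue(log, 2);
  (void)combine(first, second);
  return log;
}
```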
* [X86][SSE] Update the target shuffle matches to use the effective mask's ↵Simon Pilgrim2016-08-051-31/+29
| | | | | | | | value type directly instead of via the input value type. Preparation for adding 2 input support so we want to avoid unnecessary references to the input value type. llvm-svn: 277817
* [X86][SSE] Consistently use the target shuffle root value type for vector ↵Simon Pilgrim2016-08-051-11/+12
| | | | | | | | size calculations. NFCI. Preparation for adding 2 input support so we want to avoid unnecessary references to the input value type. llvm-svn: 277814
* LLLexer.cpp: Avoid using BitsToDouble() to preserve SNaN like "double ↵NAKAMURA Takumi2016-08-051-1/+2
| | | | | | | | | 0x7FF4000000000000". We should not use double (or float) in the LLVM, unless it is really needed. x87 FP register doesn't preserve SNaN to move the value. FIXME: APFloat() may have the constructor by raw bit. llvm-svn: 277813
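One way to see why the raw bits matter: a signaling NaN can be recognized, and kept intact, by working on the 64-bit integer pattern alone, without ever loading it into a floating-point register. The helper below is a hypothetical sketch, not the LLLexer change, and it assumes the common binary64 convention (used on x86) in which a set most-significant fraction bit marks a quiet NaN.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the 64-bit pattern encodes a binary64
// signaling NaN, assuming the "top fraction bit set means quiet" convention.
bool isSignalingNaNBits(uint64_t bits) {
  const uint64_t expMask = 0x7FF0000000000000ULL;  // all exponent bits
  const uint64_t fracMask = 0x000FFFFFFFFFFFFFULL; // all fraction bits
  const uint64_t quietBit = 0x0008000000000000ULL; // top fraction bit
  return (bits & expMask) == expMask && // exponent all ones: NaN or infinity
         (bits & fracMask) != 0 &&      // non-zero fraction: NaN, not infinity
         (bits & quietBit) == 0;        // quiet bit clear: signaling
}
```

With this predicate, the pattern 0x7FF4000000000000 from the commit title classifies as signaling, while 0x7FF8000000000000 is a quiet NaN and 0x7FF0000000000000 is infinity.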
* Reformat.NAKAMURA Takumi2016-08-051-1/+1
| | | | llvm-svn: 277812
* [X86][SSE] Added target shuffle combine binary compute matching function. NFCI.Simon Pilgrim2016-08-051-72/+80
| | | | | | Added matchBinaryPermuteVectorShuffle and moved the blend+zero and insertps matching code into it. llvm-svn: 277808
* Reapply r276973 "Adjust Registry interface to not require plugins to export ↵John Brawn2016-08-052-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a registry" This differs from the previous version by being more careful about template instantiation/specialization in order to prevent errors when building with clang -Werror. Specifically: * begin is not defined in the template and is instead instantiated when Head is. I think the warning when we don't do that is wrong (PR28815) but for now at least do it this way to avoid the warning. * Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY, provide a template definition then do explicit instantiation. No compiler I've tried has problems with doing it the other way, but strictly speaking it's not permitted by the C++ standard, so better safe than sorry. Original commit message: Currently the Registry class contains the vestiges of a previous attempt to allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a plugin would have its own copy of a registry and export it to be imported by the tool that's loading the plugin. This only works if the plugin is entirely self-contained with the only interface between the plugin and tool being the registry, and in particular this conflicts with how IR pass plugins work. This patch changes things so that instead the add_node function of the registry is exported by the tool and then imported by the plugin, which solves this problem and also means that instead of every plugin having to export every registry they use, LLVM only has to export the add_node functions. This allows plugins that use a registry to work on Windows if LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used. llvm-svn: 277806
* [PowerPC] fix passing long double arguments to function (soft-float)Strahinja Petrovic2016-08-053-0/+39
| | | | | | | | This patch fixes passing long double type arguments to a function in soft-float mode. If fewer than 4 argument registers are free (the long double type is mapped to 4 GPRs in soft-float mode), a long double argument must be passed through the stack. Differential Revision: https://reviews.llvm.org/D20114. llvm-svn: 277804
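The placement rule stated above can be sketched as a single decision. The names are hypothetical and this is not the PowerPC calling-convention code; it only models the condition from the commit message, that a soft-float long double needs four free argument GPRs and otherwise goes on the stack.

```cpp
#include <cassert>

enum class ArgLocation { Registers, Stack };

// Hypothetical helper: decides where a soft-float long double argument goes,
// given how many argument GPRs are still free. The value is never split
// between registers and the stack.
ArgLocation placeSoftFloatLongDouble(unsigned freeArgRegs) {
  const unsigned regsNeeded = 4; // long double maps to 4 GPRs in soft-float
  return freeArgRegs >= regsNeeded ? ArgLocation::Registers
                                   : ArgLocation::Stack;
}
```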