summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Refactor replaceDominatedUsesWith to have a flag to control whether to ↵Dehao Chen2016-09-012-4/+6
| | | | | | | | | | | | | | replace uses in BB itself. Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking. Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24170 llvm-svn: 280427
* Refactor LICM pass in preparation for LoopSink pass.Dehao Chen2016-09-011-21/+23
| | | | | | | | | | | | Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24168 llvm-svn: 280425
* [Legalizer] Don't throw away false low half when expanding GT/LT SETCCMichael Kuperstein2016-09-011-9/+9
| | | | | | | | | | | When expanding a SETCC for which the low half is known to evaluate to false, we can only throw it away for LT/GT comparisons, not LE/GE. This fixes PR29170. Differential Revision: https://reviews.llvm.org/D24151 llvm-svn: 280424
* [SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizesMichael Kuperstein2016-09-012-45/+120
| | | | | | | | | | | | | | | | | | | | | | | | | Prior to this, we could generate a vector_shuffle from an IR shuffle when the size of the result was exactly the sum of the sizes of the input vectors. If the output vector was narrower - e.g. a <12 x i8> being formed by a shuffle with two <8 x i8> inputs - we would lower the shuffle to a sequence of extracts and inserts. Instead, we can form a larger vector_shuffle, and then extract a subvector of the right size - e.g. shuffle the two <8 x i8> inputs into a <16 x i8> and then extract a <12 x i8>. This also includes a target-specific X86 combine that in the presence of AVX2 combines: (vector_shuffle <mask> (concat_vectors t1, undef) (concat_vectors t2, undef)) into: (vector_shuffle <mask> (concat_vectors t1, t2), undef) in cases where this allows us to form VPERMD/VPERMQ. (This is not a separate commit, as that pattern does not appear without the DAGBuilder change.) llvm-svn: 280418
* [WebAssembly] Add asm.js-style setjmp/longjmp handling for wasm (reland r280302)Heejin Ahn2016-09-012-165/+776
| | | | | | | | | | | | Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism. Reviewers: jpp, dschuff Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D24121 llvm-svn: 280415
* GlobalISel: add a G_PHI instruction to give phis a type.Tim Northover2016-09-013-2/+10
| | | | | | | They're another source of generic vregs, which are going to need a type on the definition when we remove the register width from MachineRegisterInfo. llvm-svn: 280412
* Fix the ASan fuse-lld.cc test after LLD r280012Reid Kleckner2016-09-011-1/+1
| | | | | | | | | | | With that change, images built with 'lld-link /debug' always have a debug directory. If no PDB filename was passed on the command line, then the filename in the executable is empty. PDB information would never work anyway if the PDB file name is empty, so go ahead and try DWARF in that case. llvm-svn: 280410
* [LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI)Matthew Simpson2016-09-011-22/+9
| | | | | | | | | We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer have to create temporary vectors with insertelement instructions when handling pointer induction variables. This case was mistakenly missed from r279649 when refactoring the other scalarization code. llvm-svn: 280405
* [X86] Loosen memory folding requirements for cvtdq2pd and cvtps2pd instructions.Andrey Turetskiy2016-09-011-2/+2
| | | | | | | | | According to spec cvtdq2pd and cvtps2pd instructions don't require memory operand to be aligned to 16 bytes. This patch removes this requirement from the memory folding table. Differential Revision: https://reviews.llvm.org/D23919 llvm-svn: 280402
* AMDGPU: Add runtime metadata for pointee alignment of argument.Yaxun Liu2016-09-012-1/+8
| | | | | | | | Add runtime metdata for pointee alignment of pointer type kernel argument. The key is KeyArgPointeeAlign and the value is a 32 bit unsigned integer. Differential Revision: https://reviews.llvm.org/D24145 llvm-svn: 280399
* [lib/LTO] Simplify a bit. NFCI.Davide Italiano2016-09-011-4/+1
| | | | llvm-svn: 280396
* Rename some variables to have meaningful names. NFC.Michael Kuperstein2016-09-011-29/+29
| | | | llvm-svn: 280391
* [LV] Move VectorParts allocation and mapping into PHI widening (NFC)Matthew Simpson2016-09-011-29/+38
| | | | | | | | | | | | | | | | This patch moves the allocation of VectorParts for PHI nodes into the actual PHI widening code. Previously, we allocated these VectorParts in vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon returning, we would then map the VectorParts in VectorLoopValueMap. This behavior is problematic for the cases where we only want to generate a scalar version of a PHI node. For example, if in the future we only generate a scalar version of an induction variable, we would end up inserting an empty vector entry into the map once we return to vectorizeBlockInLoop. We now no longer need to pass VectorParts to the various PHI widening functions, and we can keep VectorParts allocation as close as possible to the point at which they are actually mapped in VectorLoopValueMap. llvm-svn: 280390
* [codeview] Properly propagate the TypeLeafKind through the pipeline.Zachary Turner2016-09-011-3/+3
| | | | llvm-svn: 280388
* [DAGCombine] Don't fold a trunc if it feeds an anyextMichael Kuperstein2016-09-011-0/+4
| | | | | | | | | | | | | | | Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend). Differential Revision: https://reviews.llvm.org/D23893 llvm-svn: 280386
* AMDGPU/SI: MIMG TD Refactoring.Changpeng Fang2016-09-013-744/+751
| | | | | | | | | | | | | | Summary: Created a new td file MIMGInstructions.td which contains all definitions of MIMG related instructions. Reviewed by: kzhuravl, vpykhtin Differential Revision: http://reviews.llvm.org/D24106 llvm-svn: 280385
* [EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSAGeoff Berry2016-09-011-2/+6
| | | | | | | | | | Previous change broke the C API for creating an EarlyCSE pass w/ MemorySSA by adding a bool parameter to control whether MemorySSA was used or not. This broke the OCaml bindings. Instead, change the old C API entry point back and add a new one to request an EarlyCSE pass with MemorySSA. llvm-svn: 280379
* [mips] Include missed file from previous commitSimon Dardis2016-09-011-0/+1044
| | | | llvm-svn: 280377
* [X86][SSE] Dropped (V)CVTPD2PS intrinsic patterns now that its bound to ↵Simon Pilgrim2016-09-011-11/+4
| | | | | | | | | | X86vfpround It now uses X86vfpround patterns directly instead. Followup to D23797 llvm-svn: 280376
* [mips] interAptiv based generic schedule modelSimon Dardis2016-09-013-3/+8
| | | | | | | | | | | This scheduler describes a processor which covers all MIPS ISAs based around the interAptiv and P5600 timings. Reviewers: vkalintiris, dsanders Differential Revision: https://reviews.llvm.org/D23551 llvm-svn: 280374
* [InstCombine] remove fold of an icmp pattern that should never happenSanjay Patel2016-09-011-15/+0
| | | | | | | | | | | | While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger this code path. The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic() or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast(). Differential Revision: https://reviews.llvm.org/D24031 llvm-svn: 280370
* [Hexagon] Deal with undefs when extending live intervalsKrzysztof Parzyszek2016-09-011-1/+1
| | | | | | Reapply r280275, since MSVC accepts r280358. llvm-svn: 280369
* Optimized FMA intrinsic + FNEG , likeElena Demikhovsky2016-09-011-21/+86
| | | | | | | | | | | | | -(a*b+c) and FNEG + FMA, like a*b-c or (-a)*b+c. The bug description is here : https://llvm.org/bugs/show_bug.cgi?id=28892 Differential revision: https://reviews.llvm.org/D23313 llvm-svn: 280368
* [SimplifyCFG] Handle tail-sinking of more than 2 incoming branchesJames Molloy2016-09-011-28/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted. As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider: if (a) x(1); else if (b) x(2); This produces the following CFG: [if] / \ [x(1)] [if] | | \ | | \ | [x(2)] | \ | / [ end ] [end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch. We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects). We are now able to detect this case and split off the unconditional arcs to a common successor: [if] / \ [x(1)] [if] | | \ | | \ | [x(2)] | \ / | [sink.split] | \ / [ end ] Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases. llvm-svn: 280364
* Add an optional parameter with a list of undefs to extendToIndicesKrzysztof Parzyszek2016-09-011-2/+3
| | | | | | Reapply r280268, hopefully in a version that MSVC likes. llvm-svn: 280358
* [IR] Properly handle escape characters in Attribute::getAsString()Honggyu Kim2016-09-012-7/+13
| | | | | | | | | | | | | | | | | | | | | If an attribute name has special characters such as '\01', it is not properly printed in LLVM assembly language format. Since the format expects the special characters are printed as it is, it has to contain escape characters to make it printable. Before: attributes #0 = { ... "counting-function"="^A__gnu_mcount_nc" ... After: attributes #0 = { ... "counting-function"="\01__gnu_mcount_nc" ... Reviewers: hfinkel, rengolin, rjmccall, compnerd Subscribers: nemanjai, mcrosier, hans, shenhan, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D23792 llvm-svn: 280357
* [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEndJames Molloy2016-09-011-90/+149
| | | | | | | | | | | | | | | | | | | | | | | | | r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything. This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink. This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example: %a = load i32* %b %d = load i32* %b %c = gep i32* %a, i32 0 %e = gep i32* %d, i32 1 Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor). This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough. Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm. In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging. In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive. This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans. llvm-svn: 280351
* Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPCHal Finkel2016-09-017-32/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- We can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces a ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and updates PowerPC to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 llvm-svn: 280350
* [AMDGPU] Scalar Memory instructions TD refactoringValery Pykhtin2016-09-017-398/+431
| | | | | | Differential revision: https://reviews.llvm.org/D23996 llvm-svn: 280349
* Add a counter-function insertion passHal Finkel2016-09-014-0/+67
| | | | | | | | | | | | | | | | | | As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong. Because we don't want the presence of these functions to affect optimizaton, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism. Differential Revision: https://reviews.llvm.org/D22825 llvm-svn: 280347
* [Support] Fix a warning introduced in r280339 due to the memberChandler Carruth2016-09-011-1/+1
| | | | | | | | | | | | initializers not being in the same order as the members. Specifically, 'preg' is the first member followed by 'error', so they will be initialized in that order and should be written in the member initializer list in that order. For the constructor in question, there is no change in behavior. llvm-svn: 280345
* [SimplifyCFG] Fix nondeterministic iteration orderJames Molloy2016-09-011-2/+2
| | | | | | | | We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet. Should fix stage3 convergence builds. llvm-svn: 280342
* [LLVM/Support] - Create no-arguments constructor for llvm::RegexGeorge Rimar2016-09-011-0/+2
| | | | | | | | | This is useful when need to defer the construction, e.g. using Regex as a member of class. Differential revision: https://reviews.llvm.org/D24101 llvm-svn: 280339
* [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more casesJames Molloy2016-09-011-6/+21
| | | | | | | | | | | | | | | | | | | A very important case is not handled here: multiple arcs to a single block with a PHI. Consider: a: %1 = icmp %b, 1 br %1, label %c, label %e c: %2 = icmp %b, 2 br %2, label %d, label %e d: br %e e: phi [0, %a], [1, %c], [2, %d] FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs. llvm-svn: 280338
* [NFC] Remove unnecessary commentDean Michael Berris2016-09-011-4/+2
| | | | llvm-svn: 280336
* [XRay] Detect and emit sleds for sibling/tail callsDean Michael Berris2016-09-012-3/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change promotes the 'isTailCall(...)' member function to TargetInstrInfo as a query interface for determining on a per-target basis whether a given MachineInstr is a tail call instruction. We build upon this in the XRay instrumentation pass to emit special sleds for tail call optimisations, where we emit the correct kind of sled. The tail call sleds look like a mix between the function entry and function exit sleds. Form-wise, the sled comes before the "jmp" instruction that implements the tail call similar to how we do it for the function entry sled. Functionally, because we know this is a tail call, it behaves much like an exit sled -- i.e. at runtime we may use the exit trampolines instead of a different kind of trampoline. A follow-up change to recognise these sleds will be done in compiler-rt, so that we can start intercepting these initially as exits, but also have the option to have different log entries to more accurately reflect that this is actually a tail call. Reviewers: echristo, rSerge, majnemer Subscribers: mehdi_amini, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D23986 llvm-svn: 280334
* [libFuzzer] add -minimize_crash flag (to minimize crashers). also add two ↵Kostya Serebryany2016-09-015-8/+115
| | | | | | tests that I failed to commit last time llvm-svn: 280332
* [XRay][NFC] Promote isTailCall() as virtual in TargetInstrInfo.Dean Michael Berris2016-09-013-1/+29
| | | | | | | This change is broken out from D23986, where XRay detects tail call exits. llvm-svn: 280331
* Revert "Add asm.js-style setjmp/longjmp handling for wasm"Heejin Ahn2016-09-012-771/+165
| | | | | | This reverts commit r280302, it broke the integration tests. llvm-svn: 280329
* Add cast to appease windows builder. Fixes build break introduced in r280306.Nick Lewycky2016-08-311-1/+1
| | | | llvm-svn: 280311
* [codeview] Have visitTypeBegin return the record type.Zachary Turner2016-08-314-12/+20
| | | | | | | | | | | | | | | | | | | Previously we were assuming that any visitation of types would necessarily be against a type we had binary data for. Reasonable assumption when were just reading PDBs and dumping them, but once we start writing PDBs from Yaml this breaks down, because we have no binary data yet, only Yaml, and from that we need to read the record kind and perform the switch based on that. So this patch does that. Instead of having the visitor switch on the kind that is already in the CVType record, we change the visitTypeBegin() method to return the Kind, and switch on the returned value. This way, the default implementation can still return the value from the CVType, but the implementation which visits Yaml records and serializes binary PDB type records can use the field in the Yaml as the source of the switch. llvm-svn: 280307
* Add -fprofile-dir= to clang.Nick Lewycky2016-08-311-13/+30
| | | | | | | | | | | | | | | | | | | | | -fprofile-dir=path allows the user to specify where .gcda files should be emitted when the program is run. In particular, this is the first flag that causes the .gcno and .o files to have different paths, LLVM is extended to support this. -fprofile-dir= does not change the file name in the .gcno (and thus where lcov looks for the source) but it does change the name in the .gcda (and thus where the runtime library writes the .gcda file). It's different from a GCOV_PREFIX because a user can observe that the GCOV_PREFIX_STRIP will strip paths off of -fprofile-dir= but not off of a supplied GCOV_PREFIX. To implement this we split -coverage-file into -coverage-data-file and -coverage-notes-file to specify the two different names. The !llvm.gcov metadata node grows from a 2-element form {string coverage-file, node dbg.cu} to 3-elements, {string coverage-notes-file, string coverage-data-file, node dbg.cu}. In the 3-element form, the file name is already "mangled" with .gcno/.gcda suffixes, while the 2-element form left that to the middle end pass. llvm-svn: 280306
* Add asm.js-style setjmp/longjmp handling for wasmHeejin Ahn2016-08-313-166/+772
| | | | | | | | | | | | Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism. Reviewers: jpp, dschuff Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D23928 llvm-svn: 280302
* Revert "Add an optional parameter with a list of undefs to extendToIndices"Reid Kleckner2016-08-312-4/+3
| | | | | | | | | | | | This reverts commit r280268, it causes all MSVC 2013 to ICE. This appears to have been fixed in a later MSVC 2013 update, because I cannot reproduce it locally. That said, all upstream LLVM bots are broken right now, so I am reverting. Also reverts dependent change r280275, "[Hexagon] Deal with undefs when extending live intervals". llvm-svn: 280301
* [InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectorsSanjay Patel2016-08-311-5/+0
| | | | | | | The enhancement to foldICmpDivConstant ( http://llvm.org/viewvc/llvm-project?view=revision&revision=280299 ) allows us to remove the ConstantInt check; no other changes needed. llvm-svn: 280300
* [InstCombine] allow icmp (div X, Y), C folds for splat constant vectorsSanjay Patel2016-08-311-37/+26
| | | | | | Converting all of the overflow ops to APInt looked risky, so I've left that as a TODO. llvm-svn: 280299
* AMDGPU: Fix introducing stack access on unaligned v16i8Matt Arsenault2016-08-311-1/+8
| | | | llvm-svn: 280298
* AMDGPU: Use copy instead of mov during frame loweringMatt Arsenault2016-08-311-15/+6
| | | | | | | This occurs before RA pseudos are expanded. It's less code to emit the copy. llvm-svn: 280297
* AMDGPU: Refactor frame loweringMatt Arsenault2016-08-312-121/+181
| | | | | | This will make future changes easier. llvm-svn: 280296
* [codeview] Add TypeVisitorCallbackPipeline.Zachary Turner2016-08-313-12/+42
| | | | | | | | | | | | | | | | | We were kind of hacking this together before by embedding the ability to forward requests into the TypeDeserializer. When we want to start adding more different kinds of visitor callback interfaces though, this doesn't scale well and is very inflexible. So introduce the notion of a pipeline, which itself implements the TypeVisitorCallbacks interface, but which contains an internal list of other callbacks to invoke in sequence. Also update the existing uses of CVTypeVisitor to use this new pipeline class for deserializing records before visiting them with another visitor. llvm-svn: 280293
OpenPOWER on IntegriCloud