summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Add constrained intrinsics for some libm-equivalent operationsAndrew Kaylor2017-05-251-9/+62
| | | | | | Differential revision: https://reviews.llvm.org/D32319 llvm-svn: 303922
* Fix SelectionDAGBuilder::getDbgValue to not expect DW_OP_deref on FI varsAdrian Prantl2017-05-251-14/+5
| | | | | | | | | | | | This fixes an oversight in r300522, which changed alloca dbg.values to no longer emit a DW_OP_deref. The array.ll testcase was regenerated from source. Fixes PR33166: https://bugs.llvm.org/show_bug.cgi?id=33166 llvm-svn: 303897
* Revert LLVM changes for "Sema: allow imaginary constants via GNU extension ↵Tim Northover2017-05-231-4/+1
| | | | | | | | if UDL overloads not present." The changes accidentally crept into a Clang commit I was making. llvm-svn: 303697
* Sema: allow imaginary constants via GNU extension if UDL overloads not present.Tim Northover2017-05-231-1/+4
| | | | | | | | | | | | | C++14 added user-defined literal support for complex numbers so that you can write something like "complex<double> val = 2i". However, there is an existing GNU extension supporting this syntax and interpreting the result as a _Complex type. This changes parsing so that such literals are interpreted in terms of C++14's operators if an overload is present but otherwise falls back to the original GNU extension. llvm-svn: 303694
* IR: Give function GlobalValue::getRealLinkageName() a less misleading name: ↵Peter Collingbourne2017-05-161-2/+2
| | | | | | | | | | | | dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134
* [KnownBits] Add bit counting methods to KnownBits struct and use them where ↵Craig Topper2017-05-121-1/+1
| | | | | | | | | | | | possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925
* [CodeGen] Don't require AA in SDAGISel at -O0.Ahmed Bougacha2017-05-101-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Before r247167, the pass manager builder controlled which AA implementations were used, exporting them all in the AliasAnalysis analysis group. Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA implementations if made available in the pass pipeline. But regardless, SDAGISel is required at O0, and really doesn't need to be doing fancy optimizations based on useful AA results. Don't require AA at CodeGenOpt::None, and only use it otherwise. This does have a functional impact (and one testcase is pessimized because we can't reuse a load). But I think that's desirable no matter what. Note that this alone doesn't result in less DT computations: TwoAddress was previously able to reuse the DT we computed for SDAG. That will be fixed separately. Differential Revision: https://reviews.llvm.org/D32766 llvm-svn: 302611
* Re-land "Use the frame index side table for byval and inalloca arguments"Reid Kleckner2017-05-091-1/+9
| | | | | | This re-lands r302483. It was not the cause of PR32977. llvm-svn: 302544
* Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare"Reid Kleckner2017-05-091-14/+0
| | | | | | This re-lands commit r302461. It was not the cause of PR32977. llvm-svn: 302543
* Add extra operand to CALLSEQ_START to keep frame part set up previouslySerge Pavlov2017-05-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
* Introduce experimental generic intrinsics for horizontal vector reductions.Amara Emerson2017-05-091-0/+88
| | | | | | | | | | | | | | - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514
* Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare"Reid Kleckner2017-05-091-0/+14
| | | | | | | | | | This reverts commit r302461. It appears to be causing failures compiling gtest with debug info on the Linux sanitizer bot. I was unable to reproduce the failure locally, however. llvm-svn: 302504
* Revert "Use the frame index side table for byval and inalloca arguments"Reid Kleckner2017-05-091-9/+1
| | | | | | This reverts r302483 and it's follow up fix. llvm-svn: 302493
* Use the frame index side table for byval and inalloca argumentsReid Kleckner2017-05-081-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For inalloca functions, this is a very common code pattern: %argpack = type <{ i32, i32, i32 }> define void @f(%argpack* inalloca %args) { entry: %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0 %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1 %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2 tail call void @llvm.dbg.declare(metadata i32* %a, ... "a") tail call void @llvm.dbg.declare(metadata i32* %c, ... "b") tail call void @llvm.dbg.declare(metadata i32* %b, ... "c") Even though these GEPs can be simplified to a constant offset from EBP or RSP, we don't do that at -O0, and each GEP is computed into a register. Registers used to compute argument addresses are typically spilled and clobbered very quickly after the initial computation, so live debug variable tracking loses information very quickly if we use DBG_VALUE instructions. This change moves processing of dbg.declare between argument lowering and basic block isel, so that we can ask if an argument has a frame index or not. If the argument lives in a register as is the case for byval arguments on some targets, then we don't put it in the side table and during ISel we emit DBG_VALUE instructions. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32980 llvm-svn: 302483
* Don't add DBG_VALUE instructions for static allocas in dbg.declareReid Kleckner2017-05-081-14/+0
| | | | | | | | | | | | | | | | | | | | | | | Summary: An llvm.dbg.declare of a static alloca is always added to the MachineFunction dbg variable map, so these values are entirely redundant. They survive all the way through codegen to be ignored by DWARF emission. Effectively revert r113967 Two bugpoint-reduced test cases from 2012 broke as a result of this change. Despite my best efforts, I haven't been able to rewrite the test case using dbg.value. I'm not too concerned about the lost coverage because these were reduced from the test-suite, which we still run. Reviewers: aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32920 llvm-svn: 302461
* [XRay] Custom event logging intrinsicDean Michael Berris2017-05-081-0/+30
| | | | | | | | | | | | | | | | | | | | This patch introduces an LLVM intrinsic and a target opcode for custom event logging in XRay. Initially, its use case will be to allow users of XRay to log some type of string ("poor man's printf"). The target opcode compiles to a noop sled large enough to enable calling through to a runtime-determined relative function call. At runtime, when X-Ray is enabled, the sled is replaced by compiler-rt with a trampoline to the logic for creating the custom log entries. Future patches will implement the compiler-rt parts and clang-side support for emitting the IR corresponding to this intrinsic. Reviewers: timshen, dberris Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D27503 llvm-svn: 302405
* Simplify dbg.value handling in SDISel with early returnsReid Kleckner2017-05-051-35/+23
| | | | | | | | | | | | | | No functional change other than improving dbgs logging accuracy on constant dbg values. Previously we would add things like "i32 42" as debug values, and then log that we were dropping the debug info, which is silly. Delete some dead code that was checking for static allocas. This remained after r207165, but served no purpose. Currently, static alloca dbg.values are always sent through the DanglingDebugInfoMap, and are usually made valid the first time the alloca is used. llvm-svn: 302267
* [SelectionDAG] Improve support for promotion of <1 x fX> floating point ↵Simon Pilgrim2017-05-021-3/+3
| | | | | | | | | | | | argument types (PR31088) PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats. This patch adds support for extension/truncation for both integer and float types. Differential Revision: https://reviews.llvm.org/D32391 llvm-svn: 301910
* Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.Amara Emerson2017-05-011-10/+10
| | | | | | | | This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803
* Use Argument::hasAttribute and AttributeList::ReturnIndex moreReid Kleckner2017-04-281-22/+18
| | | | | | | | | | | This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666
* [InlineCost] Improve the cost heuristic for SwitchJun Bum Lim2017-04-281-98/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The motivation example is like below which has 13 cases but only 2 distinct targets ``` lor.lhs.false2: ; preds = %if.then switch i32 %Status, label %if.then27 [ i32 -7012, label %if.end35 i32 -10008, label %if.end35 i32 -10016, label %if.end35 i32 15000, label %if.end35 i32 14013, label %if.end35 i32 10114, label %if.end35 i32 10107, label %if.end35 i32 10105, label %if.end35 i32 10013, label %if.end35 i32 10011, label %if.end35 i32 7008, label %if.end35 i32 7007, label %if.end35 i32 5002, label %if.end35 ] ``` which is compiled into a balanced binary tree like this on AArch64 (similar on X86) ``` .LBB853_9: // %lor.lhs.false2 mov w8, #10012 cmp w19, w8 b.gt .LBB853_14 // BB#10: // %lor.lhs.false2 mov w8, #5001 cmp w19, w8 b.gt .LBB853_18 // BB#11: // %lor.lhs.false2 mov w8, #-10016 cmp w19, w8 b.eq .LBB853_23 // BB#12: // %lor.lhs.false2 mov w8, #-10008 cmp w19, w8 b.eq .LBB853_23 // BB#13: // %lor.lhs.false2 mov w8, #-7012 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_14: // %lor.lhs.false2 mov w8, #14012 cmp w19, w8 b.gt .LBB853_21 // BB#15: // %lor.lhs.false2 mov w8, #-10105 add w8, w19, w8 cmp w8, #9 // =9 b.hi .LBB853_17 // BB#16: // %lor.lhs.false2 orr w9, wzr, #0x1 lsl w8, w9, w8 mov w9, #517 and w8, w8, w9 cbnz w8, .LBB853_23 .LBB853_17: // %lor.lhs.false2 mov w8, #10013 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_18: // %lor.lhs.false2 mov w8, #-7007 add w8, w19, w8 cmp w8, #2 // =2 b.lo .LBB853_23 // BB#19: // %lor.lhs.false2 mov w8, #5002 cmp w19, w8 b.eq .LBB853_23 // BB#20: // %lor.lhs.false2 mov w8, #10011 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_21: // %lor.lhs.false2 mov w8, #14013 cmp w19, w8 b.eq .LBB853_23 // BB#22: // %lor.lhs.false2 mov w8, #15000 cmp w19, w8 b.ne .LBB853_3 ``` However, the inline cost model estimates the cost to be linear with the number of distinct targets and the cost of the above switch is just 2 InstrCosts. The function containing this switch is then inlined about 900 times. This change use the general way of switch lowering for the inline heuristic. It etimate the number of case clusters with the suitability check for a jump table or bit test. Considering the binary search tree built for the clusters, this change modifies the model to be linear with the size of the balanced binary tree. The model is off by default for now : -inline-generic-switch-cost=false This change was originally proposed by Haicheng in D29870. Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier Reviewed By: hans Subscribers: joerg, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31085 llvm-svn: 301649
* [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and ↵Craig Topper2017-04-281-1/+1
| | | | | | | | | | | | simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620
* [SelectionDAG] Use getBuildVector helper where possible. NFCISimon Pilgrim2017-04-251-7/+6
| | | | llvm-svn: 301314
* [SelectionDAG] Pull out repeated getValueType calls. NFCI.Simon Pilgrim2017-04-251-6/+6
| | | | | | Noticed in D32391. llvm-svn: 301308
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-8/+8
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301234
* Revert r301231: Accidentally committed stale filesKrzysztof Parzyszek2017-04-241-7/+7
| | | | | | I forgot to commit local changes before commit. llvm-svn: 301232
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-7/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301231
* CodeGen: Add a hook for getFenceOperandTyYaxun Liu2017-04-241-2/+2
| | | | | | | | | | | | | | Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0. This is fine for most targets. However for amdgcn target, the size of pointer in address space 0 depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is 32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target triple environment. Therefore a hook is need in target lowering for getting the fence operand type. This patch has no effect on targets other than amdgcn. Differential Revision: https://reviews.llvm.org/D32186 llvm-svn: 301215
* CodeGen: Let frame index value type match alloca addr spaceYaxun Liu2017-04-201-5/+5
| | | | | | | | | | | | | | | | | | | | | | Recently alloca address space has been added to data layout. Due to this change, pointer returned by alloca may have different size as pointer in address space 0. However, currently the value type of frame index is assumed to be of the same size as pointer in address space 0. This patch fixes that. Most targets assume alloca returning pointer in address space 0, which is the default alloca address space. Therefore it is NFC for them. AMDGCN target with amdgiz environment requires this change since it assumes alloca returning pointer to addr space 5 and its size is 32, which is different from the size of pointer in addr space 0 which is 64. Differential Revision: https://reviews.llvm.org/D32021 llvm-svn: 300864
* PR32382: Fix emitting complex DWARF expressions.Adrian Prantl2017-04-181-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The DWARF specification knows 3 kinds of non-empty simple location descriptions: 1. Register location descriptions - describe a variable in a register - consist of only a DW_OP_reg 2. Memory location descriptions - describe the address of a variable 3. Implicit location descriptions - describe the value of a variable - end with DW_OP_stack_value & friends The existing DwarfExpression code is pretty much ignorant of these restrictions. This used to not matter because we only emitted very short expressions that we happened to get right by accident. This patch makes DwarfExpression aware of the rules defined by the DWARF standard and now chooses the right kind of location description for each expression being emitted. This would have been an NFC commit (for the existing testsuite) if not for the way that clang describes captured block variables. Based on how the previous code in LLVM emitted locations, DW_OP_deref operations that should have come at the end of the expression are put at its beginning. Fixing this means changing the semantics of DIExpression, so this patch bumps the version number of DIExpression and implements a bitcode upgrade. There are two major changes in this patch: I had to fix the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the *address* of a variable as the first argument, even if the argument is not an alloca. When lowering a DBG_VALUE, the decision of whether to emit a register location description or a memory location description depends on the MachineLocation — register machine locations may get promoted to memory locations based on their DIExpression. (Future) optimization passes that want to salvage implicit debug location for variables may do so by appending a DW_OP_stack_value. For example: DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8 DBG_VALUE, RAX --> DW_OP_reg0 +0 DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0 All testcases that were modified were regenerated from clang. I also added source-based testcases for each of these to the debuginfo-tests repository over the last week to make sure that no synchronized bugs slip in. The debuginfo-tests compile from source and run the debugger. https://bugs.llvm.org/show_bug.cgi?id=32382 <rdar://problem/31205000> Differential Revision: https://reviews.llvm.org/D31439 llvm-svn: 300522
* [IR] Make paramHasAttr to use arg indices instead of attr indicesReid Kleckner2017-04-141-4/+3
| | | | | | | | | This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367
* Revert "[SelectionDAG] Enable target specific vector scalarization of calls ↵Simon Dardis2017-04-071-172/+60
| | | | | | | | | | | | | and returns" This reverts commit r299766. This change appears to have broken the MIPS buildbots. Reverting while I investigate. Revert "[mips] Remove usage of debug only variable (NFC)" This reverts commit r299769. Follow up commit. llvm-svn: 299788
* [SelectionDAG] Enable target specific vector scalarization of calls and returnsSimon Dardis2017-04-071-60/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM to scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 299766
* [SDAG] Remove -enable-fmf-dagAdam Nemet2017-03-281-12/+7
| | | | | | | This is no longer needed as spotted by Sanjay in https://reviews.llvm.org/D31165. llvm-svn: 298963
* [SDAG] Add AllowContract to SNodeFlagsAdam Nemet2017-03-281-0/+1
| | | | | | | | | | | Properly propagate the FMF from the LLVM IR to this flag. This is toward moving fp-contraction=fast from an LLVM TargetOption to a FastMathFlag in order to fix PR25721. Differential Revision: https://reviews.llvm.org/D31165 llvm-svn: 298961
* [x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equalitySanjay Patel2017-03-281-8/+8
| | | | | | | Follow-up to: https://reviews.llvm.org/rL298775 llvm-svn: 298933
* [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equalitySanjay Patel2017-03-251-33/+44
| | | | | | | | | This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster. Differential Revision: https://reviews.llvm.org/D31290 llvm-svn: 298775
* Rename AttributeSet to AttributeListReid Kleckner2017-03-211-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393
* Make library calls sensitive to regparm module flag (Fixes PR3997).Nirav Dave2017-03-181-2/+2
| | | | | | | | | | Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179
* Capitalize ArgListEntry fields. NFC.Nirav Dave2017-03-181-39/+37
| | | | llvm-svn: 298178
* Remove getArgumentList() in favor of arg_begin(), args(), etcReid Kleckner2017-03-161-1/+1
| | | | | | | | | | | | | | | | | Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time. In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()). Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D31052 llvm-svn: 298010
* [DAG] fix typo in comment; NFCSanjay Patel2017-03-061-1/+1
| | | | llvm-svn: 297011
* [DAG] early exit to improve readability and formatting of visitMemCmpCall(); ↵Sanjay Patel2017-03-021-64/+53
| | | | | | NFCI llvm-svn: 296824
* [DAG] improve documentation comments; NFCSanjay Patel2017-03-021-47/+14
| | | | llvm-svn: 296808
* fix typo in comment; NFCSanjay Patel2017-03-021-1/+1
| | | | llvm-svn: 296760
* Elide argument copies during instruction selectionReid Kleckner2017-03-011-5/+211
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Avoids tons of prologue boilerplate when arguments are passed in memory and left in memory. This can happen in a debug build or in a release build when an argument alloca is escaped. This will dramatically affect the code size of x86 debug builds, because X86 fast isel doesn't handle arguments passed in memory at all. It only handles the x86_64 case of up to 6 basic register parameters. This is implemented by analyzing the entry block before ISel to identify copy elision candidates. A copy elision candidate is an argument that is used to fully initialize an alloca before any other possibly escaping uses of that alloca. If an argument is a copy elision candidate, we set a flag on the InputArg. If the the target generates loads from a fixed stack object that matches the size and alignment requirements of the alloca, the SelectionDAG builder will delete the stack object created for the alloca and replace it with the fixed stack object. The load is left behind to satisfy any remaining uses of the argument value. The store is now dead and is therefore elided. The fixed stack object is also marked as mutable, as it may now be modified by the user, and it would be invalid to rematerialize the initial load from it. Supersedes D28388 Fixes PR26328 Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29668 llvm-svn: 296683
* [SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where ↵Craig Topper2017-02-151-46/+24
| | | | | | | | | | | | | | | | | | | inputs are larger than the mask Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in a extract and creates a properly aligned start index for the extract. This range finding is unnecessary, we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each indice we can't use an extract. Reviewers: zvi, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29926 llvm-svn: 295152
* swiftcc: Don't emit tail calls from callers with swifterror parametersArnold Schwaighofer2017-02-131-0/+9
| | | | | | | | | Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982
* [SelectionDAG] Fix bugs in inverted condition splitting code.Geoff Berry2017-02-091-4/+6
| | | | | | | | | | | | | | | | | | Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization. Reviewers: bogner, MatzeB, qcolombet Subscribers: mcrosier, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D29680 llvm-svn: 294605
* [SDAGISel] Simplify some SDAGISel code, NFCReid Kleckner2017-02-071-23/+23
| | | | | | | | | | | | Hoist entry block code for arguments and swift error values out of the basic block instruction selection loop. Lowering arguments once up front seems much more readable than doing it conditionally inside the loop. It also makes it clear that argument lowering can update StaticAllocaMap because no instructions have been selected yet. Also use range-based for loops where possible. llvm-svn: 294329
OpenPOWER on IntegriCloud