summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-3/+2
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* [ARM] Enable DP copy, load and store instructions for FPv4-SPOliver Stannard2014-08-211-11/+8
| | | | | | | | | | | | | | | | | The FPv4-SP floating-point unit is generally referred to as single-precision only, but it does have double-precision registers and load, store and GPR<->DPR move instructions which operate on them. This patch enables the use of these registers, the main advantage of which is that we now comply with the AAPCS-VFP calling convention. This partially reverts r209650, which added some AAPCS-VFP support, but did not handle return values or alignment of double arguments in registers. This patch also adds tests for Thumb2 code generation for floating-point instructions and intrinsics, which previously only existed for ARM. llvm-svn: 216172
* Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG ↵Jiangning Liu2014-08-211-26/+2
| | | | | | Builder and type". llvm-svn: 216147
* Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and typeJiangning Liu2014-08-201-2/+26
| | | | | | | | legalization stage. With those two optimizations, fewer signed/zero extension instructions can be inserted, and then we can expose more opportunities to Machine CSE pass in back-end. llvm-svn: 216066
* [PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsicHal Finkel2014-08-131-1/+1
| | | | | | | | | | | | | | | | | This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store intrinsics. As with the construction of the MachineMemOperands for the intrinsic calls used for unaligned load/store lowering, the only slight complication is that we need to represent a larger memory range than the loaded/stored value-type size (because the address is rounded down to an aligned address, and we need to conservatively represent the entire possible range of the actual access). This required adding an extra size field to TargetLowering::IntrinsicInfo, and this was done in a way that required no modifications to other targets (the size defaults to the store size of the provided memory data type). This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed). llvm-svn: 215512
* [pr19635] Revert most of r170537, and add new testcase.Patrik Hagglund2014-08-081-1/+1
| | | | | | | | Patch provided by Andrey Kuharev. Sorry, r170537 was obviously wrong. llvm-svn: 215190
* [stack protector] Look through bitcasts to get global variableAkira Hatanaka2014-08-071-9/+19
| | | | | | | | | | | | | | | | __stack_chk_guard. Handle the case where the pointer operand of the load instruction that loads the stack guard is not a global variable but instead a bitcast. %StackGuard = load i8** bitcast (i64** @__stack_chk_guard to i8**) call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot) Original test case provided by Ana Pazos. This fixes PR20558. llvm-svn: 215167
* Temporarily Revert "Nuke the old JIT." as it's not quite ready toEric Christopher2014-08-071-2/+3
| | | | | | | | | | | be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154
* Nuke the old JIT.Rafael Espindola2014-08-071-3/+2
| | | | | | | | | I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111
* Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher2014-08-051-3/+2
| | | | | | | | | | | shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-76/+103
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* Add alignment value to allowsUnalignedMemoryAccessMatt Arsenault2014-07-271-2/+3
| | | | | | | | | | Rename to allowsMisalignedMemoryAccess. On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment, and don't need to be split into multiple accesses. Vector loads with an alignment of the element type are not uncommon in OpenCL code. llvm-svn: 214055
* Add @llvm.assume, lowering, and some basic propertiesHal Finkel2014-07-251-1/+2
| | | | | | | | | | | | | | | | | This is the first commit in a series that add an @llvm.assume intrinsic which can be used to provide the optimizer with a condition it may assume to be true (when the control flow would hit the intrinsic call). Some basic properties are added here: - llvm.invariant(true) is dead. - llvm.invariant(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef). The intrinsic is tagged as writing arbitrarily, in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query so that we don't unnecessarily block code motion. llvm-svn: 213973
* [stack protector] Fix a potential security bug in stack protector where theAkira Hatanaka2014-07-251-5/+42
| | | | | | | | | | | | | | address of the stack guard was being spilled to the stack. Previously the address of the stack guard would get spilled to the stack if it was impossible to keep it in a register. This patch introduces a new target independent node and pseudo instruction which gets expanded post-RA to a sequence of instructions that load the stack guard value. Register allocator can now just remat the value when it can't keep it in a register. <rdar://problem/12475629> llvm-svn: 213967
* AA metadata refactoring (introduce AAMDNodes)Hal Finkel2014-07-241-5/+9
| | | | | | | | | | | | | | | | | | | | In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode*). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode* in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859
* CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpextTim Northover2014-07-211-4/+7
| | | | | | | | | | | | | | | | | This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc, and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation path is now uniform, regardless of the input IR: fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall fpext -> FP16_TO_FP (if f16 illegal) -> libcall Each target should be able to select the version that best matches its operations and not be required to duplicate patterns for both fptrunc and FP_TO_FP16 (for example). As a result we can remove some redundant AArch64 patterns. llvm-svn: 213507
* CodeGen: extend f16 conversions to permit types > float.Tim Northover2014-07-171-3/+4
| | | | | | | | | | | | | | | | | | | This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. llvm-svn: 213248
* Remove TLI from isInTailCallPosition's arguments. NFC.Juergen Ributzka2014-07-161-1/+1
| | | | | | | There is no need to pass on TLI separately to the function. As Eric pointed out the Target Machine already provides everything we need. llvm-svn: 213108
* [FastISel] Make isInTailCallPosition independent of SelectionDAG.Juergen Ributzka2014-07-111-1/+1
| | | | | | | Break out the arguemnts required from SelectionDAG, so that this function can also be used by FastISel. llvm-svn: 212844
* Fix ppcf128 component access on little-endian systemsUlrich Weigand2014-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274
* [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.Juergen Ributzka2014-07-011-3/+3
| | | | | | | | The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135
* DAG: move sret demotion into most basic LowerCallTo implementation.Tim Northover2014-06-181-119/+124
| | | | | | | | | | | | | | | | It looks like there are two versions of LowerCallTo here: the SelectionDAGBuilder one is designed to operate on LLVM IR, and the TargetLowering one in the case where everything is at DAG level. Previously, only the SelectionDAGBuilder variant could handle demoting an impossible return to sret semantics (before delegating to the TargetLowering version), but this functionality is also useful for certain libcalls (e.g. 128-bit operations on 32-bit x86). So this commit moves the sret handling down a level. rdar://problem/17242889 llvm-svn: 211155
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-131-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
* Have isInTailCallPosition take the DAG so that we can use theEric Christopher2014-06-101-1/+1
| | | | | | | version of TargetLowering/Machine from there on the way to avoiding TargetMachine in TargetLowering. llvm-svn: 210579
* [DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG.Andrea Di Biagio2014-06-091-4/+35
| | | | | | | | | | | | | This patch modifies SelectionDAGBuilder to construct SDNodes with associated NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator instructions. Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing nsw/nuw/exact flags during codegen. Patch by Marcello Maggioni. llvm-svn: 210467
* SelectionDAG: skip barriers for unordered atomic operationsTim Northover2014-05-301-2/+2
| | | | | | | | | Unordered is strictly weaker than monotonic, so if the latter doesn't have any barriers then the former certainly shouldn't. rdar://problem/16548260 llvm-svn: 209901
* ARM: teach AAPCS-VFP to deal with Cortex-M4.Tim Northover2014-05-271-8/+11
| | | | | | | | | | | Cortex-M4 only has single-precision floating point support, so any LLVM "double" type will have been split into 2 i32s by now. Fortunately, the consecutive-register framework turns out to be precisely what's needed to reconstruct the double and follow AAPCS-VFP correctly! rdar://problem/17012966 llvm-svn: 209650
* Target: remove old constructors for CallLoweringInfoSaleem Abdulrasool2014-05-171-15/+15
| | | | | | | | | | This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082
* Target: change member from reference to pointerSaleem Abdulrasool2014-05-171-1/+1
| | | | | | | | | This is a preliminary step to help ease the construction of CallLoweringInfo. Changing the construction to a chained function pattern requires that the parameter be nullable. However, rather than copying the vector, save a pointer rather than the reference to permit a late binding of the arguments. llvm-svn: 209080
* ARM: HFAs must be passed in consecutive registersOliver Stannard2014-05-091-2/+22
| | | | | | | | | | When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must be passed in a block of consecutive floating-point registers, or on the stack. This means that unused floating-point registers cannot be back-filled with part of an HFA, however this can currently happen. This patch, along with the corresponding clang patch (http://reviews.llvm.org/D3083) prevents this. llvm-svn: 208413
* Implememting named register intrinsicsRenato Golin2014-05-061-0/+16
| | | | | | | | | | | This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104
* Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I ↵Craig Topper2014-04-301-6/+5
| | | | | | introduced most of these recently. llvm-svn: 207616
* Convert SelectionDAG::getMergeValues to use ArrayRef.Craig Topper2014-04-271-6/+5
| | | | llvm-svn: 207374
* Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer ↵Craig Topper2014-04-261-4/+2
| | | | | | and size. llvm-svn: 207329
* Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.Craig Topper2014-04-261-66/+44
| | | | llvm-svn: 207327
* This reapplies r207235 with an additional bugfixes caught by the msanAdrian Prantl2014-04-251-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | buildbot - do not insert debug intrinsics before phi nodes. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207269
* Revert "This reapplies r207130 with an additional testcase+and a missing ↵Adrian Prantl2014-04-251-19/+20
| | | | | | | | check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250
* This reapplies r207130 with an additional testcase+and a missing check forAdrian Prantl2014-04-251-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207235
* Revert "This reapplies r207130 with an additional testcase+and a missing ↵Adrian Prantl2014-04-251-19/+20
| | | | | | | | check for" Typo in testcase. llvm-svn: 207166
* This reapplies r207130 with an additional testcase+and a missing check forAdrian Prantl2014-04-251-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207165
* Revert "Debug info for optimized code: Support variables that are on the ↵Adrian Prantl2014-04-251-14/+13
| | | | | | | | stack and" This reverts commit 207130 for buildbot breakage. llvm-svn: 207162
* Debug info for optimized code: Support variables that are on the stack andAdrian Prantl2014-04-241-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine-intrinsics testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207130
* [Modules] Remove potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
* Fix unnecessary line breakMatt Arsenault2014-04-211-4/+2
| | | | llvm-svn: 206772
* Patch by Vadim ChugunovYaron Keren2014-04-191-0/+5
| | | | | | | | | | | Win64 stack unwinder gets confused when execution flow "falls through" after a call to 'noreturn' function. This fixes the "missing epilogue" problem by emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows. A secondary use for it would be for anyone wanting to make double-sure that 'noreturn' functions, indeed, do not return. llvm-svn: 206684
* Convert SelectionDAG::getVTList to use ArrayRefCraig Topper2014-04-161-10/+10
| | | | llvm-svn: 206357
* Break PseudoSourceValue out of the Value hierarchy. It is now the root of ↵Nick Lewycky2014-04-151-5/+13
| | | | | | its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead. llvm-svn: 206255
* [C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵Craig Topper2014-04-141-117/+117
| | | | | | instead of comparing to nullptr. llvm-svn: 206142
* Fix for PR 19261:Eric Christopher2014-04-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | llc doesn't generate nodes for unconditional fall-through branches for targets without FastISel implementation (X86 has it, but can be disabled by "-fast-isel=false") in SelectionDAGBuilder::visitBr(). So for line 4 in the following testcase 1: void foo(int i){ 2: switch(i){ 3: default: 4: break; 5: } 6: return; 7: } there is no corresponding line in .debug_line section, and a debugger cannot set a breakpoint at line 4. Fix this by always emitting a branch when we're not optimizing and add a testcase to ensure that there's code on every line we'd want to break. Patch by Daniil Fukalov. llvm-svn: 205529
* Avoid storing Twines.Benjamin Kramer2014-03-291-22/+19
| | | | | | While there nested ifs into a helper function. No functionality change. llvm-svn: 205108
OpenPOWER on IntegriCloud