summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* PR15868 fix.Stepan Dyatkovskiy2013-05-201-6/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduction: In case when stack alignment is 8 and GPRs parameter part size is not N*8: we add padding to GPRs part, so part's last byte must be recovered at address K*8-1. We need to do it, since remained (stack) part of parameter starts from address K*8, and we need to "attach" "GPRs head" without gaps to it: Stack: |---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes... [ [padding] [GPRs head] ] [ ------ Tail passed via stack ------ ... FIX: Note, once we added padding we need to correct *all* Arg offsets that are going after padded one. That's why we need this fix: Arg offsets were never corrected before this patch. See new test-cases included in patch. We also don't need to insert padding for byval parameters that are stored in GPRs only. We need pad only last byval parameter and only in case it outsides GPRs and stack alignment = 8. Though, stack area, allocated for recovered byval params, must satisfy "Size mod 8 = 0" restriction. This patch reduces stack usage for some cases: We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be "packed" with alignment 4 in some cases. llvm-svn: 182237
* Replace some bit operations with simpler ones. No functionality change.Benjamin Kramer2013-05-191-9/+7
| | | | llvm-svn: 182226
* Add LLVMContext argument to getSetCCResultTypeMatt Arsenault2013-05-181-1/+1
| | | | llvm-svn: 182180
* ARM ISel: Don't create illegal types during LowerMULArnold Schwaighofer2013-05-141-25/+32
| | | | | | | | | | | | | | | | | The transformation happening here is that we want to turn a "mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have to make sure that X still has a valid vector type - possibly recreate an extension to a smaller type. In case of a extload of a memory type smaller than 64 bit we used create a ext(load()). The problem with doing this - instead of recreating an extload - is that an illegal type is exposed. This patch fixes this by creating extloads instead of ext(load()) sequences. Fixes PR15970. radar://13871383 llvm-svn: 181842
* Correctly preserve the input chain for potential tailcall nodes whoseLang Hames2013-05-131-1/+1
| | | | | | | | | | | | return values are bitcasts. The chain had previously been being clobbered with the entry node to the dag, which sometimes caused other code in the function to be erroneously deleted when tailcall optimization kicked in. <rdar://problem/13827621> llvm-svn: 181696
* For r181148: fixed warning 'enumeral and non-enumeral type in conditional ↵Stepan Dyatkovskiy2013-05-081-1/+1
| | | | | | expression'. llvm-svn: 181437
* For ARM backend, fixed "byval" attribute support.Stepan Dyatkovskiy2013-05-051-33/+102
| | | | | | | | | | | | | | | | | | | | | | | Now even the small structures could be passed within byval (small enough to be stored in GPRs). In regression tests next function prototypes are checked: PR15293: %artz = type { i32 } define void @foo(%artz* byval %s) define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2) foo: "s" stored in R0 foo2: "s" stored in R0, "s2" stored in R2. Next AAPCS rules are checked: 5.5 Parameters Passing, C.4 and C.5, "ParamSize" is parameter size in 32bit words: -- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4. Parameter should be sent to the stack; NCRN := R4. -- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4. Parameter stored in GPRs; NCRN += ParamSize. llvm-svn: 181148
* Refactoring patch.Stepan Dyatkovskiy2013-04-301-41/+70
| | | | | | | | | | | | 1. VarArgStyleRegisters: functionality that emits "store" instructions for byval regs moved out into separated method "StoreByValRegs". Before this patch VarArgStyleRegisters had confused use-cases. It was used for both variadic functions and for regular functions with byval parameters. In last case it created new stack-frame and registered it as VarArg frame, that is wrong. This patch replaces VarArgsStyleRegisters usage for byval parameters with StoreByValRegs method. 2. In ARMMachineFunctionInfo, "get/setVarArgsRegSaveSize" was renamed to "get/setArgRegsSaveSize". By the same reason. Sometimes it was used for variadic functions, and sometimes for byval parameters in regular functions. Actually, this property means the size of registers, that keeps arguments, and thats why it was renamed. 3. In ARMISelLowering.cpp, ARMTargetLowering class, in methods computeRegArea and StoreByValRegs, VARegXXXXXX was renamed to ArgRegsXXXXXX still by the same reasons. llvm-svn: 180774
* Add more tests for r179925 to verify correct handling of signext/zeroext; ↵Stephen Lin2013-04-231-3/+6
| | | | | | strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change) llvm-svn: 180138
* Lowercase "is" boolean variable prefix for consistency within function, no ↵Stephen Lin2013-04-231-12/+12
| | | | | | functionality change. llvm-svn: 180136
* Fix for 5.5 Parameter Passing --> Stage C:Stepan Dyatkovskiy2013-04-221-0/+1
| | | | | | | | | | | | | | | -- C.4 and C.5 statements, when NSAA is not equal to SP. -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a variadic procedure. Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are some exceptions in AAPCS. 1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs. 2. Check that for VA functions all params uses GPRs and then stack. No exceptions, no CPRCs here. llvm-svn: 180011
* Remove unused ShouldFoldAtomicFences flag.Tim Northover2013-04-201-2/+0
| | | | | | | | I think it's almost impossible to fold atomic fences profitably under LLVM/C++11 semantics. As a result, this is now unused and just cluttering up the target interface. llvm-svn: 179940
* Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.Tim Northover2013-04-201-32/+0
| | | | llvm-svn: 179939
* Add CodeGen support for functions that always return arguments via a new ↵Stephen Lin2013-04-201-5/+28
| | | | | | parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). llvm-svn: 179925
* Test commitStephen Lin2013-04-201-1/+1
| | | | llvm-svn: 179913
* Remove the old CodePlacementOpt pass.Benjamin Kramer2013-03-291-2/+0
| | | | | | It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1. llvm-svn: 178349
* Improve long vector sext/zext lowering on ARMRenato Golin2013-03-191-0/+55
| | | | | | | | | | | | | The ARM backend currently has poor codegen for long sext/zext operations, such as v8i8 -> v8i32. This patch addresses this by performing a custom expansion in ARMISelLowering. It also adds/changes the cost of such lowering in ARMTTI. This partially addresses PR14867. Patch by Pete Couperus llvm-svn: 177380
* ARM: Creating a vector from a lane of another.Jim Grosbach2013-03-021-2/+5
| | | | | | | | | | | The VDUP instruction source register doesn't allow a non-constant lane index, so make sure we don't construct a ARM::VDUPLANE node asking it to do so. rdar://13328063 http://llvm.org/bugs/show_bug.cgi?id=13963 llvm-svn: 176413
* Clean up code format a bit.Jim Grosbach2013-03-021-4/+2
| | | | llvm-svn: 176412
* Tidy up. Trailing whitespace.Jim Grosbach2013-03-021-7/+7
| | | | llvm-svn: 176411
* ARM NEON: Fix v2f32 float intrinsicsArnold Schwaighofer2013-03-021-0/+18
| | | | | | Mark them as expand, they are not legal as our backend does not match them. llvm-svn: 176410
* Add support for using non-pic code for arm and thumb1 when emitting the sjljChad Rosier2013-03-011-10/+21
| | | | | | | | dispatch code. As far as I can tell the thumb2 code is behaving as expected. I was able to compile and run the associated test case for both arm and thumb1. rdar://13066352 llvm-svn: 176363
* Tidy up; no functional change.Chad Rosier2013-02-281-5/+3
| | | | llvm-svn: 176288
* Style; no functional change.Chad Rosier2013-02-281-7/+4
| | | | llvm-svn: 176285
* ARM: FMA is legal only if VFP4 is available.Jim Grosbach2013-02-271-0/+6
| | | | | | rdar://13306723 llvm-svn: 176212
* Remove this instance of dl as it's defined in a previous scope.Chad Rosier2013-02-271-1/+0
| | | | llvm-svn: 176208
* Update TargetLowering ivars for name policy.Jim Grosbach2013-02-201-8/+8
| | | | | | | | | | | http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly ivars should be camel-case and start with an upper-case letter. A few in TargetLowering were starting with a lower-case letter. No functional change intended. llvm-svn: 175667
* Test commit. Fixed typo.David Peixotto2013-02-131-1/+1
| | | | llvm-svn: 175020
* ARM NEON: Handle v16i8 and v8i16 reverse shufflesArnold Schwaighofer2013-02-121-1/+37
| | | | | | | | | | | | | | | Lower reverse shuffles to a vrev64 and a vext instruction instead of the default legalization of storing and loading to the stack. This is important because we generate reverse shuffles in the loop vectorizer when we reverse store to an array. uint8_t Arr[N]; for (i = 0; i < N; ++i) Arr[N - i - 1] = ... radar://13171760 llvm-svn: 174929
* Move MRI liveouts to ARM return instructions.Jakob Stoklund Olesen2013-02-051-13/+11
| | | | llvm-svn: 174406
* Add a special ARM trap encoding for NaCl.Eli Bendersky2013-01-301-2/+10
| | | | | | | | More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html Patch by JF Bastien llvm-svn: 173943
* Fix 64-bit atomic operations in Thumb mode.Tim Northover2013-01-291-74/+46
| | | | | | | | The ARM and Thumb variants of LDREXD and STREXD have different constraints and take different operands. Previously the code expanding atomic operations didn't take this into account and asserted in Thumb mode. llvm-svn: 173780
* Teach SDISel to combine fsin / fcos into a fsincos node if the followingEvan Cheng2013-01-291-0/+2
| | | | | | | | | | | | | | | | | | conditions are met: 1. They share the same operand and are in the same BB. 2. Both outputs are used. 3. The target has a native instruction that maps to ISD::FSINCOS node or the target provides a sincos library call. Implemented the generic optimization in sdisel and enabled it for Mac OSX. Also added an additional optimization for x86_64 Mac OSX by using an alternative entry point __sincos_stret which returns the two results in xmm0 / xmm1. rdar://13087969 PR13204 llvm-svn: 173755
* Fixed the condition codes for the atomic64 min/umin code generation on ARM. ↵Silviu Baranga2013-01-251-2/+2
| | | | | | If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation. llvm-svn: 173437
* Switch TargetTransformInfo from an immutable analysis pass that requiresChandler Carruth2013-01-071-32/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis *pass*, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it *always* being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-8/+8
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Remove the Function::getFnAttributes method in favor of using the AttributeSetBill Wendling2012-12-301-5/+7
| | | | | | | | | directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253
* Revert "Adding support for llvm.arm.neon.vaddl[su].* and"Bob Wilson2012-12-201-18/+0
| | | | | | | This reverts r170694. The operations can be represented in IR without adding any new intrinsics. llvm-svn: 170765
* Adding support for llvm.arm.neon.vaddl[su].* andRenato Golin2012-12-201-0/+18
| | | | | | | | llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> llvm-svn: 170694
* Remove the explicit MachineInstrBuilder(MI) constructor.Jakob Stoklund Olesen2012-12-191-1/+1
| | | | | | | | | | | | | Use the version that also takes an MF reference instead. It would technically be possible to extract an MF reference from the MI as MI->getParent()->getParent(), but that would not work for MIs that are not inserted into any basic block. Given the reasonably small number of places this constructor was used at all, I preferred the compile time check to a run time assertion. llvm-svn: 170588
* Change TargetLowering::findRepresentativeClass to take an MVT, insteadPatrik Hagglund2012-12-191-2/+2
| | | | | | of EVT. llvm-svn: 170532
* Rename the 'Attributes' class to 'Attribute'. It's going to represent a ↵Bill Wendling2012-12-191-3/+3
| | | | | | single attribute in the future. llvm-svn: 170502
* Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.Patrik Hagglund2012-12-131-1/+1
| | | | | | | | | | | | Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. This is the second attempt. In the first attempt (r169837), a few getSimpleVT() were hoisted too far, detected by bootstrap failures. llvm-svn: 170104
* Sorry about the churn. One more change to getOptimalMemOpType() hook. Did IEvan Cheng2012-12-121-2/+2
| | | | | | | | | | | | mention the inline memcpy / memset expansion code is a mess? This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The later is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. llvm-svn: 169959
* - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.Evan Cheng2012-12-121-6/+2
| | | | | | | | | Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. llvm-svn: 169954
* Avoid using lossy load / stores for memcpy / memset expansion. e.g.Evan Cheng2012-12-121-0/+4
| | | | | | f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944
* Replace TargetLowering::isIntImmLegal() withEvan Cheng2012-12-111-18/+33
| | | | | | | | | ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined term for something like integer immediate materialization. It is always possible to materialize an integer immediate. Whether to use it for memcpy expansion is more a "cost" conceern. llvm-svn: 169929
* Revert EVT->MVT changes, r169836-169851, due to buildbot failures.Patrik Hagglund2012-12-111-3/+3
| | | | llvm-svn: 169854
* Change TargetLowering::findRepresentativeClass to take an MVT, insteadPatrik Hagglund2012-12-111-2/+2
| | | | | | of EVT. llvm-svn: 169845
* Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.Patrik Hagglund2012-12-111-1/+1
| | | | | | | | | Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. llvm-svn: 169837
OpenPOWER on IntegriCloud