path: root/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
Commit message | Author | Age | Files | Lines
...
* The C++ exception handling personality function wants | Duncan Sands | 2007-12-19 | 1 | -2/+6
  to know about calls that cannot throw ('nounwind'): if such a call does throw for some reason then the personality will terminate the program. The distinction between an ordinary call and a nounwind call is that an ordinary call gets an entry in the exception table but a nounwind call does not. This patch sets up the exception table appropriately.

  One oddity is that I've chosen to bracket nounwind calls with labels (like invokes) - the other choice would have been to bracket ordinary calls with labels. While bracketing ordinary calls is more natural (because bracketing by labels would then correspond exactly to getting an entry in the exception table), I didn't do it because introducing labels impedes some optimizations and I'm guessing that ordinary calls occur more often than nounwind calls.

  This fixes the gcc filter2 eh test, at least at -O0 (the inliner needs some tweaking at higher optimization levels).
  llvm-svn: 45197
* Make invokes of inline asm legal. Teach codegen | Duncan Sands | 2007-12-17 | 1 | -17/+20
  how to lower them (with no attempt made to be efficient, since they should only occur for unoptimized code).
  llvm-svn: 45108
* Rather than having special rules like "intrinsics cannot | Duncan Sands | 2007-12-03 | 1 | -21/+3
  throw exceptions", just mark intrinsics with the nounwind attribute. Likewise, mark intrinsics as readnone/readonly and get rid of special aliasing logic (which didn't use anything more than this anyway).
  llvm-svn: 44544
* Add some convenience methods for querying attributes, and | Duncan Sands | 2007-11-28 | 1 | -12/+11
  use them.
  llvm-svn: 44403
* Fix PR1146: parameter attributes are no longer part of | Duncan Sands | 2007-11-27 | 1 | -12/+10
  the function type, instead they belong to functions and function calls. This is an updated and slightly corrected version of Reid Spencer's original patch. The only known problem is that auto-upgrading of bitcode files doesn't seem to work properly (see test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully a bitcode guru (who might that be? :) ) will fix it.
  llvm-svn: 44359
* err, no really. | Chris Lattner | 2007-11-27 | 1 | -1/+1
  llvm-svn: 44352
* don't depend on ADL. | Chris Lattner | 2007-11-27 | 1 | -1/+1
  llvm-svn: 44351
* Several changes: | Chris Lattner | 2007-11-24 | 1 | -9/+1
  1) Change the interface to TargetLowering::ExpandOperationResult to take and return entire NODES that need a result expanded, not just the value. This allows us to handle things like READCYCLECOUNTER, which returns two values.
  2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
  3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new ExpandOperationResult. This makes the result simpler and fully general.
  4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
  5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM i64 shifts, allowing them to work with LegalizeDAGTypes.
  6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT, allowing them to work with LegalizeDAGTypes.

  LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when type legalization in LegalizeDAG is ifdef'd out.
  llvm-svn: 44300
* Implement necessary bits for flt_rounds gcc builtin. | Anton Korobeynikov | 2007-11-15 | 1 | -0/+4
  Codegen bits and llvm-gcc support will follow.
  llvm-svn: 44182
* This assertion was bogus. | Duncan Sands | 2007-11-15 | 1 | -3/+2
  llvm-svn: 44167
* Make labels work in asm blocks; allow labels as | Dale Johannesen | 2007-11-05 | 1 | -22/+28
  parameters. Rename ValueRefList to ParamList in AsmParser, since its only use is for parameters.
  llvm-svn: 43734
* Add std:: to sort calls. | Dan Gohman | 2007-11-02 | 1 | -1/+1
  llvm-svn: 43652
* Change illegal uses of ++ to uses of STLExtras.h's next function. | Dan Gohman | 2007-11-02 | 1 | -1/+1
  llvm-svn: 43651
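A minimal sketch of why `++` on a temporary iterator is a problem and what a `next`-style helper does instead. The commit refers to LLVM's own `next` helper; the sketch substitutes C++11 `std::next` as a stand-in, and `secondElement` is a hypothetical function invented for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Returns the second element of V.
// Writing *++V.begin() applies ++ to a temporary. If the iterator type is a
// plain pointer, that is ill-formed, because built-in ++ needs a modifiable
// lvalue. std::next takes the iterator by value and returns the advanced
// copy, so it works for every iterator type.
inline int secondElement(const std::vector<int> &V) {
  assert(V.size() >= 2 && "need at least two elements");
  return *std::next(V.begin());
}
```

The helper keeps the one-expression style of the original code while avoiding any dependence on how the container implements its iterators.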
* Executive summary: getTypeSize -> getTypeStoreSize / getABITypeSize. | Duncan Sands | 2007-11-01 | 1 | -6/+6
  The meaning of getTypeSize was not clear - clarifying it is important now that we have x86 long double and arbitrary precision integers. The issue with long double is that it requires 80 bits, and this is not a multiple of its alignment. This gives a primitive type for which getTypeSize differed from getABITypeSize. For arbitrary precision integers it is even worse: there is the minimum number of bits needed to hold the type (eg: 36 for an i36), the maximum number of bits that will be overwritten when storing the type (40 bits for i36) and the ABI size (i.e. the storage size rounded up to a multiple of the alignment; 64 bits for i36).

  This patch removes getTypeSize (not really - it is still there but deprecated to allow for a gradual transition). Instead there is:

  (1) getTypeSizeInBits - a number of bits that suffices to hold all values of the type. For a primitive type, this is the minimum number of bits. For an i36 this is 36 bits. For x86 long double it is 80. This corresponds to gcc's TYPE_PRECISION.
  (2) getTypeStoreSizeInBits - the maximum number of bits that is written when storing the type (or read when reading it). For an i36 this is 40 bits, for an x86 long double it is 80 bits. This is the size alias analysis is interested in (getTypeStoreSize returns the number of bytes). There doesn't seem to be anything corresponding to this in gcc.
  (3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded up to a multiple of the alignment. For an i36 this is 64, for an x86 long double this is 96 or 128 depending on the OS. This is the spacing between consecutive elements when you form an array out of this type (getABITypeSize returns the number of bytes). This is TYPE_SIZE in gcc.

  Since successive elements in a SequentialType (arrays, pointers and vectors) need to be aligned, the spacing between them will be given by getABITypeSize. This means that the size of an array is the length times the getABITypeSize. It also means that GEP computations need to use getABITypeSize when computing offsets. Furthermore, if an alloca allocates several elements at once then these too need to be aligned, so the size of the alloca has to be the number of elements multiplied by getABITypeSize. Logically speaking this doesn't have to be the case when allocating just one element, but it is simpler to also use getABITypeSize in this case. So allocas and mallocs should use getABITypeSize. Finally, since gcc's only notion of size is that given by getABITypeSize, if you want to output assembler etc the same as gcc then getABITypeSize is the size you want.

  Since a store will overwrite no more than getTypeStoreSize bytes, and a read will read no more than that many bytes, this is the notion of size appropriate for alias analysis calculations.

  In this patch I have corrected all type size uses except some of those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard cases). I will get around to auditing these too at some point, but I could do with some help.

  Finally, I made one change which I think wise but others might consider pointless and suboptimal: in an unpacked struct the amount of space allocated for a field is now given by the ABI size rather than getTypeStoreSize. I did this because every other place that reserves memory for a type (eg: alloca) now uses getABITypeSize, and I didn't want to make an exception for unpacked structs, i.e. I did it to make things more uniform. This only affects structs containing long doubles and arbitrary precision integers. If someone wants to pack these types more tightly they can always use a packed struct.
  llvm-svn: 43620
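The three notions of size can be checked with a little arithmetic. The sketch below is not LLVM's API: the function names are hypothetical, and the alignments (64 bits for i36, 32 or 128 bits for x86 long double) are assumptions inferred from the numbers quoted in the message.

```cpp
#include <cstdint>

// Hypothetical helper: round Value up to a multiple of Align (Align > 0).
constexpr uint64_t roundUpTo(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Bits touched by a store: the bit width rounded up to whole bytes.
constexpr uint64_t storeSizeInBits(uint64_t SizeInBits) {
  return roundUpTo(SizeInBits, 8);
}

// Spacing between array elements: the store size rounded up to the alignment.
constexpr uint64_t abiSizeInBits(uint64_t SizeInBits, uint64_t AlignInBits) {
  return roundUpTo(storeSizeInBits(SizeInBits), AlignInBits);
}

// Size of an array: element count times the ABI size of one element.
constexpr uint64_t arraySizeInBits(uint64_t Count, uint64_t SizeInBits,
                                   uint64_t AlignInBits) {
  return Count * abiSizeInBits(SizeInBits, AlignInBits);
}

// The i36 and long double figures from the commit message.
static_assert(storeSizeInBits(36) == 40, "i36 stores 40 bits");
static_assert(abiSizeInBits(36, 64) == 64, "i36 occupies 64 bits");
static_assert(storeSizeInBits(80) == 80, "long double stores 80 bits");
static_assert(abiSizeInBits(80, 32) == 96, "long double, 32-bit alignment");
static_assert(abiSizeInBits(80, 128) == 128, "long double, 128-bit alignment");
```

Under these assumptions an array of ten i36 values occupies `arraySizeInBits(10, 36, 64)` = 640 bits, even though only 40 bits of each element are ever written by a store.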
* - Remove the hacky code that forces a memcpy. Alignment is taken care of in the FE. | Bill Wendling | 2007-10-26 | 1 | -11/+3
  - Explicitly pass in the alignment of the load & store.
  - XFAIL 2007-10-23-UnalignedMemcpy.ll because llc has a bug that crashes on unaligned pointers.
  llvm-svn: 43398
* Fix comment and use the "Size" variable that's already provided. | Bill Wendling | 2007-10-23 | 1 | -10/+5
  llvm-svn: 43271
* If there's an unaligned memcpy to/from the stack, don't lower it. Just call the | Bill Wendling | 2007-10-23 | 1 | -0/+13
  memcpy library function instead.
  llvm-svn: 43270
* This broke lots. Reverting. | Bill Wendling | 2007-10-23 | 1 | -4/+0
  llvm-svn: 43264
* Lowering a memcpy to the stack is killing PPC. The ARM and X86 backends already | Bill Wendling | 2007-10-23 | 1 | -0/+4
  have their own custom memcpy lowering code. This code needs to be factored out into a target-independent lowering method with hooks to the backend. In the meantime, just call memcpy if we're trying to copy onto a stack.
  llvm-svn: 43262
* rename ExpandOperation to ExpandOperationResult, as suggested | Chris Lattner | 2007-10-19 | 1 | -1/+1
  by Duncan
  llvm-svn: 43177
* Add support for byval function whose argument is not 32 bit aligned. | Rafael Espindola | 2007-10-19 | 1 | -1/+16
  To do this it is necessary to add an "always inline" argument to the memcpy node. For completeness I have also added this node to memmove and memset. I have also added getMem* functions, because the extra argument makes it cumbersome to use getNode and because I get confused by it :-)
  llvm-svn: 43172
* add a new target hook. | Chris Lattner | 2007-10-19 | 1 | -0/+8
  llvm-svn: 43165
* One mundane change: Change ReplaceAllUsesOfValueWith to *optionally* | Chris Lattner | 2007-10-15 | 1 | -0/+4
  take a deleted nodes vector, instead of requiring it.

  One more significant change: Implement the start of a legalizer that just works on types. This legalizer is designed to run before the operation legalizer and ensure just that the input dag is transformed into an output dag whose operand and result types are all legal, even if the operations on those types are not.

  This design/impl has the following advantages:

  1. When finished, this will *significantly* reduce the amount of code in LegalizeDAG.cpp. It will remove all the code related to promotion and expansion as well as splitting and scalarizing vectors.
  2. The new code is very simple, idiomatic, and modular: unlike LegalizeDAG.cpp, it has no 3000 line long functions. :)
  3. The implementation is completely iterative instead of recursive, good for hacking on large dags without blowing out your stack.
  4. The implementation updates nodes in place when possible instead of deallocating and reallocating the entire graph that points to some mutated node.
  5. The code nicely separates out handling of operations with invalid results from operations with invalid operands, making some cases simpler and easier to understand.
  6. The new -debug-only=legalize-types option is very very handy :), allowing you to easily understand what legalize types is doing.

  This is not yet done. Until the ifdef added to SelectionDAGISel.cpp is enabled, this does nothing. However, this code is sufficient to legalize all of the code in 186.crafty, olden and freebench on an x86 machine. The biggest issues are:

  1. Vectors aren't implemented at all yet
  2. SoftFP is a mess, I need to talk to Evan about it.
  3. No lowering to libcalls is implemented yet.
  4. Various operations are missing etc.
  5. There are FIXME's for stuff I hax0r'd out, like softfp.

  Hey, at least it is a step in the right direction :).

  If you'd like to help, just enable the #ifdef in SelectionDAGISel.cpp and compile code with it. If this explodes it will tell you what needs to be implemented. Help is certainly appreciated.

  Once this goes in, we can do three things:

  1. Add a new pass of dag combine between the "type legalizer" and "operation legalizer" passes. This will let us catch some long-standing isel issues that we miss because operation legalization often obfuscates the dag with target-specific nodes.
  2. We can rip out all of the type legalization code from LegalizeDAG.cpp, making it much smaller and simpler. When that happens we can then reimplement the core functionality left in it in a much more efficient and non-recursive way.
  3. Once the whole legalizer is non-recursive, we can implement whole-function selectiondags maybe...
  llvm-svn: 42981
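A toy sketch of the iterative, worklist-driven style the message describes. It is not the LegalizeDAGTypes code. It assumes a hypothetical target whose only legal integer widths are 8, 16 and 32 bits, promotes narrower values, and expands wider ones by rounding up to a power of two and splitting into halves. The name `legalizeIntegerWidth` is invented for illustration.

```cpp
#include <cassert>
#include <vector>

// Toy model: returns the legal widths that an integer of Bits bits ends up
// as on a hypothetical target where only i8, i16 and i32 are legal.
// An explicit worklist replaces recursion, so deep expansion cannot
// overflow the call stack.
inline std::vector<unsigned> legalizeIntegerWidth(unsigned Bits) {
  assert(Bits > 0 && "integer types have at least one bit");
  std::vector<unsigned> Worklist(1, Bits);
  std::vector<unsigned> Legal;
  while (!Worklist.empty()) {
    unsigned W = Worklist.back();
    Worklist.pop_back();
    if (W <= 32) {
      // Promote: pick the smallest legal width that can hold the value.
      Legal.push_back(W <= 8 ? 8 : W <= 16 ? 16 : 32);
    } else {
      // Expand: round up to a power of two, then split into two halves.
      unsigned Pow2 = 64;
      while (Pow2 < W)
        Pow2 *= 2;
      Worklist.push_back(Pow2 / 2);
      Worklist.push_back(Pow2 / 2);
    }
  }
  return Legal;
}
```

In this model an i1 becomes one i8, an i64 becomes two i32 pieces, and an i128 becomes four. The operations on those pieces may still be illegal, which is the job the message leaves to the later operation legalizer.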
* Corrected many typing errors. And removed 'nest' parameter handling | Arnold Schwaighofer | 2007-10-12 | 1 | -2/+2
  for fastcc from X86CallingConv.td. This means that nested functions are not supported for calling convention 'fastcc'.
  llvm-svn: 42934
* Fix some corner cases with vectors in copyToRegs and copyFromRegs. | Dan Gohman | 2007-10-12 | 1 | -1/+16
  llvm-svn: 42907
* Add intrinsics for sin, cos, and pow. These use llvm_anyfloat_ty, and so | Dan Gohman | 2007-10-12 | 1 | -0/+16
  may be overloaded with vector types. And add a testcase for codegen for these.
  llvm-svn: 42885
* Added tail call optimization to the x86 back end. It can be | Arnold Schwaighofer | 2007-10-11 | 1 | -0/+48
  enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied:
  * caller/callee are fastcc
  * elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden
  llvm-svn: 42870
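The listed conditions can be written as a single predicate. This is a sketch of the rule as stated in the message, not the back end's actual check; the types `CallConv`, `Visibility` and `TailCallQuery` and all field names are hypothetical.

```cpp
enum class CallConv { C, Fast };
enum class Visibility { Default, Hidden, Protected };

// Hypothetical summary of one call site; the field names are invented here.
struct TailCallQuery {
  bool TailCallOptEnabled; // -tailcallopt was passed to llc
  CallConv Caller;
  CallConv Callee;
  bool ElfPic;             // ELF target with PIC enabled
  bool CalleeInModule;
  Visibility CalleeVis;
};

// True when every condition from the commit message holds.
constexpr bool isTailCallEligible(TailCallQuery Q) {
  return Q.TailCallOptEnabled &&
         Q.Caller == CallConv::Fast && Q.Callee == CallConv::Fast &&
         (!Q.ElfPic ||
          (Q.CalleeInModule && (Q.CalleeVis == Visibility::Hidden ||
                                Q.CalleeVis == Visibility::Protected)));
}
```

Note that in the PIC case both extra conditions must hold: a callee in the same module with default visibility is still rejected.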
* In -debug mode, dump SelectionDAGs both before and after the | Dan Gohman | 2007-10-08 | 1 | -1/+7
  optimization passes.
  llvm-svn: 42749
* Rewrite sqrt and powi to use anyfloat. By popular demand. | Dale Johannesen | 2007-10-02 | 1 | -10/+2
  llvm-svn: 42537
* Fix stride computations for long double arrays. | Dale Johannesen | 2007-10-01 | 1 | -2/+2
  llvm-svn: 42508
* Add sqrt and powi intrinsics for long double. | Dale Johannesen | 2007-09-28 | 1 | -0/+6
  llvm-svn: 42423
* Enable codegen for long double abs, sin, cos | Dale Johannesen | 2007-09-26 | 1 | -3/+6
  llvm-svn: 42368
* Remove the assumption that FP's are either float or | Dale Johannesen | 2007-09-14 | 1 | -3/+2
  double from some of the many places in the optimizers it appears, and do something reasonable with x86 long double. Make APInt::dump() public, remove newline, use it to dump ConstantSDNode's. Allow APFloats in FoldingSet. Expand X86 backend handling of long doubles (conversions to/from int, mostly).
  llvm-svn: 41967
* Fold the adjust_trampoline intrinsic into | Duncan Sands | 2007-09-11 | 1 | -7/+7
  init_trampoline. There is now only one trampoline intrinsic.
  llvm-svn: 41841
* 1. Don't call Value::getName(), which is slow. | Chris Lattner | 2007-09-10 | 1 | -38/+52
  2. Lower calls to fabs and friends to FABS nodes etc unless the function has internal linkage. Before we wouldn't lower if it had a definition, which is incorrect. This allows us to compile:

     define double @fabs(double %f) {
       %tmp2 = tail call double @fabs( double %f )
       ret double %tmp2
     }

  into:

     _fabs:
       fabs f1, f1
       blr

  llvm-svn: 41805
* Add support for having different alignment for objects on call frames. | Rafael Espindola | 2007-09-07 | 1 | -2/+4
  The x86-64 ABI states that objects passed on the stack have 8 byte alignment. Implement that.
  llvm-svn: 41768
* Split eh.select / eh.typeid.for intrinsics into i32/i64 versions. | Anton Korobeynikov | 2007-09-07 | 1 | -10/+17
  This is needed, because they just "mark" register liveins and we let frontend solve type issue, not lowering code :)
  llvm-svn: 41763
* Next round of APFloat changes. | Dale Johannesen | 2007-09-06 | 1 | -2/+3
  Use APFloat in UpgradeParser and AsmParser. Change all references to ConstantFP to use the APFloat interface rather than double. Remove the ConstantFP double interfaces. Use APFloat functions for constant folding arithmetic and comparisons. (There are still way too many places APFloat is just a wrapper around host float/double, but we're getting there.)
  llvm-svn: 41747
* Fix PR1628. When exception handling is turned on, | Duncan Sands | 2007-09-05 | 1 | -2/+2
  labels are generated bracketing each call (not just invokes). This is used to generate entries in the exception table required by the C++ personality. However it gets in the way of tail-merging.

  This patch solves the problem by no longer placing labels around ordinary calls. Instead we generate entries in the exception table that cover every instruction in the function that wasn't covered by an invoke range (the range given by the labels around the invoke). As an optimization, such entries are only generated for parts of the function that contain a call, since for the moment those are the only instructions that can throw an exception [1]. As a happy consequence, we now get a smaller exception table, since the same region can cover many calls.

  While there, I also implemented folding of invoke ranges - successive ranges are merged when safe to do so. Finally, if a selector contains only a cleanup, there's a special shorthand for it - place a 0 in the call-site entry. I implemented this while there. As a result, the exception table output (excluding filters) is now optimal - it cannot be made smaller [2]. The problem with throw filters is that folding them optimally is hard, and the benefit of folding them is minimal.

  [1] I tested that having trapping instructions (eg divide by zero) in such a region doesn't cause trouble.
  [2] It could be made smaller with the help of higher layers, eg by having branch folding reorder basic blocks ending in invokes with the same landing pad so they follow each other. I don't know if this is worth doing.
  llvm-svn: 41718
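A minimal sketch of the range folding mentioned above. It assumes that "safe to merge" means two successive ranges touch and share the same landing pad and action; the real criterion may be broader. `CallSiteRange` and `foldRanges` are hypothetical names, not LLVM's.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one call-site table entry: code in [Begin, End)
// unwinds to LandingPad and runs Action.
struct CallSiteRange {
  unsigned Begin;
  unsigned End;
  int LandingPad;
  int Action;
};

// Merges successive ranges that touch and agree on landing pad and action.
// The input is expected to be sorted by Begin and non-overlapping.
inline std::vector<CallSiteRange>
foldRanges(const std::vector<CallSiteRange> &Sorted) {
  std::vector<CallSiteRange> Out;
  for (const CallSiteRange &R : Sorted) {
    assert(R.Begin <= R.End && "malformed range");
    if (!Out.empty() && Out.back().End == R.Begin &&
        Out.back().LandingPad == R.LandingPad &&
        Out.back().Action == R.Action)
      Out.back().End = R.End; // extend the previous entry
    else
      Out.push_back(R);       // start a new entry
  }
  return Out;
}
```

Each merge removes one table entry, which is where the smaller exception table in the message comes from.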
* Fix for PR1632. EHSELECTION always produces an i32 value. | Evan Cheng | 2007-09-04 | 1 | -1/+1
  llvm-svn: 41712
* Add an option, -view-sunit-dags, for viewing the actual SUnit DAGs used by | Dan Gohman | 2007-08-28 | 1 | -1/+7
  scheduling.
  llvm-svn: 41556
* If the source and destination pointers in an llvm.memmove are known | Dan Gohman | 2007-08-27 | 1 | -12/+25
  to not alias each other, it can be translated as an llvm.memcpy.
  llvm-svn: 41489
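A small sketch of the decision described above. In the compiler the "no alias" fact comes from alias analysis; the sketch stands in for that with a direct comparison of two address ranges of equal length. All names are hypothetical, and address overflow is ignored.

```cpp
#include <cstdint>

enum class CopyKind { MemCpy, MemMove };

// True when [A, A+Len) and [B, B+Len) share no byte.
constexpr bool rangesDisjoint(uint64_t A, uint64_t B, uint64_t Len) {
  return A + Len <= B || B + Len <= A;
}

// memcpy may assume its operands do not overlap, so it is only a valid
// replacement for memmove when the two ranges are known to be disjoint.
constexpr CopyKind lowerMemMove(uint64_t Dst, uint64_t Src, uint64_t Len) {
  return rangesDisjoint(Dst, Src, Len) ? CopyKind::MemCpy : CopyKind::MemMove;
}
```

When disjointness cannot be proven the conservative answer is to keep the memmove, which is correct in both cases.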
* There is an impedance matching problem between LLVM and | Duncan Sands | 2007-08-27 | 1 | -6/+12
  gcc exception handling: if an exception unwinds through an invoke, then execution must branch to the invoke's unwind target. We previously tried to enforce this by appending a cleanup action to every selector, however this does not always work correctly due to an optimization in the C++ unwinding runtime: if only cleanups would be run while unwinding an exception, then the program just terminates without actually executing the cleanups, as invoke semantics would require. I was hoping this wouldn't be a problem, but in fact it turns out to be the cause of all the remaining failures in the LLVM testsuite (these also fail with -enable-correct-eh-support, so turning on -enable-eh didn't make things worse!).

  Instead we need to append a full-blown catch-all to the end of each selector. The correct way of doing this depends on the personality function, i.e. it is language dependent, so can only be done by gcc. Thus this patch generalizes the eh.selector intrinsic so that it can handle all possible kinds of action table entries (before it didn't accommodate cleanups): now 0 indicates a cleanup, and filters have to be specified using the number of type infos plus one rather than the number of type infos. Related gcc patches will cause Ada to pass a cleanup (0) to force the selector to always fire, while C++ will use a C++ catch-all (null).
  llvm-svn: 41484
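The new operand encoding can be illustrated with a few lines. This is only a sketch of the rule stated in the message; the names are hypothetical, and the remark about why the offset exists is an inference, not something the message states.

```cpp
// Operand value that marks a cleanup in the generalized eh.selector.
constexpr int CleanupOperand = 0;

// A filter is announced by its number of type infos plus one.
constexpr int filterLengthOperand(int NumTypeInfos) {
  return NumTypeInfos + 1;
}

constexpr bool isCleanupOperand(int Operand) {
  return Operand == CleanupOperand;
}
```

With the plus-one offset, a filter with no type infos is encoded as 1, so it presumably cannot be mistaken for the cleanup marker 0.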
* rename isOperandValidForConstraint to LowerAsmOperandForConstraint, | Chris Lattner | 2007-08-25 | 1 | -6/+6
  changing the interface to allow for future changes.
  llvm-svn: 41384
* Perform correct codegen for eh_dwarf_cfa intrinsic. | Anton Korobeynikov | 2007-08-23 | 1 | -2/+10
  llvm-svn: 41316
* Partial implementation of calling functions with byval arguments: | Rafael Espindola | 2007-08-20 | 1 | -0/+10
  *) The needed information is propagated to the DAG
  *) The X86-64 backend detects it and aborts
  llvm-svn: 41179
* - If a dynamic_stackalloc alignment requirement is <= stack alignment, then the alignment argument is ignored. | Evan Cheng | 2007-08-16 | 1 | -12/+12
  - *Always* round up the size of the allocation to multiples of stack alignment to ensure the stack ptr is never left in an invalid state after a dynamic_stackalloc.
  llvm-svn: 41132
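The two rules above amount to a comparison and a round-up. A minimal sketch with hypothetical helper names; sizes and alignments are in bytes and the stack alignment is assumed to be nonzero.

```cpp
#include <cstdint>

// Round Size up to the next multiple of StackAlign so the stack pointer
// stays aligned after the allocation.
constexpr uint64_t roundUpToStackAlign(uint64_t Size, uint64_t StackAlign) {
  return (Size + StackAlign - 1) / StackAlign * StackAlign;
}

// A requested alignment only needs separate handling when it is stricter
// than what the stack already guarantees.
constexpr bool needsExplicitAlignment(uint64_t Requested, uint64_t StackAlign) {
  return Requested > StackAlign;
}
```

For a 16-byte stack alignment, a 13-byte request becomes 16 bytes and a 17-byte request becomes 32, while an 8-byte alignment request needs no extra work.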
* Fix EXTRACT_ELEMENT, EXTRACT_SUBVECTOR, and EXTRACT_VECTOR_ELT to | Dan Gohman | 2007-08-10 | 1 | -3/+5
  use an intptr ValueType instead of i32 for the index operand in getCopyToParts.
  llvm-svn: 40987
* propagate struct size and alignment of byval arguments to the DAG | Rafael Espindola | 2007-08-10 | 1 | -1/+8
  llvm-svn: 40986
* This is the patch to provide clean intrinsic function overloading support in LLVM. | Chandler Carruth | 2007-08-04 | 1 | -12/+0
  It cleans up the intrinsic definitions and generally smooths the process for more complicated intrinsic writing. It will be used by the upcoming atomic intrinsics as well as vector and float intrinsics in the future.

  This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select, and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM reader and the bitcode reader. The test cases have been updated, with special tests added to ensure the automatic upgrading is supported.
  llvm-svn: 40807