summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert r180737. The companion patch was reverted, and this is not relevant ↵Bill Wendling2013-05-011-15/+0
| | | | | | right now. llvm-svn: 180889
* This patch breaks up Wrap.h so that it does not have to include all of Filip Pizlo2013-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881
* [inline asm] Return an undef SDValue of the expected value type, rather thanChad Rosier2013-05-011-1/+1
| | | | | | | | | report a fatal error. This allows us to continue processing the translation unit. Test case to come on the clang side because we need an inline asm diagnostics handler in place. rdar://13446483 llvm-svn: 180873
* Optimize away nop CONCAT_VECTOR nodes.Nadav Rotem2013-05-011-0/+39
| | | | | | | | | Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector. rdar://13402653 PR15866 llvm-svn: 180871
* Only pass 'returned' to target-specific lowering code when the value of ↵Stephen Lin2013-04-302-28/+42
| | | | | | entire register is guaranteed to be preserved. llvm-svn: 180825
* Temporarily revert "Change the informal convention of DBG_VALUE so that we ↵Adrian Prantl2013-04-307-58/+36
| | | | | | | | | | can express a" because it breaks some buildbots. This reverts commit 180816. llvm-svn: 180819
* Change the informal convention of DBG_VALUE so that we can express aAdrian Prantl2013-04-307-36/+58
| | | | | | | | | | | | register-indirect address with an offset of 0. It used to be that a DBG_VALUE is a register-indirect value if the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE is register-indirect if the first operand is a register and the second operand is an immediate. For plain registers use the combination reg, reg. rdar://problem/13658587 llvm-svn: 180816
* MI Sched: revert a minor heuristic that snuck in with -misched-vcopy.Andrew Trick2013-04-301-0/+6
| | | | | | I'll fix the heuristic in a general way in a follow-up commit. llvm-svn: 180815
* LocalStackSlotAllocation improvementsHal Finkel2013-04-301-94/+111
| | | | | | | | | | | | First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created. Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway. Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll Jim has okayed this off-list. llvm-svn: 180799
* Emit the TLS initialization function pointers into the correct section.Bill Wendling2013-04-291-0/+15
| | | | | | | | | | The `llvm.tls_init_funcs' (created by the front-end) holds pointers to the TLS initialization functions. These need to be placed into the correct section so that they are run before `main()'. <rdar://problem/13733006> llvm-svn: 180737
* Generalize the MachineTraceMetrics public API.Andrew Trick2013-04-271-1/+15
| | | | | | | Naturally, we should be able to pass in extra instructions, not just extra blocks. llvm-svn: 180667
* Use the target triple from the target machine rather than the moduleEric Christopher2013-04-272-1/+5
| | | | | | | | | | | | | | | | to determine whether or not we're on a darwin platform for debug code emitting. Solves the problem of a module with no triple on the command line and no triple in the module using non-gdb ok features on darwin. Fix up the member-pointers test to check the correct things for cross platform (DW_FORM_flag is a good prefix). Unfortunately no testcase because I have no ideas how to test something without a triple and without a triple in the module yet check precisely on two platforms. Ideas welcome. llvm-svn: 180660
* Cleanup and document MachineLocation.Adrian Prantl2013-04-262-3/+9
| | | | | | | | | | | | Clarify documentation and API to make the difference between register and register-indirect addressed locations more explicit. Put in a comment to point out that with the current implementation we cannot specify a register-indirect location with offset 0 (a breg 0 in DWARF). No functionality change intended. rdar://problem/13658587 llvm-svn: 180641
* Micro-optimizationBill Wendling2013-04-261-5/+4
| | | | | | | TLVs probably won't be as common as the other types of variables. Check for them last before defaulting to "DATA". llvm-svn: 180631
* Re-write the address propagation code for pre-indexed loads/stores to take ↵Silviu Baranga2013-04-261-14/+29
| | | | | | into account some previously misssed cases (PRE_DEC addressing mode, the offset and base address are swapped, etc). This should fix PR15581. llvm-svn: 180609
* DAGCombiner: Canonicalize vector integer abs in the same way we do it for ↵Benjamin Kramer2013-04-261-0/+42
| | | | | | | | | | scalars. This already helps SSE2 x86 a lot because it lacks an efficient way to represent a vector select. The long term goal is to enable the backend to match a canonicalized pattern into a single instruction (e.g. vabs or pabs). llvm-svn: 180597
* [mc-coff] Forward Linker Option flags into the .drectve sectionReid Kleckner2013-04-251-0/+46
| | | | | | | | | | | | | | Summary: This is modelled on the Mach-O linker options implementation and should support a Clang implementation of #pragma comment(lib/linker). Reviewers: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D724 llvm-svn: 180569
* Fix constant folding for one lane vector types. Constant folding one lane ↵Silviu Baranga2013-04-251-1/+1
| | | | | | vector types not returns a vector instead of a scalar. llvm-svn: 180254
* Fix for r180193 - MI Sched: eliminate local vreg.Andrew Trick2013-04-241-2/+6
| | | | | | | | | Fixes PR15838. Need to check for blocks with nothing but dbg.value. I'm not sure how to force this situation with a unit test. I tried to reduce the test case in PR15838 (1k lines of metadata) but gave up. llvm-svn: 180227
* [inline asm] Fix a crasher for an invalid value type/register class.Chad Rosier2013-04-241-4/+11
| | | | | | rdar://13731657 llvm-svn: 180226
* MI Sched: eliminate local vreg copies.Andrew Trick2013-04-241-7/+194
| | | | | | | | | | | | | | | | For now, we just reschedule instructions that use the copied vregs and let regalloc elliminate it. I would really like to eliminate the copies on-the-fly during scheduling, but we need a complete implementation of repairIntervalsInRange() first. The general strategy is for the register coalescer to eliminate as many global copies as possible and shrink live ranges to be extended-basic-block local. The coalescer should not have to worry about resolving local copies (e.g. it shouldn't attemp to reorder instructions). The scheduler is a much better place to deal with local interference. The coalescer side of this equation needs work. llvm-svn: 180193
* Register Coalescing: add a flag to disable rescheduling.Andrew Trick2013-04-241-2/+8
| | | | | | | | When MachineScheduler is enabled, this functionality can be removed. Until then, provide a way to disable it for test cases and designing MachineScheduler heuristics. llvm-svn: 180192
* MI Sched: regpressure tracing.Andrew Trick2013-04-241-0/+8
| | | | llvm-svn: 180191
* Formatting.Eric Christopher2013-04-241-1/+1
| | | | llvm-svn: 180186
* DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider ↵Owen Anderson2013-04-231-1/+3
| | | | | | | | | VSETCC without first checking the target's vector boolean contents. This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents. The included change fixes the PowerPC tests, and was OK'd by Hal. llvm-svn: 180129
* Add some constraints to use of 'returned':Stephen Lin2013-04-231-0/+4
| | | | | | | | | 1) Disallow 'returned' on parameter that is also 'sret' (no sensible semantics, as far as I can tell). 2) Conservatively disallow tail calls through 'returned' parameters that also are 'zext' or 'sext' (for consistency with treatment of other zero-extending and sign-extending operations in tail call position detection...can be revised later to handle situations that can be determined to be safe). This is a new attribute that is not yet used, so there is no impact. llvm-svn: 180118
* Remove unused DwarfSectionOffsetDirective stringMatt Arsenault2013-04-221-1/+1
| | | | | | | The value isn't actually used, and setting it emits a COFF specific directive. llvm-svn: 180064
* Move C++ code out of the C headers and into either C++ headersEric Christopher2013-04-221-0/+1
| | | | | | | or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063
* Optimize MachineBasicBlock::getSymbol by caching the symbol. Since the symbolEli Bendersky2013-04-221-7/+11
| | | | | | | name computation is expensive, this helps save about 25% of the time spent in this function. llvm-svn: 180049
* Clarify that llvm.used can contain aliases.Rafael Espindola2013-04-222-7/+3
| | | | | | | Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019
* Tidy.Eric Christopher2013-04-221-4/+6
| | | | llvm-svn: 180000
* Update comment. Whitespace.Eric Christopher2013-04-221-2/+2
| | | | llvm-svn: 179999
* Revert "Revert "PR14606: debug info imported_module support""David Blaikie2013-04-222-0/+25
| | | | | | | | | | This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even though the debug info was clearly invalid on all of them, but this ought to fix it. llvm-svn: 179996
* Legalize vector truncates by parts rather than just splitting.Jim Grosbach2013-04-212-1/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result time, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequences of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr llvm-svn: 179989
* Tidy up comment grammar.Jim Grosbach2013-04-211-2/+2
| | | | llvm-svn: 179986
* Remove unused ShouldFoldAtomicFences flag.Tim Northover2013-04-201-1/+0
| | | | | | | | I think it's almost impossible to fold atomic fences profitably under LLVM/C++11 semantics. As a result, this is now unused and just cluttering up the target interface. llvm-svn: 179940
* Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.Tim Northover2013-04-205-71/+1
| | | | llvm-svn: 179939
* Add CodeGen support for functions that always return arguments via a new ↵Stephen Lin2013-04-202-7/+37
| | | | | | parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). llvm-svn: 179925
* Allow tail call opportunity detection through nested and/or multiple ↵Stephen Lin2013-04-201-73/+126
| | | | | | iterations of extractelement/insertelement indirection llvm-svn: 179924
* Simplify the code in FastISel::tryToFoldLoad, add an assertion and fix a ↵Eli Bendersky2013-04-191-17/+10
| | | | | | comment. llvm-svn: 179908
* Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'mEli Bendersky2013-04-192-79/+66
| | | | | | | trying to move as much FastISel logic as possible out of the main path in SelectionDAGISel - intermixing them just adds confusion. llvm-svn: 179902
* ArrayRefize getMachineNode(). No functionality change.Michael Liao2013-04-192-22/+24
| | | | llvm-svn: 179901
* Add an MRI::verifyUseLists() function.Jakob Stoklund Olesen2013-04-192-3/+54
| | | | | | | This checks the sanity of the register use lists in the MI intermediate representation. llvm-svn: 179895
* Use dbgs() consistently for -debug printoutsEli Bendersky2013-04-191-13/+13
| | | | llvm-svn: 179894
* Revert "PR14606: debug info imported_module support"Eric Christopher2013-04-192-25/+0
| | | | | | This reverts commit r179836 as it seems to have caused test failures. llvm-svn: 179840
* PR14606: debug info imported_module supportDavid Blaikie2013-04-192-0/+25
| | | | | | | | | | Adding another CU-wide list, in this case of imported_modules (since they should be relatively rare, it seemed better to add a list where each element had a "context" value, rather than add a (usually empty) list to every scope). This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll need to expand this to cover DW_TAG_imported_declaration too. llvm-svn: 179836
* Add some more stats for fast isel vs. SelectionDAG, w.r.t lowering functionEli Bendersky2013-04-191-1/+10
| | | | | | arguments in entry BBs. llvm-svn: 179824
* Add support for subsections to the ELF assembler. Fixes PR8717.Peter Collingbourne2013-04-171-1/+1
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D598 llvm-svn: 179725
* Replace uses of the deprecated std::auto_ptr with OwningPtr.Andy Gibbs2013-04-151-23/+22
| | | | | | This is a rework of the broken parts in r179373 which were subsequently reverted in r179374 due to incompatibility with C++98 compilers. This version should be ok under C++98. llvm-svn: 179520
* Document the decision to assume that the cost of floats is twice as much as ↵Nadav Rotem2013-04-141-1/+3
| | | | | | integers. llvm-svn: 179478
OpenPOWER on IntegriCloud