Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Clarify getRelocationAddress x getRelocationOffset a bit. | Rafael Espindola | 2013-04-25 | 2 | -0/+18 |
| | | | | | | | | | | getRelocationAddress is for dynamic libraries and executables, getRelocationOffset for relocatable objects. Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a test of ELF's. llvm-readobj -r now prints the same values as readelf -r. llvm-svn: 180259 | ||||
* | Fix constant folding for one lane vector types. Constant folding one lane ↵ | Silviu Baranga | 2013-04-25 | 1 | -0/+18 |
| | | | | | | vector types now returns a vector instead of a scalar. llvm-svn: 180254 | ||||
* | Test case for r180241. | Akira Hatanaka | 2013-04-25 | 1 | -0/+22 |
| | | | | llvm-svn: 180246 | ||||
* | Test case for r180238. | Akira Hatanaka | 2013-04-25 | 1 | -0/+22 |
| | | | | llvm-svn: 180245 | ||||
* | R600: Use SHT_PROGBITS for the .AMDGPU.config section | Tom Stellard | 2013-04-24 | 1 | -0/+1 |
| | | | | | | | | The libelf implementation that is distributed here: http://www.mr511.de/software/english.html will not parse sections that are marked SHT_NULL. llvm-svn: 180230 | ||||
* | Mips assembler: Add 64 bit testing for JAL | Jack Carter | 2013-04-24 | 1 | -39/+82 |
| | | | | | Contributor: Vladimir Medic llvm-svn: 180220 | ||||
* | Use pointers to iterate over symbols. | Rafael Espindola | 2013-04-24 | 2 | -52/+52 |
| | | | | | | | | While here, don't report a dummy symbol for relocations that don't have symbols. We used to say such relocations were for the first defined symbol, but now we return end_symbols(). The llvm-readobj output change agrees with otool. llvm-svn: 180214 | ||||
* | LoopVectorize: Scalarize padded types | Arnold Schwaighofer | 2013-04-24 | 1 | -0/+29 |
| | | | | | | | | | | | | | | | | | | This patch disables memory-instruction vectorization for types that need padding bytes, e.g., x86_fp80 has a 10-byte store size with 6 bytes of padding on Darwin on x86_64. Because load/store vectorization is performed by bitcasting to a packed vector, which has an incompatible memory layout due to the lack of padding bytes, the present vectorizer produces inconsistent results for memory instructions of those types. This patch checks that the AllocSize of a scalar type equals the allocated size of each vector element, to ensure that there are no padding bytes and the array can be read/written using vector operations. Patch by Daisuke Takahashi! Fixes PR15758. llvm-svn: 180196 | ||||
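The padding check described above can be sketched as follows. This is a minimal illustration, not the vectorizer's code: the helper names are hypothetical, and the 16-byte ABI alignment used for x86_fp80 is an assumption chosen to match the "10 bytes store size with 6 bytes padding" figure in the message.

```python
def store_size_bytes(size_in_bits):
    """Bytes needed to hold the value itself (bits rounded up to whole bytes)."""
    return (size_in_bits + 7) // 8


def alloc_size_bytes(size_in_bits, abi_align_bytes):
    """Stride between array elements: store size rounded up to the ABI alignment."""
    store = store_size_bytes(size_in_bits)
    return -(-store // abi_align_bytes) * abi_align_bytes


def has_padding(size_in_bits, abi_align_bytes):
    """True if an array of this type is laid out differently from a packed vector.

    A packed vector gives each element exactly size_in_bits, so any difference
    from the scalar allocation size means the two layouts disagree.
    """
    return alloc_size_bytes(size_in_bits, abi_align_bytes) * 8 != size_in_bits


# x86_fp80 with an assumed 16-byte ABI alignment: 10 stored bytes, 6 padding bytes.
print(has_padding(80, 16))  # padded type: memory accesses stay scalar
print(has_padding(32, 4))   # i32: array and packed vector layouts agree
```

A type for which `has_padding` is true would be scalarized rather than bitcast to a vector.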
* | LoopVectorizer: Bail out if we don't have datalayout; we need it | Arnold Schwaighofer | 2013-04-24 | 3 | -1/+8 |
| | | | | llvm-svn: 180195 | ||||
* | MI Sched: eliminate local vreg copies. | Andrew Trick | 2013-04-24 | 1 | -0/+30 |
| | | | | | | | | | | | | | | | | For now, we just reschedule instructions that use the copied vregs and let regalloc eliminate it. I would really like to eliminate the copies on-the-fly during scheduling, but we need a complete implementation of repairIntervalsInRange() first. The general strategy is for the register coalescer to eliminate as many global copies as possible and shrink live ranges to be extended-basic-block local. The coalescer should not have to worry about resolving local copies (e.g. it shouldn't attempt to reorder instructions). The scheduler is a much better place to deal with local interference. The coalescer side of this equation needs work. llvm-svn: 180193 | ||||
* | Cleanup testcase and ensure we actually exercise the inliner. | Adrian Prantl | 2013-04-24 | 1 | -138/+144 |
| | | | | | | rdar://problem/12415623 llvm-svn: 180168 | ||||
* | Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions. | Jyotsna Verma | 2013-04-23 | 1 | -0/+45 |
| | | | | llvm-svn: 180145 | ||||
* | Make sure the instruction right after an inlined function has a | Adrian Prantl | 2013-04-23 | 1 | -0/+165 |
| | | | | | | | | | | debug location. This solves a problem where the range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140 | ||||
* | Add more tests for r179925 to verify correct handling of signext/zeroext; ↵ | Stephen Lin | 2013-04-23 | 1 | -0/+64 |
| | | | | | | strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change) llvm-svn: 180138 | ||||
* | Fix typo. | Rafael Espindola | 2013-04-23 | 2 | -4/+4 |
| | | | | llvm-svn: 180137 | ||||
* | Hexagon: Remove assembler mapped instruction definitions. | Jyotsna Verma | 2013-04-23 | 1 | -0/+87 |
| | | | | llvm-svn: 180133 | ||||
* | R600: Use .AMDGPU.config section to emit stacksize | Vincent Lejeune | 2013-04-23 | 1 | -0/+15 |
| | | | | llvm-svn: 180124 | ||||
* | R600: Add CF_END | Vincent Lejeune | 2013-04-23 | 3 | -3/+3 |
| | | | | llvm-svn: 180123 | ||||
* | LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure ↵ | Nadav Rotem | 2013-04-23 | 1 | -0/+36 |
| | | | | | | | | that the order in which the elements are scalarized is the same as the original order. This fixes a miscompilation in FreeBSD's regex library. llvm-svn: 180121 | ||||
* | Hexagon: Remove duplicate instructions to handle global/immediate values | Jyotsna Verma | 2013-04-23 | 1 | -0/+18 |
| | | | | | | for absolute/absolute-set addressing modes. llvm-svn: 180120 | ||||
* | Call the potentially costly isAnnotatedParallel() only once. | Pekka Jaaskelainen | 2013-04-23 | 1 | -1/+2 |
| | | | | | | Made the uniform write test's checks a bit stricter. llvm-svn: 180119 | ||||
* | Write relocations in yaml2obj. | Rafael Espindola | 2013-04-23 | 1 | -1/+21 |
| | | | | llvm-svn: 180115 | ||||
* | Move test from grep to FileCheck. | Rafael Espindola | 2013-04-23 | 1 | -2/+5 |
| | | | | llvm-svn: 180092 | ||||
* | Use zlib to uncompress debug sections in DWARF parser. | Alexey Samsonov | 2013-04-23 | 6 | -0/+41 |
| | | | | | | | This makes llvm-dwarfdump and llvm-symbolizer understand debug info sections compressed by ld.gold linker. llvm-svn: 180088 | ||||
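As a rough illustration of what uncompressing such a section involves, the sketch below assumes the GNU-style layout for compressed debug sections: the 4-byte magic `ZLIB`, an 8-byte big-endian uncompressed size, then a zlib stream. That layout and the function name `decompress_debug_section` are assumptions for the example; the commit message does not spell out the format.

```python
import struct
import zlib


def decompress_debug_section(data):
    """Return the uncompressed bytes of a ZLIB-prefixed debug section.

    Assumed layout: b"ZLIB" + 8-byte big-endian uncompressed size + zlib stream.
    """
    if len(data) < 12 or data[:4] != b"ZLIB":
        raise ValueError("section does not start with a ZLIB header")
    (expected_size,) = struct.unpack(">Q", data[4:12])
    payload = zlib.decompress(data[12:])
    if len(payload) != expected_size:
        raise ValueError("uncompressed size does not match the header")
    return payload


# Round trip on synthetic data (not real DWARF).
raw = b"synthetic debug bytes " * 8
section = b"ZLIB" + struct.pack(">Q", len(raw)) + zlib.compress(raw)
assert decompress_debug_section(section) == raw
```

Checking the decoded length against the header catches truncated or corrupted sections before the bytes reach a parser.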
* | Refuse to (even try to) vectorize loops which have uniform writes, | Pekka Jaaskelainen | 2013-04-23 | 1 | -0/+58 |
| | | | | | | | | | even if erroneously annotated with the parallel loop metadata. Fixes Bug 15794: "Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata" llvm-svn: 180081 | ||||
* | Add test case for PR15779, which has previously been fixed. | Chad Rosier | 2013-04-22 | 1 | -1/+2 |
| | | | | llvm-svn: 180058 | ||||
* | Changed back (relative to commit 179786) the operations executed when ↵ | Anat Shemer | 2013-04-22 | 1 | -0/+18 |
| | | | | | | extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users. llvm-svn: 180045 | ||||
* | [mips] In performDSPShiftCombine, check that all elements in the vector are | Akira Hatanaka | 2013-04-22 | 1 | -0/+56 |
| | | | | | | | shifted by the same amount and the shift amount is smaller than the element size. llvm-svn: 180039 | ||||
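The condition named in this message can be sketched in a few lines. The function `is_uniform_in_range_shift` is a hypothetical stand-in, not the code in performDSPShiftCombine; it only models the two checks the message lists.

```python
def is_uniform_in_range_shift(shift_amounts, element_bits):
    """True if every lane is shifted by the same amount and that amount
    is smaller than the element size in bits."""
    if not shift_amounts:
        return False
    first = shift_amounts[0]
    all_same = all(amount == first for amount in shift_amounts)
    return all_same and 0 <= first < element_bits


print(is_uniform_in_range_shift([3, 3, 3, 3], 8))  # uniform and in range
print(is_uniform_in_range_shift([3, 3, 2, 3], 8))  # lanes differ
print(is_uniform_in_range_shift([8, 8, 8, 8], 8))  # amount equals element size
```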
* | COFF: Fix weak external aliases. | Peter Collingbourne | 2013-04-22 | 1 | -0/+18 |
| | | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D700 llvm-svn: 180034 | ||||
* | Extra paranoid test for r179925 (verify that tail calls are not generated to ↵ | Stephen Lin | 2013-04-22 | 1 | -0/+14 |
| | | | | | | 'this'-returning constructors of objects with different 'this' pointers than the caller) llvm-svn: 180032 | ||||
* | Also verify llvm.compiler_used. | Rafael Espindola | 2013-04-22 | 1 | -0/+6 |
| | | | | llvm-svn: 180020 | ||||
* | Clarify that llvm.used can contain aliases. | Rafael Espindola | 2013-04-22 | 6 | -0/+30 |
| | | | | | | | Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019 | ||||
* | Fix for 5.5 Parameter Passing --> Stage C: | Stepan Dyatkovskiy | 2013-04-22 | 3 | -0/+184 |
| | | | | | | | | | | | | | | | -- C.4 and C.5 statements, when NSAA is not equal to SP. -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a variadic procedure. Before this patch, "NSAA != 0" meant "don't use GPRs anymore". But there are some exceptions in AAPCS. 1. For non-VA functions: allocate all VFP regs for CPRCs. When all VFPs are allocated, CPRCs are sent to the stack, while non-CPRCs may still be allocated in GPRs. 2. Check that for VA functions all params use GPRs and then the stack. No exceptions, no CPRCs here. llvm-svn: 180011 | ||||
* | Add .ll as a valid test suffix for Object, this allows .ll -> object | Eric Christopher | 2013-04-22 | 1 | -1/+1 |
| | | | | | | and then dumping as tests. llvm-svn: 180010 | ||||
* | Cleanup: test source files do not need to be executable | Arnaud A. de Grandmaison | 2013-04-22 | 12 | -0/+0 |
| | | | | llvm-svn: 180003 | ||||
* | Revert "Revert "PR14606: debug info imported_module support"" | David Blaikie | 2013-04-22 | 85 | -98/+126 |
| | | | | | | | | | | This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even though the debug info was clearly invalid on all of them, but this ought to fix it. llvm-svn: 179996 | ||||
* | Legalize vector truncates by parts rather than just splitting. | Jim Grosbach | 2013-04-21 | 2 | -34/+16 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result time, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequences of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. 
Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we 
generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr llvm-svn: 179989 | ||||
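The split, narrow, concatenate, truncate sequence from this message can be modelled on plain integer lists to show that it yields the same lanes as a direct truncate. This is a sketch of the arithmetic only; `trunc` and `trunc_by_parts` are hypothetical helper names and nothing here reflects the SelectionDAG API.

```python
def trunc(vec, bits):
    """Truncate every lane to its low `bits` bits."""
    mask = (1 << bits) - 1
    return [lane & mask for lane in vec]


def trunc_by_parts(vec, src_bits, dst_bits):
    """Model e.g. v8i32 -> v8i8 as two v4i32 -> v4i16 truncates,
    a concat to v8i16, and a final v8i16 -> v8i8 truncate."""
    mid_bits = src_bits // 2
    if len(vec) % 2 != 0 or dst_bits > mid_bits:
        raise ValueError("need an even lane count and dst_bits <= src_bits / 2")
    half = len(vec) // 2
    lo = trunc(vec[:half], mid_bits)   # %lo16 = trunc %inlo
    hi = trunc(vec[half:], mid_bits)   # %hi16 = trunc %inhi
    widened = lo + hi                  # %in16 = concat_vectors %lo16, %hi16
    return trunc(widened, dst_bits)    # %res  = trunc %in16


lanes = [0x12345678, 0xFFFFFFFF, 0, 1, 0x80, 0x1FF, 0xABCD, 0xDEADBEEF]
assert trunc_by_parts(lanes, 32, 8) == trunc(lanes, 8)
```

The intermediate element width keeps every step on a vector type of half or full register width, which is what lets each step map to a single narrowing instruction in the ARM example.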
* | ARM: Split out cost model vcvt testcases. | Jim Grosbach | 2013-04-21 | 2 | -172/+171 |
| | | | | | | They had a separate RUN line already, so may as well be in a separate file. llvm-svn: 179988 | ||||
* | Passing arguments to varargs functions under the SPARC v9 ABI. | Jakob Stoklund Olesen | 2013-04-21 | 1 | -0/+13 |
| | | | | | | | Arguments after the fixed arguments never use the floating point registers. llvm-svn: 179987 | ||||
* | Fix the SETHIimm pattern for 64-bit code. | Jakob Stoklund Olesen | 2013-04-21 | 1 | -0/+6 |
| | | | | | | Don't ignore the high 32 bits of the immediate. llvm-svn: 179985 | ||||
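A sketch of the kind of check this implies, assuming the usual SETHI semantics: the instruction places a 22-bit immediate in bits 31..10 and clears the remaining bits of the register. The function name `is_sethi_imm` is hypothetical and this is not the TableGen pattern itself.

```python
def is_sethi_imm(value):
    """True if a 64-bit constant can be materialized by one SETHI:
    low 10 bits clear and nothing set above bit 31."""
    value &= (1 << 64) - 1  # view the constant as an unsigned 64-bit pattern
    low_bits_clear = (value & 0x3FF) == 0
    high_bits_clear = (value >> 32) == 0
    return low_bits_clear and high_bits_clear


print(is_sethi_imm(0x12345400))      # fits in bits 31..10
print(is_sethi_imm(0x100000400))     # bit 32 set: a 32-bit-only check would miss this
print(is_sethi_imm(-1024))           # sign-extended negative has high bits set
```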
* | SROA: Don't crash on a select with two identical operands. | Benjamin Kramer | 2013-04-21 | 1 | -0/+11 |
| | | | | | | | This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982 | ||||
* | Revert "SimplifyCFG: If convert single conditional stores" | Arnold Schwaighofer | 2013-04-21 | 1 | -83/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is the temptation to make this transform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up." llvm-svn: 179980 | ||||
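The two store sequences quoted in the original commit message can be compared directly. The sketch below models them on a Python list; both function names are hypothetical, and it only demonstrates that the final memory contents match, not the performance trade-off that motivated the revert.

```python
def conditional_store(a, i, x, y, cond):
    """Original form: unconditional store, then a store guarded by a branch."""
    a = list(a)
    a[i] = x
    if cond:
        a[i] = y
    return a


def if_converted_store(a, i, x, y, cond):
    """If-converted form: pick the value with a select, then store unconditionally."""
    a = list(a)
    a[i] = x
    tmp = y if cond else x
    a[i] = tmp
    return a


for cond in (True, False):
    assert conditional_store([0, 0, 0], 1, 7, 9, cond) == if_converted_store([0, 0, 0], 1, 7, 9, cond)
```

The if-converted form always executes the second store, which is the extra cost the message weighs against a possible branch misprediction.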
* | ARM: fix part of test which actually needed an asserts build | Tim Northover | 2013-04-21 | 2 | -6/+30 |
| | | | | | | This should fix a buildbot failure that occurred after r179977. llvm-svn: 179978 | ||||
* | ARM: Use ldrd/strd to spill 64-bit pairs when available. | Tim Northover | 2013-04-21 | 1 | -13/+27 |
| | | | | | | | This allows common sp-offsets to be part of the instruction and is probably faster on modern CPUs too. llvm-svn: 179977 | ||||
* | SLPVectorize: Add support for vectorization of casts. | Nadav Rotem | 2013-04-21 | 1 | -0/+38 |
| | | | | llvm-svn: 179975 | ||||
* | [objc-arc] Cleaned up tail-call-invariant-enforcement.ll. | Michael Gottesman | 2013-04-21 | 1 | -25/+40 |
| | | | | | | | | | | | | Specifically: 1. Added checks that unwind is being properly added to various instructions. 2. Fixed the declaration/calling of objc_release to have a return type of void. 3. Moved all checks to precede the functions and added checks to ensure that the checks would only match inside the specific function that we are attempting to check. llvm-svn: 179973 | ||||
* | [objc-arc] Check that objc-arc-expand properly handles all strictly ↵ | Michael Gottesman | 2013-04-21 | 1 | -5/+71 |
| | | | | | | forwarding calls and does not touch calls which are not strictly forwarding (i.e. objc_retainBlock). llvm-svn: 179972 | ||||
* | [objc-arc] Renamed the test file ↵ | Michael Gottesman | 2013-04-21 | 1 | -0/+0 |
| | | | | | | clang-arc-used-intrinsic-removed-if-isolated.ll -> intrinsic-use-isolated.ll to match the other test file intrinsic-use.ll. llvm-svn: 179971 | ||||
* | Remove tbaa metadata. | Bill Wendling | 2013-04-21 | 1 | -7/+3 |
| | | | | llvm-svn: 179970 | ||||
* | Compile varargs functions for SPARCv9. | Jakob Stoklund Olesen | 2013-04-20 | 1 | -0/+62 |
| | | | | | | | | | | | | With a little help from the frontend, it looks like the standard va_* intrinsics can do the job. Also clean up an old bitcast hack in LowerVAARG that dealt with unaligned double loads. Load SDNodes can specify an alignment now. Still missing: Calling varargs functions with float arguments. llvm-svn: 179961 |