path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Silence some -Asserts uninitialized variable warnings. | Daniel Dunbar | 2010-07-31 | 1 | -2/+2
  llvm-svn: 109956
* MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend. | Michael J. Spencer | 2010-07-31 | 1 | -1/+0
  llvm-svn: 109949
* Move newlines before inline jumptables from the asm strings in .td files to | Bob Wilson | 2010-07-31 | 4 | -9/+9
  the jtblock_operand print methods. This avoids extra newlines in the
  disassembler's output. PR7757.
  llvm-svn: 109948
* Add relax all support to the COFF object streamer. | Michael J. Spencer | 2010-07-31 | 1 | -1/+1
  llvm-svn: 109947
* Add support for disassembling VMVN (immediate) instructions. PR7747. | Bob Wilson | 2010-07-31 | 1 | -0/+4
  llvm-svn: 109946
* Add -disable-shifter-op to disable isel of shifter ops. On Cortex-a9 the | Evan Cheng | 2010-07-30 | 1 | -0/+11
  shifts cost extra instructions so it might be better to emit them
  separately to take advantage of dual-issues.
  llvm-svn: 109934
* Add a check in the ARM disassembler for NEON instructions that would | Bob Wilson | 2010-07-30 | 1 | -5/+9
  reference registers past the end of the NEON register file, and report
  them as invalid instead of asserting when trying to print them. PR7746.
  llvm-svn: 109933
* PPC doesn't support VLAs with large alignment. This was | Dale Johannesen | 2010-07-30 | 1 | -2/+2
  formerly rejected by the FE, so asserted in the BE; now the FE only
  warns, so we treat it as a legitimate fatal error in the PPC BE. This
  means the test for the feature won't pass, so it's xfail'd.
  llvm-svn: 109892
* Add the __TEXT,__StaticInit section to the list of sections emitted at the | Bob Wilson | 2010-07-30 | 1 | -0/+6
  beginning of ARM Darwin assembly files so that it won't be placed after
  debug sections. Radar 8252813.
  llvm-svn: 109879
* Support all 128-bit AVX vector intrinsics. Most of them I already | Bruno Cardoso Lopes | 2010-07-30 | 3 | -232/+240
  declared during the addition of the assembler support; the additional
  changes are:
  - Add missing intrinsics.
  - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
  - Duplicate some patterns to AVX mode.
  - Step into PCMPEST/PCMPIST custom inserter and add AVX versions.
  llvm-svn: 109878
* Fix typo! | Bruno Cardoso Lopes | 2010-07-30 | 1 | -8/+8
  llvm-svn: 109877
* Many Thumb2 instructions can reference the full ARM register set (i.e., | Jim Grosbach | 2010-07-30 | 6 | -331/+465
  have 4 bits per register in the operand encoding), but have undefined
  behavior when the operand value is 13 or 15 (SP and PC, respectively).
  The trivial coalescer in linear scan sometimes will merge a copy from
  SP into a subsequent instruction which uses the copy, and if that
  instruction cannot legally reference SP, we get bad code such as:
    mls r0,r9,r0,sp
  instead of:
    mov r2, sp
    mls r0, r9, r0, r2

  This patch adds a new register class for use by Thumb2 that excludes
  the problematic registers (SP and PC) and is used instead of GPR for
  those operands which cannot legally reference PC or SP. The trivial
  coalescer explicitly requires that the register class of the
  destination for the COPY instruction contain the source register for
  the COPY to be considered for coalescing. This prevents errant
  instructions like that above.

  PR7499
  llvm-svn: 109842
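  The coalescing rule described in r109842 can be illustrated with a small
  model. This is only a sketch, not LLVM's code: RegClass, NoSpPcGPR and
  canCoalesceCopy are hypothetical names, and a register class is modelled
  as a 16-bit membership mask over r0..r15.

```cpp
#include <cstdint>

// Hypothetical model: bit i of `members` is set when register i is in the class.
struct RegClass {
  std::uint16_t members;
  bool contains(unsigned reg) const {
    return reg < 16 && ((members >> reg) & 1u) != 0;
  }
};

constexpr unsigned SP = 13; // stack pointer register number
constexpr unsigned PC = 15; // program counter register number

// Full general-purpose class: r0..r15.
constexpr RegClass GPR{0xFFFF};

// Assumed name for the restricted class: everything in GPR except SP and PC.
constexpr RegClass NoSpPcGPR{
    static_cast<std::uint16_t>(0xFFFFu & ~((1u << SP) | (1u << PC)))};

// A COPY from srcReg is a coalescing candidate only when the class of the
// COPY's destination contains srcReg.
inline bool canCoalesceCopy(const RegClass &destClass, unsigned srcReg) {
  return destClass.contains(srcReg);
}
```

  With this model, a copy from SP into an operand of class GPR may be
  coalesced, while the same copy into an operand of the restricted class is
  rejected, so the mov into a scratch register is kept.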
* Add builtins for ssat/usat, similar to RealView's __ssat and __usat intrinsics. | Nate Begeman | 2010-07-29 | 2 | -0/+6
  llvm-svn: 109813
* Refactor ARM-specific DAG combining in preparation for adding some more | Bob Wilson | 2010-07-29 | 1 | -12/+25
  transformations.
  llvm-svn: 109800
* Implement vector constants which are splat of | Dale Johannesen | 2010-07-29 | 1 | -8/+62
  integers with mov + vdup. 8003375. This is currently disabled by
  default because LICM will not hoist a VDUP, so it pessimizes the code
  if the construct occurs inside a loop (8248029).
  llvm-svn: 109799
* Don't assert on an unrecognized BrMiscFrm instruction. | Bob Wilson | 2010-07-29 | 1 | -1/+0
  PR7745.
  llvm-svn: 109788
* Add intrinsics __builtin_arm_qadd & __builtin_arm_qsub to allow access to | Nate Begeman | 2010-07-29 | 2 | -9/+14
  the QADD & QSUB instructions. Behave identically to __qadd & __qsub
  RealView instruction intrinsics.
  llvm-svn: 109770
* Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. | Jakob Stoklund Olesen | 2010-07-29 | 2 | -6/+1
  We do sometimes load from a too small stack slot when dealing with x86
  arguments (varargs and smaller-than-32-bit args). It looks like we know
  what we are doing in those cases, so I am going to remove the assert
  instead of artificially enlarging stack slot sizes.

  The assert in storeRegToStackSlot stays in. We don't want to write
  beyond the bounds of a stack slot.
  llvm-svn: 109764
* ARM mode version of r109693. Remove incorrect substitution pattern for | Jim Grosbach | 2010-07-28 | 1 | -2/+6
  UXTB16. It wrongly assumed the input shift was actually a rotate.
  rdar://8240138
  llvm-svn: 109696
* Remove incorrect substitution pattern for UXTB16. It wrongly assumed the | Jim Grosbach | 2010-07-28 | 1 | -2/+6
  input shift was actually a rotate. rdar://8240138
  llvm-svn: 109693
* Remove dead prototype | Jim Grosbach | 2010-07-28 | 1 | -1/+0
  llvm-svn: 109691
* Create a fixed stack object for varargs that is as large as any register. | Jakob Stoklund Olesen | 2010-07-28 | 1 | -1/+4
  The size of this object isn't used for anything - technically it is of
  variable size. This avoids a false positive from the assert in
  X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.
  llvm-svn: 109652
* Fix this code to avoid decrementing an iterator past the beginning | Dan Gohman | 2010-07-28 | 1 | -5/+2
  of a std::vector.
  llvm-svn: 109597
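  The class of bug named in r109597 is easy to reproduce in a backward
  scan. The log does not show the patched code, so the following is a
  generic sketch with a hypothetical helper, lastIndexOf: the iterator is
  compared against begin() before it is decremented, so it never moves
  before the first element, even for an empty vector.

```cpp
#include <vector>

// Hypothetical helper: index of the last element equal to `target`, or -1.
inline int lastIndexOf(const std::vector<int> &values, int target) {
  for (auto it = values.end(); it != values.begin();) {
    --it; // safe: the loop condition has just established it != begin()
    if (*it == target)
      return static_cast<int>(it - values.begin());
  }
  return -1;
}
```

  Decrementing first and testing afterwards would step before begin() on
  the final iteration, which is undefined behavior for std::vector
  iterators.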
* Do GEP offset calculations with unsigned math rather than signed math | Dan Gohman | 2010-07-28 | 1 | -1/+1
  to avoid undefined behavior on overflow, noticed by John Regehr.
  llvm-svn: 109594
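  The idea behind r109594 can be sketched as follows. The helper name
  addScaledOffset and its parameters are hypothetical, not taken from the
  patch: the arithmetic is carried out in uint64_t, where overflow wraps
  modulo 2^64 and is well defined, and the result is converted back to a
  signed offset at the end.

```cpp
#include <cstdint>

// Hypothetical helper: base + index * elementSize, computed with wrapping
// unsigned arithmetic instead of signed arithmetic.
inline std::int64_t addScaledOffset(std::int64_t base, std::int64_t index,
                                    std::uint64_t elementSize) {
  std::uint64_t sum = static_cast<std::uint64_t>(base) +
                      static_cast<std::uint64_t>(index) * elementSize;
  // Converting an out-of-range value back to int64_t is modular since C++20
  // and implementation-defined (wrapping on mainstream compilers) before that.
  return static_cast<std::int64_t>(sum);
}
```

  Negative indices still work because two's-complement wraparound gives
  the same low 64 bits as the signed computation would.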
* Implement a vectorized algorithm for <16 x i8> << <16 x i8> | Nate Begeman | 2010-07-28 | 1 | -21/+73
  This is about 4x faster and smaller than the existing scalarization.
  llvm-svn: 109566
* ~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller | Nate Begeman | 2010-07-27 | 2 | -0/+34
  types coming in future patches.

  For:

  define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
  entry:
    %shl = shl <4 x i32> %r, %a                  ; <<4 x i32>> [#uses=1]
    %tmp2 = bitcast <4 x i32> %shl to <2 x i64>  ; <<2 x i64>> [#uses=1]
    ret <2 x i64> %tmp2
  }

  We get:

  _shl:                                   ## @shl
    pslld $23, %xmm1
    paddd LCPI0_0, %xmm1
    cvttps2dq %xmm1, %xmm1
    pmulld %xmm1, %xmm0
    ret

  Instead of:

  _shl:                                   ## @shl
    pshufd $3, %xmm0, %xmm2
    movd %xmm2, %eax
    pshufd $3, %xmm1, %xmm2
    movd %xmm2, %ecx
    shll %cl, %eax
    movd %eax, %xmm2
    pshufd $1, %xmm0, %xmm3
    movd %xmm3, %eax
    pshufd $1, %xmm1, %xmm3
    movd %xmm3, %ecx
    shll %cl, %eax
    movd %eax, %xmm3
    punpckldq %xmm2, %xmm3
    movd %xmm0, %eax
    movd %xmm1, %ecx
    shll %cl, %eax
    movd %eax, %xmm2
    movhlps %xmm0, %xmm0
    movd %xmm0, %eax
    movhlps %xmm1, %xmm1
    movd %xmm1, %ecx
    shll %cl, %eax
    movd %eax, %xmm0
    punpckldq %xmm0, %xmm2
    movdqa %xmm2, %xmm0
    punpckldq %xmm3, %xmm0
    ret
  llvm-svn: 109549
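  The short sequence in r109549 replaces a per-lane shift with a per-lane
  multiply by a power of two. Below is a scalar sketch of that idea. It
  assumes the constant LCPI0_0, whose value the log does not show, is the
  bit pattern of 1.0f (0x3f800000) in every lane, that float is an IEEE-754
  single, and that shift amounts are at most 31. The names
  powerOfTwoViaFloat and shlViaMultiply are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Vec4 = std::array<std::uint32_t, 4>;

// Builds 2^amount by placing (amount + 127) in the exponent field of a float
// (the pslld $23 / paddd steps) and converting it to an integer (cvttps2dq).
inline std::uint32_t powerOfTwoViaFloat(std::uint32_t amount) {
  assert(amount <= 31 && "sketch only models in-range shift amounts");
  const std::uint32_t oneBits = 0x3f800000u; // assumed value of LCPI0_0 per lane
  std::uint32_t bits = (amount << 23) + oneBits;
  float asFloat;
  std::memcpy(&asFloat, &bits, sizeof asFloat);
  // Convert through uint32_t so that 2^31 stays in range in C++.
  return static_cast<std::uint32_t>(asFloat);
}

// Per-lane value << amount, computed as value * 2^amount (the pmulld step).
inline Vec4 shlViaMultiply(const Vec4 &value, const Vec4 &amount) {
  Vec4 result{};
  for (std::size_t lane = 0; lane < 4; ++lane)
    result[lane] = value[lane] * powerOfTwoViaFloat(amount[lane]);
  return result;
}
```

  Multiplication of uint32_t wraps modulo 2^32, which matches the low 32
  bits that pmulld keeps, so each lane equals the shifted value.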
* Make MC use Windows COFF on Windows and add tests. | Michael J. Spencer | 2010-07-27 | 2 | -0/+23
  llvm-svn: 109494
* The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting | Jakob Stoklund Olesen | 2010-07-27 | 1 | -2/+3
  subregister operands like this:

    %reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)

  Make them return false when subreg operands are present.
  VirtRegRewriter is making bad assumptions otherwise.

  This fixes PR7713.
  llvm-svn: 109489
* Add assertions that expose the PR7713 miscompilation: Accessing a stack slot | Jakob Stoklund Olesen | 2010-07-27 | 1 | -0/+4
  with a too-big register class.
  llvm-svn: 109488
* And a bit more non-ASCII stuff. | Eli Friedman | 2010-07-26 | 1 | -1/+1
  llvm-svn: 109458
* Drop some non-ascii stuff | Anton Korobeynikov | 2010-07-26 | 1 | -7/+7
  llvm-svn: 109456
* On x86, f32 / f64 nodes share the same registers as 128-bit vector values. | Evan Cheng | 2010-07-26 | 2 | -0/+30
  llvm-svn: 109450
* Add a note | Anton Korobeynikov | 2010-07-26 | 1 | -0/+21
  llvm-svn: 109448
* Temporary hack to let codegen assert or generate poor code in case | Bruno Cardoso Lopes | 2010-07-26 | 1 | -8/+13
  we are using AVX and no AVX version of the desired instruction is
  present. This is better for incremental dev (without fallbacks it's
  easier to spot what's missing). Not sure this is the best hack, though
  (we can also disable all HasSSE* predicates by dynamically marking them
  'false' if AVX is present).
  llvm-svn: 109434
* Currently EH lowering code expects typeinfo to be global only. | Anton Korobeynikov | 2010-07-26 | 1 | -2/+11
  This assumption is not satisfied due to global merging. Work around the
  issue by temporarily disabling merging of const globals. Also, ignore
  LLVM "special" globals. This fixes PR7716.
  llvm-svn: 109423
* ARM fastisel isn't ready. | Evan Cheng | 2010-07-26 | 1 | -1/+2
  llvm-svn: 109421
* Remove extraneous semicolon | Douglas Gregor | 2010-07-25 | 1 | -1/+1
  llvm-svn: 109373
* Unbreak CMake build | Douglas Gregor | 2010-07-25 | 1 | -0/+1
  llvm-svn: 109372
* Hook in GlobalMerge pass | Anton Korobeynikov | 2010-07-24 | 6 | -1/+222
  llvm-svn: 109359
* Add an ILP scheduler. This is a register pressure aware scheduler that's | Evan Cheng | 2010-07-24 | 2 | -0/+21
  appropriate for targets without detailed instruction itineraries. The
  scheduler schedules for increased instruction level parallelism in low
  register pressure situations; it schedules to reduce register pressure
  when the register pressure becomes high.

  On x86_64, this is a win for all tests in CFP2000. It also sped up
  256.bzip2 by 16%.
  llvm-svn: 109300
* Support x86 "eiz" and "riz" pseudo index registers in the assembler. | Bruno Cardoso Lopes | 2010-07-24 | 3 | -1/+25
  llvm-svn: 109295
* Use the appropriate register class for an i32 when adding ARM::LR to the | Jim Grosbach | 2010-07-23 | 1 | -1/+1
  function live in set. This will give us tGPR for Thumb1 and GPR
  otherwise, so the copy will be spillable. rdar://8224931
  llvm-svn: 109293
* Revert 109076. It is wrong and was causing regressions. Add some | Dale Johannesen | 2010-07-23 | 1 | -18/+48
  comments explaining why it was wrong. 8225024.

  Fix the real problem in 8213383: the code that splits very large blocks
  when no other place to put constants can be found was not considering
  the case that the block contained a Thumb tablejump.
  llvm-svn: 109282
* - Allow target to specify when register pressure is "too high". In most cases, | Evan Cheng | 2010-07-23 | 2 | -0/+24
    it's too late to start backing off aggressive latency scheduling when
    most of the registers are in use, so the threshold should be a bit
    tighter.
  - Correctly handle live-outs, extract_subreg, etc.
  - Enable register pressure aware scheduling by default for the hybrid
    scheduler. For ARM, this is almost always a win on # of instructions.
    It's runtime neutral for most of the tests. But for some kernels with
    high register pressure it can be a huge win, e.g. 464.h264ref reduced
    number of spills by 54 and sped up by 20%.
  llvm-svn: 109279
* Remove trailing whitespace | Bruno Cardoso Lopes | 2010-07-23 | 1 | -30/+30
  llvm-svn: 109276
* Add AVX version of CLMUL instructions | Bruno Cardoso Lopes | 2010-07-23 | 3 | -0/+58
  llvm-svn: 109248
* fix constness warnings | Gabor Greif | 2010-07-23 | 2 | -2/+4
  llvm-svn: 109224
* do not (implicitly) dereference iterator many times, cache it instead | Gabor Greif | 2010-07-23 | 1 | -2/+3
  llvm-svn: 109222
* Declare CLMUL as a subtarget feature | Bruno Cardoso Lopes | 2010-07-23 | 1 | -0/+2
  llvm-svn: 109207
* Add x86 CLMUL (Carry-less multiplication) cpu feature | Bruno Cardoso Lopes | 2010-07-23 | 3 | -3/+10
  llvm-svn: 109206