path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Implement AArch64 post-index vector load/store multiple N-element structure class SIMD(lselem-post). | Hao Liu | 2013-11-05 | 7 | -14/+708
  Including the following 14 instructions:
  4 ld1 insts: post-index load multiple 1-element structure to sequential 1/2/3/4 registers.
  ld2/ld3/ld4: post-index load multiple N-element structure to sequential N registers (N=2,3,4).
  4 st1 insts: post-index store multiple 1-element structure from sequential 1/2/3/4 registers.
  st2/st3/st4: post-index store multiple N-element structure from sequential N registers (N=2,3,4).
  llvm-svn: 194043
* Implemented aarch64 neon intrinsic vcopy_lane with float type. | Kevin Qin | 2013-11-05 | 2 | -88/+82
  llvm-svn: 194041
* Revert r194019 to r194021, "Submit the basic port of the rest of ARM constant islands code to Mips." | NAKAMURA Takumi | 2013-11-04 | 2 | -1499/+12
  It broke -Asserts build.
  llvm-svn: 194026
* ARM: remove unnecessary state-tracking during frame lowering. | Tim Northover | 2013-11-04 | 6 | -115/+58
  ResolveFrameIndex had what appeared to be a very nasty hack for when the frame-index referred to a callee-saved register. In this case it "adjusted" the offset so that the address was correct if (and only if) the MachineInstr immediately followed the respective push.
  This "worked" for all forms of GPR & DPR but was only ever used to set the frame pointer itself, and once this was put in a more sensible location the entire state-tracking machinery it relied on became redundant. So I stripped it.
  The only wrinkle is that "add r7, sp, #0" might theoretically be slower (it needs an actual ALU slot) compared to "mov r7, sp", so I added a micro-optimisation that makes emitARMRegUpdate and emitT2RegUpdate also work when NumBytes == 0.
  No test changes since there shouldn't be any functionality change.
  llvm-svn: 194025
* AArch64: use default asm operand printing when modifier inapplicable | Tim Northover | 2013-11-04 | 1 | -47/+59
  If an inline assembly operand has multiple constraints (e.g. "Ir" for immediate or register) and an operand modifier (e.g. "w" for "print register as wN") then we need to decide behaviour when the modifier doesn't apply to the constraint.
  Previously this produced some combination of an assertion failure and a fatal error. GCC's behaviour appears to be to ignore the modifier and print the operand in the default way. This patch should implement that.
  llvm-svn: 194024
* Make sure we don't get a warning from this variable that is only used when compiling with DEBUG. | Reed Kotler | 2013-11-04 | 1 | -0/+1
  llvm-svn: 194021
* Submit the basic port of the rest of ARM constant islands code to Mips. | Reed Kotler | 2013-11-04 | 2 | -12/+1498
  Two test cases are added which reflect the next level of functionality: constants getting moved to water areas that are out of range from the initial placement at the end of the function, and basic blocks being split to create water when none exists that can be used.
  There is a bunch of this code that is not complete and has been marked with IN_PROGRESS. I will finish cleaning this all up during the next week or two and submit the rest of the test cases.
  I have eliminated some code for dealing with inline assembly because to me it unnecessarily complicates things, and some of the newer features of llvm like function attributes and builtin assembler give me better tools to solve the alignment issues created there. Also, for Mips16 I even have the option of not doing constant islands in the presence of inline assembler if I choose.
  llvm-svn: 194019
* Check for both styles of clobbers, those produced by dragonegg and those produced by clang for the inline asm bswap conversion. | Eric Christopher | 2013-11-04 | 1 | -11/+19
  Modified from a patch by Chris Smowton.
  llvm-svn: 194016
* Add support for AVX512 masked vector blend intrinsics. | Cameron McInally | 2013-11-04 | 1 | -15/+39
  llvm-svn: 194006
* Support for microMIPS branch instructions. | Zoran Jovanovic | 2013-11-04 | 11 | -26/+158
  llvm-svn: 193992
* X86: Add a description for AMD bdver3 aka Steamroller. | Benjamin Kramer | 2013-11-04 | 1 | -0/+8
  This is just bdver2 + FSGSBase.
  llvm-svn: 193984
* AVX-512: added VPCONFLICT instruction and intrinsics, added EVEX_KZ to tablegen | Elena Demikhovsky | 2013-11-03 | 2 | -1/+119
  llvm-svn: 193959
* [SparcV9] Handle i64 <-> float conversions in sparcv9 mode. | Venkatraman Govindaraju | 2013-11-03 | 4 | -28/+184
  llvm-svn: 193957
* [Sparc] Expand FP_TO_UINT, UINT_TO_FP for fp128. | Venkatraman Govindaraju | 2013-11-03 | 1 | -3/+42
  llvm-svn: 193947
* Convert calls to __sinpi and __cospi into __sincospi_stret | Bob Wilson | 2013-11-03 | 1 | -0/+33
  This adds a SimplifyLibCalls case which converts the special __sinpi and __cospi (float & double variants) into a __sincospi_stret where appropriate to remove duplicated work.
  Patch by Tim Northover
  llvm-svn: 193943
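  The idea of merging a sin/cos pair on the same argument can be illustrated with a small Python sketch. This is not the SimplifyLibCalls code: the list-of-tuples "IR", the function combine_sinpi_cospi, and the extract_sin/extract_cos markers are all hypothetical names invented for this illustration.

```python
def combine_sinpi_cospi(calls):
    """Rewrite a list of (callee, argument) pairs.

    When both __sinpi and __cospi are called on the same argument, emit one
    combined call followed by markers that stand for reading each half of
    its result. Calls without a partner are left untouched.
    """
    sin_args = {arg for name, arg in calls if name == "__sinpi"}
    cos_args = {arg for name, arg in calls if name == "__cospi"}
    shared = sin_args & cos_args  # arguments that have both a sin and a cos

    out, emitted = [], set()
    for name, arg in calls:
        if name in ("__sinpi", "__cospi") and arg in shared:
            if arg not in emitted:
                # The combined call is emitted once per shared argument.
                out.append(("__sincospi_stret", arg))
                emitted.add(arg)
            # Hypothetical marker for using one half of the combined result.
            out.append(("extract_sin" if name == "__sinpi" else "extract_cos", arg))
        else:
            out.append((name, arg))
    return out
```

  In this toy model, the lone __sinpi on a second argument stays as it is, which mirrors the "where appropriate" wording of the commit message.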
* Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+. | Bob Wilson | 2013-11-03 | 4 | -0/+87
  rdar://12856873
  Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller
  llvm-svn: 193942
* [SparcV9] Add ctpop instruction for i64. Also, expand ctlz, cttz and bswap. | Venkatraman Govindaraju | 2013-11-03 | 2 | -0/+9
  llvm-svn: 193941
* Fix PR17764 | Michael Liao | 2013-11-02 | 1 | -1/+1
  - When selecting BLEND from vselect, the operands need swapping due to the difference between vselect and SSE/AVX's BLEND insn
  llvm-svn: 193900
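  The operand-order difference can be sketched in Python. This is an illustration only, assuming that a BLEND-style operation takes a lane from its second source where the mask lane is set; the helper names vselect and blend are made up here and model lists of lanes, not real instructions.

```python
def vselect(cond, a, b):
    """Lane-wise select: take a where the condition lane is true, else b."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


def blend(src1, src2, mask):
    """Assumed BLEND-style semantics: take src2 where the mask lane is set,
    otherwise keep src1."""
    return [y if m else x for m, x, y in zip(mask, src1, src2)]


cond = [True, False, True, False]
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Passing the operands in vselect order gives the wrong lanes;
# swapping them reproduces vselect.
same_order = blend(a, b, cond)
swapped = blend(b, a, cond)
```

  Under that assumption, swapped equals vselect(cond, a, b) while same_order does not, which is the kind of mismatch the fix addresses.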
* Use isa<> instead of dyn_cast<> with unused value | Matt Arsenault | 2013-11-01 | 1 | -3/+3
  llvm-svn: 193869
* [AArch64] Simplify a few of the instruction patterns. No functional change intended. | Chad Rosier | 2013-11-01 | 1 | -109/+60
  llvm-svn: 193867
* [AArch64] Fix assembly string formatting and other coding standard violations. | Chad Rosier | 2013-11-01 | 1 | -190/+118
  llvm-svn: 193866
* Remove linkonce_odr_auto_hide. | Rafael Espindola | 2013-11-01 | 1 | -2/+0
  linkonce_odr_auto_hide was an incomplete attempt to implement a way for the linker to hide symbols that are known to be available in every TU and whose addresses are not relevant for a particular DSO.
  It was redundant in that all its uses are equivalent to linkonce_odr+unnamed_addr. Unlike those, it has never been connected to clang or llvm's optimizers, so it was effectively dead.
  Given that nothing produces it, this patch just nukes it (other than the llvm-c enum value).
  llvm-svn: 193865
* [ARM] Add Virtualization subtarget feature and more build attributes in this area | Bradley Smith | 2013-11-01 | 5 | -5/+39
  Add a Virtualization ARM subtarget feature along with adding proper build attribute emission for Tag_Virtualization_use (encodes Virtualization and TrustZone) and Tag_MPextension_use.
  Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to something that is more maintainable. This changes the focus of this testcase away from testing CPU defaults (which is tested elsewhere), onto specifically testing that attributes are encoded correctly.
  llvm-svn: 193859
* [ARM] Fix Tag_ABI_HardFP_use build attribute | Bradley Smith | 2013-11-01 | 2 | -5/+13
  Fix Tag_ABI_HardFP_use build attribute to handle single precision FP, replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add some tests for Tag_ABI_VFP_args.
  llvm-svn: 193856
* Fix unused variable warnings. | Dan Gohman | 2013-10-31 | 1 | -0/+3
  llvm-svn: 193823
* [AArch64] Add support for NEON scalar fixed-point convert to floating-point instructions. | Chad Rosier | 2013-10-31 | 1 | -0/+35
  llvm-svn: 193816
* Add new calling convention for WebKit JavaScript. | Andrew Trick | 2013-10-31 | 1 | -0/+22
  llvm-svn: 193812
* Add support for stack map generation in the X86 backend. | Andrew Trick | 2013-10-31 | 4 | -5/+178
  Originally implemented by Lang Hames.
  llvm-svn: 193811
* Use StringRef::startswith_lower. No functionality change. | Rui Ueyama | 2013-10-31 | 1 | -4/+4
  llvm-svn: 193796
* [AArch64] Add support for NEON scalar shift immediate instructions. | Chad Rosier | 2013-10-31 | 5 | -1/+404
  llvm-svn: 193790
* SparcV9 doesn't have rem instruction either. | Roman Divacky | 2013-10-31 | 1 | -0/+8
  llvm-svn: 193789
* whitespace | Andrew Trick | 2013-10-31 | 2 | -10/+10
  llvm-svn: 193765
* Remove another unused flag. | Rafael Espindola | 2013-10-31 | 1 | -1/+0
  llvm-svn: 193756
* Remove unused flag. | Rafael Espindola | 2013-10-31 | 1 | -1/+0
  llvm-svn: 193752
* Add AVX512 unmasked integer broadcast intrinsics and support. | Cameron McInally | 2013-10-31 | 1 | -0/+10
  llvm-svn: 193748
* AVX-512: Implemented CMOV for 512-bit vectors | Elena Demikhovsky | 2013-10-31 | 2 | -2/+24
  llvm-svn: 193747
* [SystemZ] Automatically detect zEC12 and z196 hosts | Richard Sandiford | 2013-10-31 | 2 | -3/+9
  As on other hosts, the CPU identification instruction is privileged, so we need to look through /proc/cpuinfo. I copied the PowerPC way of handling "generic".
  Several tests were implicitly assuming z10 and so failed on z196.
  llvm-svn: 193742
* [AArch64] Make the use of FP instructions optional, but enabled by default. | Amara Emerson | 2013-10-31 | 6 | -28/+90
  This adds a new subtarget feature called FPARMv8 (implied by NEON), and predicates the support of the FP instructions and registers on this feature.
  llvm-svn: 193739
* Legalize: Improve legalization of long vector extends. | Jim Grosbach | 2013-10-31 | 1 | -55/+0
  When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of:
    define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
      %tmp = zext <16 x i8> %a to <16 x i32>
      store <16 x i32> %tmp, <16 x i32>*%p
      ret void
    }
  Generates:
        .section __TEXT,__text,regular,pure_instructions
        .section __TEXT,__const
        .align 5
    LCPI0_0:
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .long 255 ## 0xff
        .section __TEXT,__text,regular,pure_instructions
        .globl _bar
        .align 4, 0x90
    _bar:
        vpunpckhbw %xmm0, %xmm0, %xmm1
        vpunpckhwd %xmm0, %xmm1, %xmm2
        vpmovzxwd %xmm1, %xmm1
        vinsertf128 $1, %xmm2, %ymm1, %ymm1
        vmovaps LCPI0_0(%rip), %ymm2
        vandps %ymm2, %ymm1, %ymm1
        vpmovzxbw %xmm0, %xmm3
        vpunpckhwd %xmm0, %xmm3, %xmm3
        vpmovzxbd %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vandps %ymm2, %ymm0, %ymm0
        vmovaps %ymm0, (%rdi)
        vmovaps %ymm1, 32(%rdi)
        vzeroupper
        ret
  So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if
  - the number of vector elements is even,
  - the source type is legal,
  - the type of a split source is illegal,
  - the type of an extended (by doubling element size) source is legal, and
  - the type of that extended source when split is legal.
  If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal."
  With this change, the above example now compiles to:
    _bar:
        vpxor %xmm1, %xmm1, %xmm1
        vpunpcklbw %xmm1, %xmm0, %xmm2
        vpunpckhwd %xmm1, %xmm2, %xmm3
        vpunpcklwd %xmm1, %xmm2, %xmm2
        vinsertf128 $1, %xmm3, %ymm2, %ymm2
        vpunpckhbw %xmm1, %xmm0, %xmm0
        vpunpckhwd %xmm1, %xmm0, %xmm3
        vpunpcklwd %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vmovaps %ymm0, 32(%rdi)
        vmovaps %ymm2, (%rdi)
        vzeroupper
        ret
  This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain.
  rdar://14735100
  llvm-svn: 193727
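  The five conditions listed in this commit message can be written down as a single predicate. The Python below is a sketch under stated assumptions, not the legalizer's code: vector types are modelled as (element count, element bits) tuples, TOY_LEGAL is a hypothetical legality set that does not describe any real target, and should_extend_one_step is a name invented for this example.

```python
# Hypothetical toy set of legal vector types as (num_elements, element_bits).
# It is NOT a real target's type table.
TOY_LEGAL = {
    (16, 8), (8, 16), (4, 32), (2, 64),    # 128-bit vectors
    (32, 8), (16, 16), (8, 32), (4, 64),   # 256-bit vectors
}


def should_extend_one_step(src, dst_elt_bits, legal=TOY_LEGAL):
    """Return True when the extend should first go up a single step
    (doubling the element width) instead of splitting source and destination."""
    num_elts, src_bits = src
    # Only extends that more than double the element size are of interest.
    if dst_elt_bits <= 2 * src_bits:
        return False
    return (
        num_elts % 2 == 0                                # element count is even
        and src in legal                                 # source type is legal
        and (num_elts // 2, src_bits) not in legal       # split source is illegal
        and (num_elts, 2 * src_bits) in legal            # one-step extended source is legal
        and (num_elts // 2, 2 * src_bits) in legal       # and stays legal when split
    )
```

  With this toy set, the v16i8 to v16i32 case from the message satisfies the predicate: v8i8 is absent from the set, while v16i16 and v8i16 are present.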
* Fix a few typos | Matt Arsenault | 2013-10-30 | 1 | -7/+7
  llvm-svn: 193723
* This commit adds some (but not all) of the x86-64 relocations that are not currently supported in the ELF object writer, along with a simple test case. | Tom Roeder | 2013-10-30 | 1 | -0/+6
  llvm-svn: 193709
* [ARM] NEON instructions were erroneously decoded from certain invalid encodings | Artyom Skrobov | 2013-10-30 | 1 | -20/+20
  llvm-svn: 193705
* R600: Custom lower f32 = uint_to_fp i64 | Tom Stellard | 2013-10-30 | 2 | -0/+23
  llvm-svn: 193701
* Add #include of raw_ostream.h to MipsSEISelLowering.cpp | Hans Wennborg | 2013-10-30 | 1 | -0/+1
  Fixing this Windows build error:
  ..\lib\Target\Mips\MipsSEISelLowering.cpp(997) : error C2027: use of undefined type 'llvm::raw_ostream'
  llvm-svn: 193696
* [mips][msa] Correct definition of bins[lr] and CHECK-DAG-ize related tests | Daniel Sanders | 2013-10-30 | 1 | -8/+29
  llvm-svn: 193695
* [mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics) | Daniel Sanders | 2013-10-30 | 3 | -10/+164
  Also corrected the definition of the intrinsics for these instructions (the result register is also the first operand), and added intrinsics for bsel and bseli to clang (they already existed in the backend).
  These four operations are mostly equivalent to bsel and bseli (the difference is which operand is tied to the result). As a result some of the tests changed as described below.
  bitwise.ll:
  - bsel.v test adapted so that the mask is unknown at compile-time. This stops it emitting bmnzi.b instead of the intended bsel.v.
  - The bseli.b test now tests the right thing. Namely the case when one of the values is a uimm8, rather than when the condition is a uimm8 (which is covered by bmnzi.b)
  compare.ll:
  - bsel.v tests now (correctly) emit bmnz.v instead of bsel.v because this is the same operation (see MSA.txt).
  i8.ll:
  - CHECK-DAG-ized test.
  - bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands because this is the same operation (see MSA.txt).
  - bseli.b still emits bseli.b though because the immediate makes it distinguishable from bmnzi.b.
  vec.ll:
  - CHECK-DAG-ized test.
  - bmz.v tests now (correctly) emit bmnz.v with swapped operands (see MSA.txt).
  - bsel.v tests now (correctly) emit bmnz.v with swapped operands (see MSA.txt).
  llvm-svn: 193693
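  The claim that these operations differ only in which operand is tied to the result can be checked on a small bit-level model. The Python below is a sketch that assumes the usual bitwise definitions of the three vector forms (the MSA.txt file cited in the message is not reproduced here); each function models one 8-bit lane, and wd, ws, wt are plain integers standing in for the destination and source registers.

```python
FULL = 0xFF  # model a single 8-bit lane; the width is an assumption for illustration


def bmnz(wd, ws, wt):
    """Assumed semantics: bits of ws replace bits of wd where wt is 1."""
    return ((ws & wt) | (wd & ~wt)) & FULL


def bmz(wd, ws, wt):
    """Assumed semantics: bits of ws replace bits of wd where wt is 0."""
    return ((ws & ~wt) | (wd & wt)) & FULL


def bsel(wd, ws, wt):
    """Assumed semantics: wd holds the mask; take wt where it is 1, ws where it is 0."""
    return ((ws & ~wd) | (wt & wd)) & FULL
```

  Under these assumptions, bmz(d, s, t) equals bmnz(s, d, t), and bsel(m, x, y) equals bmnz(x, y, m), so the same selection can be expressed with any of the three once the operands are reordered.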
* [AArch64] Add support for NEON scalar floating-point compare instructions. | Chad Rosier | 2013-10-30 | 3 | -5/+102
  llvm-svn: 193691
* [mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) | Daniel Sanders | 2013-10-30 | 7 | -27/+163
  This required correcting the definition of the bins[lr]i intrinsics because the result is also the first operand.
  It also required removing the (arbitrary) check for 32-bit immediates in MipsSEDAGToDAGISel::selectVSplat().
  Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d because the constant is legalized into a ConstantPool. Similar things can happen with binsri.d with more than 10 bits set in the mask. The resulting code when this happens is correct but not optimal.
  llvm-svn: 193687
* [mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT | Daniel Sanders | 2013-10-30 | 1 | -0/+108
  (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b), where $mask is a constant splat.
  This allows bitwise operations to make use of bsel.
  It's also a stepping stone towards matching bins[lr], and bins[lr]i from normal IR.
  Two sets of similar tests have been added in this commit. The bsel_* functions test the case where binsri cannot be used. The binsr_*_i functions will start to use the binsri instruction in the next commit.
  llvm-svn: 193682
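  The rewrite rule in this message rests on a bitwise identity, which the following Python sketch checks on small integers. It is not the DAG combine itself: the function names and the 8-bit width are assumptions made for illustration, and the select is modelled bit by bit.

```python
WIDTH = 8  # assumed lane width, for illustration only
FULL = (1 << WIDTH) - 1


def and_or_form(a, b, mask):
    """(or (and a, mask), (and b, inverse_mask)) on WIDTH-bit integers."""
    return (a & mask) | (b & (~mask & FULL))


def bitwise_select(mask, a, b):
    """Per-bit select: take the bit from a where mask is 1, else from b."""
    out = 0
    for i in range(WIDTH):
        bit = 1 << i
        out |= (a & bit) if (mask & bit) else (b & bit)
    return out
```

  Both functions give the same result for every mask, which is why the AND/OR pattern can be replaced by a single select once the two masks are known to be inverses of each other.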
* [mips] MipsSETargetLowering now reports DAGCombiner changes when using -debug-only=mips-isel | Daniel Sanders | 2013-10-30 | 1 | -1/+9
  No test since -debug output is intended for developers and not end-users.
  llvm-svn: 193681