path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [mips] Simplify LowerFormalArguments using getRegClassFor. | Akira Hatanaka | 2013-10-28 | 1 | -15/+2
| | | | | | No functionality change. llvm-svn: 193540
* Return early from getUnconditionalBranchTargetOpValue if the branch target is | Lang Hames | 2013-10-28 | 1 | -1/+1
| | | | | | | | | | | | | | | | | an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. llvm-svn: 193535
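The remark that zero is "encoded with a non-zero bit pattern" can be illustrated with a minimal sketch. It assumes the branch in question uses the Thumb-2 B.W (encoding T4) immediate layout, in which the offset is SignExtend(S:I1:I2:imm10:imm11:'0') and the stored bits are J1 = NOT(I1 XOR S) and J2 = NOT(I2 XOR S). The struct and function names are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>

// Fields of a Thumb-2 B.W (encoding T4) branch immediate.
// Hypothetical helper type for illustration only.
struct BranchFields {
  uint32_t S, J1, J2, Imm10, Imm11;
};

// Split a signed, halfword-aligned byte offset into the T4 fields.
// Assumed layout: offset = SignExtend(S:I1:I2:imm10:imm11:'0'),
// with J1 = NOT(I1 XOR S) and J2 = NOT(I2 XOR S).
inline BranchFields encodeThumb2Branch(int32_t Offset) {
  assert((Offset & 1) == 0 && "branch offsets are halfword aligned");
  assert(Offset >= -(1 << 24) && Offset < (1 << 24) && "offset out of range");
  uint32_t U = static_cast<uint32_t>(Offset);
  uint32_t S = (U >> 24) & 1;
  uint32_t I1 = (U >> 23) & 1;
  uint32_t I2 = (U >> 22) & 1;
  BranchFields F;
  F.S = S;
  F.J1 = (~(I1 ^ S)) & 1; // inverted relative to I1 when S is clear
  F.J2 = (~(I2 ^ S)) & 1; // inverted relative to I2 when S is clear
  F.Imm10 = (U >> 12) & 0x3ff;
  F.Imm11 = (U >> 1) & 0x7ff;
  return F;
}
```

Under these assumptions, an offset of zero yields J1 = J2 = 1 even though every other field is zero. J1 and J2 stand for the two highest offset bits below the sign, which is consistent with the observation that only very large immediates were affected.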
* [arm] Implement eabi_attribute, cpu, and fpu directives. | Logan Chien | 2013-10-28 | 7 | -265/+514
| | | | | | | | | | | | | | | | | | | | | | | | | | This commit allows the ARM integrated assembler to parse and assemble the code with .eabi_attribute, .cpu, and .fpu directives. To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added. Besides, this commit also changes the Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of the .fpu directive. This commit changes the test cases: * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already covers the functionality. * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has been changed from 1 to 2, which is more precise. llvm-svn: 193524
* [SystemZ] Set useAA to true | Richard Sandiford | 2013-10-28 | 1 | -0/+3
| | | | | | | | | | | | | | | | useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here. The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that that optimization works top down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store). I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test. llvm-svn: 193521
* Prune utf8 chars in comments. | NAKAMURA Takumi | 2013-10-28 | 2 | -5/+5
| | | | llvm-svn: 193512
* Prune trailing linefeeds. | NAKAMURA Takumi | 2013-10-28 | 2 | -2/+0
| | | | llvm-svn: 193511
* Target/R600: Un-tab-ify. | NAKAMURA Takumi | 2013-10-28 | 3 | -9/+9
| | | | llvm-svn: 193510
* Make first substantial checkin of my port of ARM constant islands code to Mips. | Reed Kotler | 2013-10-27 | 8 | -12/+284
| | | | | | | | | | | | Before I just ported the shell of the pass. I've tried to keep everything nearly identical to the ARM version. I think it will be very easy to eventually merge these two and create a new more general pass that other targets can use. I have some improvements I would like to make to allow pools to be shared across functions and some other things. When I'm all done we can think about making a more general pass. More to be ported but the basic mechanism works now almost as well as gcc mips16. llvm-svn: 193509
* NVPTX: Remove unused globals. | Benjamin Kramer | 2013-10-27 | 1 | -7/+3
| | | | llvm-svn: 193500
* Hexagon: Remove global state. | Benjamin Kramer | 2013-10-27 | 1 | -10/+25
| | | | llvm-svn: 193499
* AVX-512: PMIN/PMAX intrinsics and patterns | Elena Demikhovsky | 2013-10-27 | 2 | -1/+45
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193497
* [X86][AVX512] Add patterns that match the AVX512 floating point register vbroadcast intrinsics. | Quentin Colombet | 2013-10-25 | 1 | -0/+5
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193422
* [X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast intrinsics. | Quentin Colombet | 2013-10-25 | 1 | -0/+5
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193421
* ARM: allow .thumb_func to be separated from symbol definition | Tim Northover | 2013-10-25 | 1 | -17/+18
| | | | | | | | | | When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403
* ARM: don't expand atomicrmw inline on Cortex-M0 | Tim Northover | 2013-10-25 | 2 | -9/+13
| | | | | | | | | | There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399
* Optimize concat_vectors(X, undef) -> scalar_to_vector(X). | Nadav Rotem | 2013-10-25 | 2 | -47/+5
| | | | | | | This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393
* ARM: Tweak usage of '*vfp' compiler_rt functions. | Jim Grosbach | 2013-10-24 | 1 | -1/+2
| | | | | | | | | Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381
* Remove class abstraction from ARM struct byval lowering | David Peixotto | 2013-10-24 | 1 | -553/+262
| | | | | | | | | | | This commit changes the struct byval lowering for arm to use inline checks for the subtarget instead of a class abstraction to represent the differences. The class abstraction was judged to be too much code for this task. No intended functionality change. llvm-svn: 193357
* ARM: Mark double-precision instructions as such | Tim Northover | 2013-10-24 | 3 | -45/+66
| | | | | | | | | | | | This prevents us from silently accepting invalid instructions on (for example) Cortex-M4 with just single-precision VFP support. No tests for the extra Pat Requires because they're essentially assertions: the affected code should have been lowered to libcalls before ISel. rdar://problem/15302004 llvm-svn: 193354
* ARM: add a couple more NEON predicates. | Tim Northover | 2013-10-24 | 1 | -4/+4
| | | | | | | | The fused multiply instructions were added in VFPv4 but are still NEON instructions, in particular they shouldn't be available on a Cortex-M4 no matter how floaty it is. llvm-svn: 193342
* ARM: mark various aliases with their architecture requirements. | Tim Northover | 2013-10-24 | 2 | -8/+12
| | | | | | | | | | If an alias inherits directly from InstAlias then it doesn't get any default "Requires" values, so llvm-mc will allow it even on architectures that don't support the underlying instruction. This tidies up the obvious VFP and NEON cases I found. llvm-svn: 193340
* ARM: Use non-VFP softcalls on embedded Darwinish targets | Tim Northover | 2013-10-24 | 1 | -1/+1
| | | | | | | | | | | | | The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1 code to make use of VFP instructions by switching back to ARM mode, they make no sense for M-class processors which don't even have an ARM mode. Given that justification, in practice this is a platform ABI decision so the actual check is based on that rather than CPU features. rdar://problem/15302004 llvm-svn: 193327
* ARM: fix assert on unpredictable POP instruction. | Tim Northover | 2013-10-24 | 1 | -3/+2
| | | | | | | | | | | POP instructions are aliased to the ARM LDM variants but have different syntax. This caused two problems: we tried to access a non-existent operand to annotate the '!', and the error message didn't make much sense. With some vigorous hand-waving in the error message both problems can be fixed. llvm-svn: 193322
* Make sure SP is always aligned on a 2 byte boundary | Job Noorman | 2013-10-24 | 1 | -2/+2
| | | | llvm-svn: 193320
* [AArch64] Fix NZCV reg live-in bug in F128CSEL codegen. | Amara Emerson | 2013-10-24 | 1 | -2/+6
| | | | | | | | | When generating the IfTrue basic block during the F128CSEL pseudo-instruction handling, the NZCV live-in for the newly created BB wasn't being added. This caused a fault during MI-sched/live range calculation when the predecessor for the fall-through BB didn't have a live-in for phys-reg as expected. llvm-svn: 193316
* AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsics | Elena Demikhovsky | 2013-10-24 | 1 | -0/+36
| | | | llvm-svn: 193312
* (this is a corrected patch) | Yaron Keren | 2013-10-23 | 3 | -2/+4
| | | | | | | | | | | | | | Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk, functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows OS (both Windows target and MingW target) but not Mach-O object format: Looks like macho environment was used to build some EFI code. Credits to Andrew MacPherson. llvm-svn: 193289
* Revert "Calling _chkstk is required on ELF as well as COFF on Windows. | Rafael Espindola | 2013-10-23 | 3 | -4/+2
| | | | | | | | | | Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows." This reverts commit r193263. It is causing CodeGen/X86/mingw-alloca.ll to fail. llvm-svn: 193275
* X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate. | Benjamin Kramer | 2013-10-23 | 3 | -6/+14
| | | | | | Also update the cost model. llvm-svn: 193270
* Calling _chkstk is required on ELF as well as COFF on Windows. | Yaron Keren | 2013-10-23 | 3 | -2/+4
| | | | | | | | | | Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows. Credits to Andrew MacPherson. llvm-svn: 193263
* X86: Custom lower zext v16i8 to v16i16. | Benjamin Kramer | 2013-10-23 | 2 | -19/+8
| | | | | | | | | | | | | | | | | On sandy bridge (PR17654) we now get
    vpxor %xmm1, %xmm1, %xmm1
    vpunpckhbw %xmm1, %xmm0, %xmm2
    vpunpcklbw %xmm1, %xmm0, %xmm0
    vinsertf128 $1, %xmm2, %ymm0, %ymm0
On haswell it's a simple
    vpmovzxbw %xmm0, %ymm0
There is a maze of duplicated and dead transforms and patterns in this area. Remove the dead custom lowering of zext v8i16 to v8i32, that's already handled by LowerAVXExtend. llvm-svn: 193262
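Why the unpack sequence amounts to a zero extension can be seen in a scalar model. The sketch below emulates the byte interleave that punpcklbw and punpckhbw perform, with the second operand fixed to zero, and reassembles the two halves the way vinsertf128 does. The type aliases and function names are hypothetical; this is an illustration of the idea, not the lowering code from the commit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<uint8_t, 16>;
using V16i16 = std::array<uint16_t, 16>;

// Interleave eight bytes of A (starting at Base) with the matching bytes of B,
// modelling punpcklbw (Base = 0) and punpckhbw (Base = 8). Each A/B byte pair
// is returned as one 16-bit lane with the A byte in the low half.
inline std::array<uint16_t, 8> unpackBytes(const V16i8 &A, const V16i8 &B,
                                           std::size_t Base) {
  assert((Base == 0 || Base == 8) && "only the low or high half is valid");
  std::array<uint16_t, 8> Out{};
  for (std::size_t I = 0; I < 8; ++I)
    Out[I] = static_cast<uint16_t>(A[Base + I] | (B[Base + I] << 8));
  return Out;
}

// Zero-extend sixteen bytes to sixteen 16-bit lanes by unpacking against zero.
inline V16i16 zextViaUnpack(const V16i8 &X) {
  const V16i8 Zero{};                                     // models vpxor
  const std::array<uint16_t, 8> Lo = unpackBytes(X, Zero, 0); // vpunpcklbw
  const std::array<uint16_t, 8> Hi = unpackBytes(X, Zero, 8); // vpunpckhbw
  V16i16 Result{};
  for (std::size_t I = 0; I < 8; ++I) {
    Result[I] = Lo[I];     // lower 128 bits
    Result[I + 8] = Hi[I]; // upper 128 bits, as placed by vinsertf128 $1
  }
  return Result;
}
```

Because the interleaved partner byte is always zero, the upper half of every 16-bit lane is zero, which is exactly what a zero extension requires.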
* Fix PR17631 | Michael Liao | 2013-10-23 | 1 | -1/+10
| | | | | | | | | - Skip instructions added in prolog. For specific targets, prolog may insert helper function calls (e.g. _chkstk will be called when there're more than 4K bytes allocated on stack). However, these helpers don't use/def YMM/XMM registers. llvm-svn: 193261
* X86: Make concat_vectors combine a bit more conservative. | Jim Grosbach | 2013-10-23 | 1 | -0/+6
| | | | | | Per Nadav's review comments for r192866. llvm-svn: 193252
* Support for microMIPS relocations 1. | Zoran Jovanovic | 2013-10-23 | 4 | -13/+107
| | | | llvm-svn: 193247
* [mips][msa] Direct Object Emission support for the LSA instruction. | Matheus Almeida | 2013-10-23 | 2 | -8/+21
| | | | llvm-svn: 193240
* [mips][msa] Added support for matching fexp2 from normal IR (i.e. not intrinsics) | Daniel Sanders | 2013-10-23 | 3 | -4/+94
| | | | | | llvm-svn: 193239
* Make ARM hint ranges consistent, and add tests for these ranges | Artyom Skrobov | 2013-10-23 | 4 | -6/+26
| | | | llvm-svn: 193238
* R600/SI: Replace ffs(x) - 1 with countTrailingZeros(x) | Tom Stellard | 2013-10-23 | 1 | -1/+1
| | | | | | ffs(x) broke the mingw buildbot. llvm-svn: 193225
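The substitution relies on ffs(x) - 1 and a count of trailing zeros agreeing for every non-zero x. A minimal sketch of that equivalence follows, using portable stand-ins rather than the POSIX ffs() or LLVM's own helper; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Portable count of trailing zero bits. X must be non-zero in this sketch.
inline unsigned countTrailingZerosPortable(uint32_t X) {
  assert(X != 0 && "undefined for zero in this sketch");
  unsigned N = 0;
  while ((X & 1) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Stand-in for POSIX ffs(): 1-based index of the lowest set bit, 0 for zero.
inline int findFirstSetPortable(uint32_t X) {
  return X == 0 ? 0 : static_cast<int>(countTrailingZerosPortable(X)) + 1;
}
```

For any non-zero input the two differ by exactly one, so replacing ffs(x) - 1 with a trailing-zero count preserves the result while avoiding a function that is not available on every host toolchain. The two conventions diverge only for zero, where ffs() returns 0.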
* R600/SI: fix MIMG writemask adjustment | Tom Stellard | 2013-10-23 | 1 | -6/+21
| | | | | | | | | | | | This fixes piglit: - shaders/glsl-fs-texture2d-masked - shaders/glsl-fs-texture2d-masked-4 Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 193222
* R600: Fix handling of vector kernel arguments | Tom Stellard | 2013-10-23 | 9 | -35/+131
| | | | | | | | | | The SelectionDAGBuilder was promoting vector kernel arguments to legal types, but this won't work for R600 and SI since kernel arguments are stored in memory and can't be promoted. In order to handle vector arguments correctly we need to look at the original types from the LLVM IR function. llvm-svn: 193215
* R600/SI: Add support for i64 bitwise or | Tom Stellard | 2013-10-23 | 1 | -0/+19
| | | | llvm-svn: 193213
* R600/SI: Use S_LOAD_DWORD instructions for v8i32 and v16i32 | Tom Stellard | 2013-10-23 | 3 | -2/+8
| | | | llvm-svn: 193212
* [X86][FastISel] Add a comment to help understanding changes made in r192636. | Quentin Colombet | 2013-10-22 | 1 | -0/+23
| | | | | | <rdar://problem/15192473> llvm-svn: 193199
* R600/SI: Don't assert on SCC usage | Matt Arsenault | 2013-10-22 | 1 | -0/+2
| | | | llvm-svn: 193198
* ARM: provide diagnostics on more writeback LDM/STM instructions | Tim Northover | 2013-10-22 | 2 | -17/+41
| | | | | | | | | | | | | | The set of circumstances where the writeback register is allowed to be in the list of registers is rather baroque, but I think this implements them all on the assembly parsing side. For disassembly, we still warn about an ARM-mode LDM even if the architecture revision is < v7 (the required architecture information isn't available). It's a silly instruction anyway, so hopefully no-one will mind. rdar://problem/15223374 llvm-svn: 193185
* R600/SI: Use llvm_unreachable() for an always false assert | Tom Stellard | 2013-10-22 | 1 | -2/+1
| | | | llvm-svn: 193183
* R600/SI: Fix warning on non-asserts build | Tom Stellard | 2013-10-22 | 1 | -0/+1
| | | | llvm-svn: 193180
* R600: Simplify handling of private address space | Tom Stellard | 2013-10-22 | 13 | -436/+95
| | | | | | | | | | | | | | | | | | The AMDGPUIndirectAddressing pass was previously responsible for lowering private loads and stores to indirect addressing instructions. However, this pass was buggy and way too complicated. The only advantage it had over the new simplified code was that it saved one instruction per direct write to private memory. This optimization likely has a minimal impact on performance, and we may be able to duplicate it using some other transformation. For the private address space, we now:
    1. Lower private loads/stores to Register(Load|Store) instructions
    2. Reserve part of the register file as 'private memory'
    3. After regalloc lower the Register(Load|Store) instructions to MOV instructions that use indirect addressing.
llvm-svn: 193179
* R600: Remove unused InstrInfo::getMovImmInstr() function | Tom Stellard | 2013-10-22 | 5 | -31/+0
| | | | llvm-svn: 193178
* [mips][msa] Direct Object Emission support for conditional branches. | Matheus Almeida | 2013-10-22 | 2 | -25/+43
| | | | | | | | | | | | These branches have a 16-bit offset (R_MIPS_PC16). List of conditional branch instructions: bnz.{b,h,w,d} bnz.v bz.{b,h,w,d} bz.v llvm-svn: 193157