path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Add an x86 prefix encoding for instructions that would decode to a different ↵ | Craig Topper | 2014-02-18 | 8 | -175/+175
| | | | | | instruction if 0xf2/f3/66 were in front of them, but don't themselves have a prefix. For now this doesn't change any behavior, but the plan is to use it to fix some bugs in the disassembler. llvm-svn: 201538
* Fix the arm assembler so that this malformed instruction: | Kevin Enderby | 2014-02-17 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | ldrd r6, r7 [r2, #15] simply gives an error and does not trigger an assertion. As Jim points out, the diagnostic is really strange here, but fixing that would be more complicated. The missing comma results in the parser expecting a construct like r2[2], which is the vector index thing the error message is talking about. That's not what the user intended, though, and there's nothing else in the instruction that looks at all like a vector. Yet more fallout from not having a real parser here and trying to do context-free generic matching for addressing modes. rdar://15097243 llvm-svn: 201531
* Fix disassembler handling of rex.b when mod=00/01/10 and bbb=101. Mod=00 ↵ | Craig Topper | 2014-02-17 | 1 | -4/+3
| | | | | | should ignore the base register entirely. Mod=01/10 should treat this as R13 plus displacement. Fixes PR18860. llvm-svn: 201507
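The SIB base rule described in this fix can be sketched as follows. This is a minimal Python illustration, not the disassembler's own code: `sib_base`, `REGS64` and `DISP_BYTES` are hypothetical names, and only the base-register choice is modeled (index and scale are ignored).

```python
# 64-bit general-purpose registers in encoding order (0..15).
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

# Displacement size in bytes implied by each memory-form ModRM.mod value.
DISP_BYTES = {0b00: 0, 0b01: 1, 0b10: 4}


def sib_base(mod, bbb, rex_b):
    """Return (base register name or None, displacement size in bytes)."""
    if mod not in DISP_BYTES:
        raise ValueError("mod=11 selects a register operand, not memory")
    if mod == 0b00 and bbb == 0b101:
        # No base register at all: a 32-bit displacement follows and
        # REX.B is ignored.
        return None, 4
    # Otherwise REX.B supplies bit 3 of the base register number, so
    # bbb=101 means rbp (REX.B=0) or r13 (REX.B=1).
    return REGS64[(rex_b << 3) | bbb], DISP_BYTES[mod]


for mod in (0b00, 0b01, 0b10):
    print(f"mod={mod:02b} bbb=101 rex.b=1 ->", sib_base(mod, 0b101, 1))
```

With REX.B set, only the mod=01 and mod=10 rows name r13; the mod=00 row has no base register.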
* AVX-512: implemented zext from i1 to i16 | Elena Demikhovsky | 2014-02-17 | 1 | -1/+3
| | | | llvm-svn: 201502
* Use 16 byte stack alignment for NaCl on ARM | Mark Seaborn | 2014-02-16 | 3 | -6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NaCl's ARM ABI uses 16 byte stack alignment, so set that in ARMSubtarget.cpp. Using 16 byte alignment exposes an issue in code generation in which a varargs function leaves a 4 byte gap between the values of r1-r3 saved to the stack and the following arguments that were passed on the stack. (Previously, this code only needed to support 4 byte and 8 byte alignment.) With this issue, llc generated: varargs_func: sub sp, sp, #16 push {lr} sub sp, sp, #12 add r0, sp, #16 // Should be 20 stm r0, {r1, r2, r3} ldr r0, .LCPI0_0 // Address of va_list add r1, sp, #16 str r1, [r0] bl external_func Fix the bug by checking for "Align > 4". Also simplify the code by using OffsetToAlignment(), and update comments. Differential Revision: http://llvm-reviews.chandlerc.com/D2677 llvm-svn: 201497
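The arithmetic behind the "Should be 20" remark can be checked with a small sketch. It assumes the layout shown in the listing above (a 16-byte register save area, then `push {lr}` and `sub sp, sp, #12` below it). `offset_to_alignment` is a Python stand-in written to mirror the described behavior of OffsetToAlignment(), and the constant names are hypothetical.

```python
def offset_to_alignment(value, align):
    """Bytes to add to `value` to reach the next multiple of `align`."""
    return (align - value % align) % align


STACK_ALIGN = 16          # NaCl ARM ABI stack alignment from the message
SAVED_REG_BYTES = 3 * 4   # r1, r2 and r3, four bytes each
BELOW_SAVE_AREA = 4 + 12  # push {lr}, then sub sp, sp, #12

# The saved registers must sit directly below the stack-passed arguments,
# so the padding goes at the bottom of the register save area.
padding = offset_to_alignment(SAVED_REG_BYTES, STACK_ALIGN)
store_offset = BELOW_SAVE_AREA + padding

print("padding:", padding)            # 4-byte gap to skip
print("store offset:", store_offset)  # 20, as the listing's comment says
```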
* Remove dead code, we already require cmake 2.8.8. | Rafael Espindola | 2014-02-16 | 1 | -5/+0
| | | | llvm-svn: 201495
* AVX-512: simplified BUILD_VECTOR for masks; fixed cmp/test sequence | Elena Demikhovsky | 2014-02-16 | 2 | -70/+28
| | | | llvm-svn: 201487
* ARM IAS: (partially) support .arch_extension directive | Saleem Abdulrasool | 2014-02-16 | 1 | -0/+82
| | | | | | | | | | | | This adds a partial implementation of the .arch_extension directive to the integrated ARM assembler. There are a number of limitations to this implementation arising from the target backend support rather than the implementation itself. Namely, iWMMXT (v1 and v2), Maverick, and XScale support is not present in the ARM backend. Currently, there is no check for A-class only (needed for virt), and no ARMv6k detection (needed for os and sec). The remainder of the extensions are fully supported. llvm-svn: 201471
* Add opcode extension forms of MOV8ri/MOV16ri/MOV32ri. | Craig Topper | 2014-02-15 | 1 | -0/+10
| | | | llvm-svn: 201463
* This patch has two main functions: | Reed Kotler | 2014-02-14 | 9 | -12/+453
| | | | | | | | | | | | | | | | | 1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc. 2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more logical place to handle this and we have known for some time that we needed to move the code later and not implement it using inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done in a follow-on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list during the original putback of the Mips16HardFloat pass but was not practical for us to do at that time. llvm-svn: 201426
* Generate the DWARF stack frame decode operations in the function prologue ↵ | Artyom Skrobov | 2014-02-14 | 2 | -37/+271
| | | | | | | | for ARM/Thumb functions. Patch by Keith Walker! llvm-svn: 201423
* [AArch64 NEON] Fix a bug to avoid using floating type as condition type in ↵ | Kevin Qin | 2014-02-14 | 1 | -11/+6
| | | | | | lowering SELECT_CC. llvm-svn: 201395
* Enable AArch64 NEON by default. | Jiangning Liu | 2014-02-14 | 1 | -1/+1
| | | | llvm-svn: 201385
* [AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node. | Hao Liu | 2014-02-14 | 1 | -0/+90
| | | | | | As v1i1 is illegal, the type legalizer tries to scalarize such a node. But if the operand type of SETCC is legal, the scalarization algorithm will cause an assertion failure. llvm-svn: 201381
* [X86] Don't mark movabsq as cheap-as-move - it isn't that cheap. | Juergen Ributzka | 2014-02-14 | 1 | -3/+5
| | | | | | | A simple register copy on X86 is just 3 bytes, whereas movabsq is a 10 byte instruction. Marking movabsq as not being cheap will allow LICM to move it out of the loop and it also prevents unnecessary rematerializations if the value is needed in more than one register. llvm-svn: 201377
* R600/SI: Expand all v8[if]32 operations | Tom Stellard | 2014-02-13 | 3 | -1/+37
| | | | llvm-svn: 201371
* R600/SI: Add a pattern for i32 anyext | Tom Stellard | 2014-02-13 | 1 | -2/+5
| | | | | Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 201370
* R600/SI: Completely Disable TypeRewriter on compute | Tom Stellard | 2014-02-13 | 1 | -3/+3
| | | | llvm-svn: 201369
* R600/SI: Split global vector loads with more than 4 elements | Tom Stellard | 2014-02-13 | 1 | -3/+5
| | | | llvm-svn: 201368
* Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove ↵ | Daniel Sanders | 2014-02-13 | 7 | -3/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333
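The decision described in this message can be summarized in a few lines. This is a sketch of the stated behavior only, not the AsmPrinter code: `parses_inline_asm` and its parameters are hypothetical names standing in for the target's UseIntegratedAs default, the -no-integrated-as flag, and the output file type.

```python
def parses_inline_asm(target_default, no_integrated_as=False,
                      emitting_object=False):
    """Return True if inline assembly goes through the integrated assembler.

    target_default   -- the target's UseIntegratedAs setting
    no_integrated_as -- True when -no-integrated-as was passed
    emitting_object  -- True for object output (-filetype=obj and similar)
    """
    if emitting_object:
        # Object output forces the integrated assembler on, regardless of
        # the default setting or -no-integrated-as.
        return True
    if no_integrated_as:
        # The flag overrides the target default for assembly output.
        return False
    return target_default


# A target with mature MC support parses inline asm even for assembly output.
print(parses_inline_asm(target_default=True))
# Tests that use invalid mnemonics such as 'foo' disable the parser.
print(parses_inline_asm(target_default=True, no_integrated_as=True))
# Object output ignores the flag.
print(parses_inline_asm(target_default=False, no_integrated_as=True,
                        emitting_object=True))
```

When this function returns False, the message implies the inline assembly text is passed through with EmitRawText() as before.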
* ARM: remove floating-point patterns for @llvm.arm.neon.vabs | Tim Northover | 2014-02-13 | 1 | -3/+0
| | | | | | | The front-end is now generating the generic @llvm.fabs for this operation, so the extra patterns are no longer needed. llvm-svn: 201314
* Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend | Oliver Stannard | 2014-02-13 | 1 | -0/+11
| | | | llvm-svn: 201305
* [AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 ↵ | Hao Liu | 2014-02-13 | 1 | -4/+160
| | | | | | types. As these problems are similar to shl/sra/srl, also add patterns for shift nodes. llvm-svn: 201298
* [AArch64]Add support for spilling FPR8/FPR16. | Hao Liu | 2014-02-13 | 1 | -0/+8
| | | | llvm-svn: 201287
* [Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called | Andrea Di Biagio | 2014-02-12 | 1 | -2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | 'OK_NonUniformConstValue' to identify operands which are constants but not constant splats. The cost model now allows returning 'OK_NonUniformConstValue' for non-splat operands that are instances of ConstantVector or ConstantDataVector. With this change, targets are now able to compute different costs for instructions with non-uniform constant operands. For example, on X86 the cost of a vector shift may vary depending on whether the second operand is a uniform or non-uniform constant. This patch applies the following changes: - The cost model computation now takes into account non-uniform constants; - The cost of vector shift instructions has been improved in X86TargetTransformInfo analysis pass; - BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish between non-uniform and uniform constant operands. Added a new test to verify that the output of opt '-cost-model -analyze' is valid in the following configurations: SSE2, SSE4.1, AVX, AVX2. llvm-svn: 201272
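The distinction between uniform and non-uniform constant operands can be illustrated with a small classifier. This is a hedged sketch, not the TargetTransformInfo API: the `OperandKind` enum and `classify_operand` are hypothetical, a vector operand is modeled as a list of lanes, and a lane that is not a compile-time constant is written as `None`.

```python
from enum import Enum


class OperandKind(Enum):
    ANY_VALUE = "any value"                    # not a constant vector
    UNIFORM_CONST = "uniform constant"         # constant splat
    NON_UNIFORM_CONST = "non-uniform constant" # constant, lanes differ


def classify_operand(lanes):
    """Classify a vector operand given as a list of lane values."""
    if not lanes or any(lane is None for lane in lanes):
        return OperandKind.ANY_VALUE
    if all(lane == lanes[0] for lane in lanes):
        return OperandKind.UNIFORM_CONST
    return OperandKind.NON_UNIFORM_CONST


print(classify_operand([3, 3, 3, 3]))     # splat shift amount
print(classify_operand([1, 2, 3, 4]))     # per-lane shift amounts
print(classify_operand([1, None, 3, 4]))  # one lane is not constant
```

A target cost hook could then return a different cost for the second and third kinds, which is the case the message describes for X86 vector shifts.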
* [X86] Teach the backend how to lower vector shift left into multiply rather ↵ | Andrea Di Biagio | 2014-02-12 | 1 | -0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | than scalarizing it. Instead of expanding a packed shift into a sequence of scalar shifts, the backend now tries (when possible) to convert the vector shift into a vector multiply. Before this change, a shift of a MVT::v8i16 vector by a build_vector of constants was always scalarized into a long sequence of "vector extracts + scalar shifts + vector insert". With this change, if there is SSE2 support, we emit a single vector multiply. This change also affects SSE4.1, AVX, AVX2 shifts: - A shift of a MVT::v4i32 vector by a build_vector of non-uniform constants is now lowered when possible into a single SSE4.1 vector multiply. - Packed v16i16 shifts left by a constant build_vector are now expanded when possible into a single AVX2 vpmullw. This change also improves the lowering of AVX512f vector shifts. Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected by this change. llvm-svn: 201271
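The identity this lowering relies on is that a left shift by a constant c equals a multiply by 2^c in wrap-around arithmetic. The sketch below checks that lane by lane for a v8i16-shaped example; it is plain Python modeling the arithmetic, not the X86 lowering code, and the function names are hypothetical.

```python
def shl_lanes(values, amounts, bits=16):
    """Per-lane shift left, wrapping to `bits` bits."""
    mask = (1 << bits) - 1
    return [(v << a) & mask for v, a in zip(values, amounts)]


def mul_lanes(values, amounts, bits=16):
    """Same result via one per-lane multiply by the constants 2**amount."""
    mask = (1 << bits) - 1
    multipliers = [(1 << a) & mask for a in amounts]  # build_vector of powers of two
    return [(v * m) & mask for v, m in zip(values, multipliers)]


values = [1, 2, 3, 4, 5, 6, 7, 8]
amounts = [0, 1, 2, 3, 4, 5, 6, 7]  # non-uniform constant shift amounts

print(shl_lanes(values, amounts))
print(mul_lanes(values, amounts))
```

Both calls print the same list, including for lanes where high bits are shifted out, because the multiply wraps in the same way as the shift.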
* Revert r201237+r201238: Demote EmitRawText call in ↵ | Daniel Sanders | 2014-02-12 | 7 | -28/+3
| | | | | | | | AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241
* Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove ↵ | Daniel Sanders | 2014-02-12 | 7 | -3/+28
| | | | | | | | | | | | | | | | | | | | | hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237
* R600: Always implement both versions of isTruncateFree and add a sanity check. | Benjamin Kramer | 2014-02-12 | 2 | -5/+12
| | | | llvm-svn: 201222
* Mark XACQUIRE_PREFIX/XRELEASE_PREFIX as isAsmParserOnly so they'll disappear ↵ | Craig Topper | 2014-02-12 | 1 | -1/+2
| | | | | | from the disassembler table build without custom filtering code. llvm-svn: 201215
* Tweak ARM fastcc by adopting these two AAPCS rules: | Evan Cheng | 2014-02-11 | 1 | -0/+7
| | | | | | | | | | | | | | * CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers * When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g. 7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various calling conventions. rdar://16039676 llvm-svn: 201193
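The effect of the two rules can be seen in a simplified allocator. This sketch models only `float` and `double` arguments over the sixteen single-precision slots s0-s15 (d0-d7), with back-filling of free slots; `allocate_cprcs` is a hypothetical name, and homogeneous aggregates and vector types are deliberately left out.

```python
def allocate_cprcs(arg_types):
    """Assign each 'float' or 'double' argument to a VFP register or the stack."""
    free = [True] * 16  # s0..s15; d<n> overlaps s<2n> and s<2n+1>
    placement = []
    for kind in arg_types:
        size = 1 if kind == "float" else 2
        # Lowest free slot with natural alignment (doubles need an even pair).
        slot = next((i for i in range(0, 17 - size, size)
                     if all(free[i:i + size])), None)
        if slot is None:
            # Rule 2: once a CPRC goes to the stack, no later argument may
            # use a VFP register, so mark every register unavailable.
            free = [False] * 16
            placement.append("stack")
            continue
        for i in range(slot, slot + size):
            free[i] = False
        placement.append(f"s{slot}" if size == 1 else f"d{slot // 2}")
    return placement


# The case from the message: 7 doubles, then a float and a double,
# plus one more float to show the effect of the second rule.
print(allocate_cprcs(["double"] * 7 + ["float", "double", "float"]))
```

In this example the ninth argument (a double) no longer fits and goes to the stack, and the final float follows it there instead of back-filling s15. Rule 1 is reflected in the fact that the model never hands out a core register.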
* R600/SI: Fix assertion on infinite loops. | Matt Arsenault | 2014-02-11 | 1 | -2/+4
| | | | | | | This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177
* ARM: Thumb2 LDR(literal) can target SP. | Jim Grosbach | 2014-02-11 | 1 | -1/+1
| | | | | | | | | Fix a slightly overzealous destination register restriction for the 'without .w' alias. Add some explicit testcases. rdar://16033140 llvm-svn: 201173
* XCore target: fix const section handling | Robert Lytton | 2014-02-11 | 3 | -28/+40
| | | | | | | | | | | | XCore target ABI requires const data that is externally visible to be handled differently if it has C-language linkage rather than C++ language linkage. Clang now emits ".cp.rodata" section information. All other externally visible constant data will be placed in the DP section. llvm-svn: 201144
* XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE | Robert Lytton | 2014-02-11 | 2 | -0/+70
| | | | llvm-svn: 201143
* AVX: fixed a bug in LowerVECTOR_SHUFFLE | Elena Demikhovsky | 2014-02-11 | 1 | -1/+5
| | | | llvm-svn: 201140
* AVX-512: Optimized BUILD_VECTOR pattern; | Elena Demikhovsky | 2014-02-11 | 2 | -4/+6
| | | | | | fixed encoding of VEXTRACTPS instruction. llvm-svn: 201134
* R600: Implement isTruncateFree | Matt Arsenault | 2014-02-10 | 2 | -0/+6
| | | | | | | Truncation is just accessing a subregister for any multiple of the register size, so it's free. llvm-svn: 201107
* R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used | Tom Stellard | 2014-02-10 | 4 | -10/+28
| | | | | | | | | | | DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097
* R600/SI: Only use S_WQM_B64 in pixel shaders | Tom Stellard | 2014-02-10 | 1 | -1/+1
| | | | | | | | This doesn't change any functionality, since we only have two shader types (compute and pixel) that use local memory. We're just changing the logic to match the documentation. llvm-svn: 201096
* ARM: use natural LLVM IR for vshll instructions | Tim Northover | 2014-02-10 | 3 | -36/+27
| | | | | | | | Similarly to the vshrn instructions, these are simple zext/sext + shl operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend. llvm-svn: 201093
* [AArch64] Handle aliases of conditional branches without b.pred form. | Chad Rosier | 2014-02-10 | 1 | -4/+25
| | | | llvm-svn: 201091
* ARM: r12 is callee-saved for interrupt handlers | Oliver Stannard | 2014-02-10 | 1 | -2/+2
| | | | | | | For A- and R-class processors, r12 is not normally callee-saved, but is for interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker". llvm-svn: 201089
* ARM: use LLVM IR to represent the vshrn operation | Tim Northover | 2014-02-10 | 4 | -14/+17
| | | | | | | | | | vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers. llvm-svn: 201085
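The claim that the signedness of the shift does not matter can be checked directly. The sketch below models a 16-bit to 8-bit narrowing shift in plain Python (the helper names are hypothetical) and compares the arithmetic and logical variants. It assumes the immediate is limited to 1 through 8 for this element size, which is the limit the message refers to.

```python
def lshr(x, n, bits):
    """Logical shift right of a `bits`-wide value."""
    return (x & ((1 << bits) - 1)) >> n


def ashr(x, n, bits):
    """Arithmetic shift right of a `bits`-wide value, result as unsigned."""
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):  # sign bit set: reinterpret as negative
        x -= 1 << bits
    return (x >> n) & mask


def shift_right_narrow(x, n, signed, wide_bits=16):
    """Right shift, then truncate to half the width (the vshrn shape)."""
    shift = ashr if signed else lshr
    narrow_mask = (1 << (wide_bits // 2)) - 1
    return shift(x, n, wide_bits) & narrow_mask


# Within the legal immediate range the two variants always agree, because
# the truncation discards exactly the bits where they could differ.
same = all(shift_right_narrow(x, n, True) == shift_right_narrow(x, n, False)
           for n in range(1, 9) for x in range(0, 1 << 16, 257))
print("agree for n in 1..8:", same)

# One step past the limit, the sign bits leak into the kept half.
print(hex(shift_right_narrow(0x8000, 9, True)),
      hex(shift_right_narrow(0x8000, 9, False)))
```

The last line shows why the limit matters: with a shift of 9 the two variants produce different low bytes for 0x8000.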
* [mips][msa] Add DLSA instruction. | Matheus Almeida | 2014-02-10 | 3 | -1/+23
| | | | llvm-svn: 201081
* [mips][msa] Make LSA_DESC a parameterizable class. | Matheus Almeida | 2014-02-10 | 1 | -7/+11
| | | | | | | | | This way it's possible to share the instruction's description for LSA and DLSA (to be added). No functional changes. llvm-svn: 201078
* AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors. | Elena Demikhovsky | 2014-02-10 | 4 | -10/+12
| | | | llvm-svn: 201066
* Recommit r201059 and r201060 with hopefully a fix for its original failure. | Craig Topper | 2014-02-10 | 7 | -11/+31
| | | | | | | | | | Original commit messages: Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code. Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information. llvm-svn: 201065
* Revert r201059 and r201060. | Bob Wilson | 2014-02-10 | 7 | -29/+11
| | | | | | | | r201059 appears to cause a crash in a bootstrapped build of clang. Craig isn't available to look at it right now, so I'm reverting it while he investigates. llvm-svn: 201064
* [AArch64]Implement the copy of two FPR8 registers by using FMOVss of two ↵ | Hao Liu | 2014-02-10 | 1 | -0/+10
| | | | | | FPR32 registers in copyPhysReg. llvm-svn: 201061