path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [Hexagon] Adding unsigned halfword load. | Colin LeMahieu | 2014-12-23 | 5 | -20/+18
  llvm-svn: 224772
* [mips][microMIPS] Implement LWSP and SWSP instructions | Jozef Kolek | 2014-12-23 | 6 | -0/+90
  Differential Revision: http://reviews.llvm.org/D6416
  llvm-svn: 224771
* AVX-512: Added FMA instructions, intrinsics and tests for KNL and SKX targets | Elena Demikhovsky | 2014-12-23 | 3 | -81/+101
  by Asaf Badouh
  http://reviews.llvm.org/D6456
  llvm-svn: 224764
* [PowerPC] Don't mark the return-address slot as immutable | Hal Finkel | 2014-12-23 | 1 | -1/+1
  It is tempting to mark the fixed stack slot used to store the return
  address as immutable when lowering @llvm.returnaddress(i32 0).
  Unfortunately, within the function, it is not completely immutable: it is
  written during the function prologue. When using post-RA instruction
  scheduling, the prologue instructions are available for scheduling, and
  we're not free to interchange the order of a particular store in the
  prologue with loads from that stack location.
  Fixes PR21976.
  llvm-svn: 224761
* AVX-512: BLENDM - fixed encoding of the broadcast version | Elena Demikhovsky | 2014-12-23 | 2 | -2/+3
  Added more intrinsics and encoding tests.
  llvm-svn: 224760
* [PowerPC] Don't attempt a 64-bit pow2 division on PPC32 | Hal Finkel | 2014-12-23 | 1 | -0/+2
  In r224033, in moving the signed power-of-2 division expansion into
  BuildSDIVPow2, I accidentally made it possible to attempt the lowering for
  a 64-bit division on PPC32. This later asserts.
  Fixes PR21928.
  llvm-svn: 224758
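For readers unfamiliar with the "signed power-of-2 division expansion" mentioned above, here is a minimal sketch of the usual shift-based trick, modelled on 32-bit two's-complement integers. It is an illustration only, not the code in BuildSDIVPow2; the names `to_signed` and `sdiv_pow2` are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sdiv_pow2(x, k, bits=32):
    """Divide signed x by 2**k, rounding toward zero, using shifts only."""
    assert 0 < k < bits
    mask = (1 << bits) - 1
    # Arithmetic shift of the sign bit: -1 for negative x, 0 otherwise.
    sign = x >> (bits - 1)
    # Logical shift of that mask: 2**k - 1 for negative x, 0 otherwise.
    bias = (sign & mask) >> (bits - k)
    # Adding the bias turns the floor of the arithmetic shift into truncation.
    return to_signed(x + bias, bits) >> k
```

The bias is needed because an arithmetic right shift alone rounds toward negative infinity, whereas signed division rounds toward zero. The whole sequence needs a register as wide as the operand, which is why a 64-bit version is a poor fit for a 32-bit target.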
* [ARM] Don't break alignment when combining base updates into load/stores. | Ahmed Bougacha | 2014-12-23 | 1 | -2/+47
  r223862/r224203 tried to also combine base-updating load/stores.
  There was a mistake there: the alignment was added as is as an operand to
  the ARMISD::VLD/VST node. However, the VLD/VST selection logic doesn't
  care about less-than-standard alignment attributes.
  For example, no matter the alignment of a v2i64 load (say 1), SelectVLD
  picks VLD1q64 (because of the memory type). But VLD1q64 ("vld1.64 {dXX,
  dYY}") is 8-aligned, per ARMARMv7a 3.2.1.
  For the 1-aligned load, what we really want is VLD1q8.
  This commit introduces bitcasts if necessary, and changes the vld/vst type
  to one whose standard alignment matches the original load/store alignment.
  Differential Revision: http://reviews.llvm.org/D6759
  llvm-svn: 224754
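The selection rule described above can be sketched as follows. This is a hypothetical helper, not the SelectVLD logic: it assumes the standard alignment of a vld1 variant equals its element size in bytes (as the message states for VLD1q64), and it only covers the 128-bit VLD1q8/16/32/64 forms.

```python
def vld1_element_bits(alignment_bytes, max_element_bytes=8):
    """Largest power-of-two element size (in bits) whose natural alignment
    does not exceed the alignment of the original access."""
    if alignment_bytes < 1:
        raise ValueError("alignment must be at least 1 byte")
    element_bytes = 1
    while element_bytes * 2 <= min(alignment_bytes, max_element_bytes):
        element_bytes *= 2
    return element_bytes * 8


def vld1q_name(alignment_bytes):
    """Name of the 128-bit vld1 variant picked for a given alignment."""
    return f"VLD1q{vld1_element_bits(alignment_bytes)}"
```

Under these assumptions a 1-aligned v2i64 load maps to VLD1q8 and an 8-aligned one keeps VLD1q64; the loaded value would then be bitcast back to the original vector type.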
* Fix UBSan bootstrap: replace shift of negative value with multiplication. | Alexey Samsonov | 2014-12-23 | 1 | -1/+1
  llvm-svn: 224752
* X86: Don't over-align combined loads. | Jim Grosbach | 2014-12-23 | 1 | -8/+3
  When combining consecutive loads+inserts into a single vector load,
  we should keep the alignment of the base load. Doing otherwise can, and
  does, lead to using overly aligned instructions. In the included test
  case, for example, using a 32-byte vmovaps on a 16-byte aligned value.
  Oops.
  rdar://19190968
  llvm-svn: 224746
* Make musttail more robust for vector types on x86 | Reid Kleckner | 2014-12-22 | 2 | -100/+107
  Previously I tried to plug musttail into the existing vararg lowering
  code. That turned out to be a mistake, because non-vararg calls use
  significantly different register lowering, even on x86. For example, AVX
  vectors are usually passed in registers to normal functions and memory to
  vararg functions. Now musttail uses a completely separate lowering.
  Hopefully this can be used as the basis for non-x86 perfect forwarding.
  Reviewers: majnemer
  Differential Revision: http://reviews.llvm.org/D6156
  llvm-svn: 224745
* Thumb1 frame lowering: Mark CFI instructions with the FrameSetup flag. | Adrian Prantl | 2014-12-22 | 1 | -7/+14
  Followup to r224294:
  ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions.
  Debug info marks the first instruction without the FrameSetup flag
  as being the end of the function prologue. Any CFI instructions in the
  middle of the function prologue would cause debug info to end the prologue
  too early and worse, attach the line number of the CFI instruction, which
  incidentally is often 0.
  llvm-svn: 224743
* [Hexagon] Adding memb instruction. Fixing whitespace in test from 224730. | Colin LeMahieu | 2014-12-22 | 4 | -28/+18
  llvm-svn: 224735
* [Hexagon] Adding classes and load unsigned byte instruction, updating usages. | Colin LeMahieu | 2014-12-22 | 6 | -28/+123
  llvm-svn: 224730
* [x86] Add vector @llvm.ctpop intrinsic custom lowering | Bruno Cardoso Lopes | 2014-12-22 | 1 | -0/+152
  Currently, when ctpop is supported for scalar types, the expansion of
  @llvm.ctpop.vXiY uses vector element extractions, insertions and
  individual calls to @llvm.ctpop.iY. When not, expansion with bit-math
  operations is used for the scalar calls.

  Local haswell measurements show that we can improve vector
  @llvm.ctpop.vXiY expansion in some cases by using a vector parallel
  bit twiddling approach, based on:

    v = v - ((v >> 1) & 0x55555555);
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
    v = (v + (v >> 4)) & 0xF0F0F0F;
    v = v + (v >> 8);
    v = v + (v >> 16);
    v = v & 0x0000003F;
    (from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel)

  When scalar ctpop isn't supported, the approach above performs better for
  v2i64, v4i32, v4i64 and v8i32 (see numbers below). And even when scalar
  ctpop is supported, this approach performs ~2x better for v8i32.

  Here, x86_64 implies -march=corei7-avx without ctpop and x86_64h includes
  ctpop support with -march=core-avx2.

  == [x86_64h - new]
  v8i32: 0.661685
  v4i32: 0.514678
  v4i64: 0.652009
  v2i64: 0.324289
  == [x86_64h - old]
  v8i32: 1.29578
  v4i32: 0.528807
  v4i64: 0.65981
  v2i64: 0.330707
  == [x86_64 - new]
  v8i32: 1.003
  v4i32: 0.656273
  v4i64: 1.11711
  v2i64: 0.754064
  == [x86_64 - old]
  v8i32: 2.34886
  v4i32: 1.72053
  v4i64: 1.41086
  v2i64: 1.0244

  More work for other vector types will come next.
  llvm-svn: 224725
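The bit-twiddling sequence quoted in this message can be tried out directly. The sketch below models a single 32-bit lane in Python and then maps it over a list standing in for a vector; `popcount32` and `popcount_lanes` are hypothetical names, and the real lowering operates on whole vector registers at once rather than lane by lane.

```python
MASK32 = 0xFFFFFFFF


def popcount32(v):
    """Count the set bits of a 32-bit value with the parallel bit trick."""
    v &= MASK32
    v = v - ((v >> 1) & 0x55555555)                  # 2-bit partial sums
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)   # 4-bit partial sums
    v = (v + (v >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    v = v + (v >> 8)                                 # fold bytes pairwise
    v = v + (v >> 16)                                # fold the two halves
    return v & 0x3F                                  # the total is at most 32


def popcount_lanes(lanes):
    """Apply the same steps to every lane, as a vector ctpop would."""
    return [popcount32(lane) for lane in lanes]
```

Every step is a shift, mask, add or subtract that applies identically to each lane, which is what makes the sequence a good fit for vector registers with no element extraction.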
* AVX-512: Added all forms of BLENDM instructions, | Elena Demikhovsky | 2014-12-22 | 3 | -55/+120
  intrinsics, encoding tests for AVX-512F and skx instructions.
  llvm-svn: 224707
* Lower multiply-negate operation to mneg on AArch64 | Karthik Bhat | 2014-12-22 | 1 | -0/+4
  This patch pattern matches code such as

    neg w8, w8
    mul w8, w9, w8

  to

    mneg w8, w8, w9

  Review: http://reviews.llvm.org/D6754
  llvm-svn: 224706
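The rewrite relies on the identity (-a) * b = -(a * b) in wrapping arithmetic. A minimal sketch that models the three instructions on 32-bit values (the helper names `neg32`, `mul32` and `mneg32` are hypothetical):

```python
MASK32 = 0xFFFFFFFF


def neg32(a):
    """neg Wd, Wn: two's-complement negation modulo 2**32."""
    return (-a) & MASK32


def mul32(a, b):
    """mul Wd, Wn, Wm: low 32 bits of the product."""
    return (a * b) & MASK32


def mneg32(a, b):
    """mneg Wd, Wn, Wm: negated product, modulo 2**32."""
    return (-(a * b)) & MASK32
```

For any 32-bit inputs, `mul32(b, neg32(a))` and `mneg32(a, b)` agree, so the two-instruction sequence can be replaced by the single instruction.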
* [X86] Add hasSideEffects = 0 to CALLpcrel16. This matches what is inferred from patterns for the 32-bit version. | Craig Topper | 2014-12-21 | 1 | -4/+5
  llvm-svn: 224692
* Enable (sext x) == C --> x == (trunc C) combine | Matt Arsenault | 2014-12-21 | 1 | -21/+2
  Extend the existing code which handles this for zext. This makes this
  more useful for targets with ZeroOrNegativeOne BooleanContent and
  obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne)
  since the constant will now be shrunk to i1.
  llvm-svn: 224691
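A small model of this combine, working on unsigned bit patterns. The representability check in `sext_eq_combined` is an assumption added here to make the sketch correct for every constant; the commit message does not spell out how the real combine treats constants that are not sign-extended narrow values. All function names are hypothetical.

```python
def trunc(value, bits):
    """Keep only the low `bits` bits of value."""
    return value & ((1 << bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to a to_bits-wide pattern."""
    value = trunc(value, from_bits)
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def sext_eq_direct(x, c, narrow, wide):
    """Original form: compare in the wide type."""
    return sext(x, narrow, wide) == trunc(c, wide)


def sext_eq_combined(x, c, narrow, wide):
    """Combined form: compare in the narrow type against the truncated C."""
    c = trunc(c, wide)
    narrowed = trunc(c, narrow)
    if sext(narrowed, narrow, wide) != c:
        # Assumption: C is not a sign-extended narrow value, so the
        # original comparison could never be true.
        return False
    return trunc(x, narrow) == narrowed
```

With `narrow=1`, the wide constant all-ones shrinks to the i1 value 1, which matches the ZeroOrNegativeOne boolean case the message mentions.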
* [X86] Swap operand order in Intel syntax on a bunch of aliases. | Craig Topper | 2014-12-20 | 1 | -18/+18
  llvm-svn: 224687
* [X86] Swap operand order of imul aliases in Intel syntax. Also disable printing of the alias instead of the real instruction. | Craig Topper | 2014-12-20 | 1 | -6/+6
  llvm-svn: 224686
* [X86] Remove '*' from asm strings in far call/jump aliases for Intel syntax. | Craig Topper | 2014-12-20 | 1 | -11/+11
  llvm-svn: 224685
* [X86] Don't swap the order of segment and offset in immediate form of far call/jump in Intel syntax. | Craig Topper | 2014-12-20 | 1 | -4/+4
  llvm-svn: 224684
* ARM: further improve deprecated diagnosis (LDM) | Saleem Abdulrasool | 2014-12-20 | 2 | -1/+33
  The ARM ARM states:

  LDM/LDMIA/LDMFD:
    The SP can be in the list. However, ARM deprecates using these
    instructions with SP in the list.
    ARM deprecates using these instructions with both the LR and the PC in
    the list.

  LDMDA/LDMFA/LDMDB/LDMEA/LDMIB/LDMED:
    The SP can be in the list. However, instructions that include the SP in
    the list are deprecated.
    Instructions that include both the LR and the PC in the list are
    deprecated.

  POP:
    The SP can only be in the list before ARMv7. ARM deprecates any use of
    ARM instructions that include the SP, and the value of the SP after such
    an instruction is UNKNOWN.
    ARM deprecates the use of this instruction with both the LR and the PC
    in the list.

  Attempt to diagnose use of deprecated forms of these instructions. This
  mirrors the previous changes to diagnose use of the deprecated forms of
  STM in ARM mode.
  llvm-svn: 224682
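The quoted rules share two checks across all three instruction groups, which can be sketched as a tiny function. This is an illustration under stated assumptions, not the assembler's implementation: the function name and the warning strings are hypothetical, and the pre-ARMv7 restriction on POP is not modelled.

```python
def arm_ldm_deprecations(registers):
    """Return deprecation warnings for an ARM-mode LDM/POP register list."""
    regs = {name.lower() for name in registers}
    warnings = []
    if "sp" in regs:
        warnings.append("use of SP in the register list is deprecated")
    if {"lr", "pc"} <= regs:
        warnings.append("use of both LR and PC in the register list is deprecated")
    return warnings
```

For example, `arm_ldm_deprecations(["r0", "sp"])` yields one warning, while a list such as `["r4", "pc"]` yields none.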
* [X86] Immediate forms of far call/jump are not valid in x86-64. | Craig Topper | 2014-12-20 | 1 | -16/+20
  llvm-svn: 224678
* Remove unused variable and initialization. | Eric Christopher | 2014-12-20 | 1 | -4/+1
  llvm-svn: 224655
* Remove unused variable, initializer, and accessor. | Eric Christopher | 2014-12-19 | 2 | -10/+4
  llvm-svn: 224650
* R600: Remove outdated comment | Matt Arsenault | 2014-12-19 | 1 | -4/+0
  llvm-svn: 224648
* Masked load and store codegen - fixed 128-bit vectors | Elena Demikhovsky | 2014-12-19 | 3 | -20/+71
  The codegen failed on 128-bit types on AVX2.
  I added patterns in td files and tests.
  llvm-svn: 224647
* R600/SI: Only form min/max with 1 use. | Matt Arsenault | 2014-12-19 | 1 | -1/+1
  If the condition is used for something else, this increases the number of
  instructions.
  llvm-svn: 224646
* Add the ExceptionHandling::MSVC enumeration | Reid Kleckner | 2014-12-19 | 2 | -5/+5
  It is intended to be used for a family of personality functions that
  have similar IR preparation requirements. Typically when interoperating
  with MSVC personality functions, bits of functionality need to be
  outlined from the main function into helper functions. There is also
  usually more than one landing pad per invoke, which does not match the
  LLVM IR landingpad representation.

  None of this is implemented yet. This change just adds a new enum that
  is active for *-windows-msvc and delegates to the EH removal preparation
  pass. No functionality change for other targets.
  llvm-svn: 224625
* Model sqrtss as a binary operation with one source operand tied to the destination (PR14221) | Sanjay Patel | 2014-12-19 | 1 | -58/+12
  This is a continuation of r167064
  ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).
  That patch started to fix PR14221
  ( http://llvm.org/bugs/show_bug.cgi?id=14221 ), but it was not completed.
  Differential Revision: http://reviews.llvm.org/D6330
  llvm-svn: 224624
* R600/SI: isLegalOperand() shouldn't check constant bus for SALU instructions | Tom Stellard | 2014-12-19 | 1 | -1/+1
  The constant bus restrictions only apply to VALU instructions. This
  enables SIFoldOperands to fold immediates into SALU instructions.
  llvm-svn: 224623
* R600/SI: Make sure non-inline constants aren't folded into mubuf soffset operand | Tom Stellard | 2014-12-19 | 4 | -17/+25
  mubuf instructions now define the soffset field using the SCSrc_32
  register class which indicates that only SGPRs and inline constants are
  allowed.
  llvm-svn: 224622
* [Hexagon] Removing old variants of instructions and updating references. | Colin LeMahieu | 2014-12-19 | 6 | -161/+13
  llvm-svn: 224612
* [Hexagon] Adding bit extraction and table indexing instructions. | Colin LeMahieu | 2014-12-19 | 1 | -0/+101
  llvm-svn: 224610
* [Hexagon] Adding bit insertion instructions. | Colin LeMahieu | 2014-12-19 | 1 | -0/+65
  llvm-svn: 224609
* [Hexagon] Adding more xtype shift instructions. | Colin LeMahieu | 2014-12-19 | 1 | -0/+107
  llvm-svn: 224608
* [Hexagon] Adding xtype shift instructions. | Colin LeMahieu | 2014-12-19 | 1 | -0/+198
  llvm-svn: 224604
* [Hexagon] Adding transfers to and from control registers. | Colin LeMahieu | 2014-12-19 | 2 | -0/+65
  llvm-svn: 224599
* [Hexagon] Adding doubleregs for control registers. Renaming control register class. | Colin LeMahieu | 2014-12-19 | 4 | -22/+66
  llvm-svn: 224598
* [ARM] Remove dead assignment. | Tilmann Scheller | 2014-12-19 | 1 | -1/+0
  Found by the Clang static analyzer.
  llvm-svn: 224586
* [Hexagon] Adding loop0/1 sp0/1/2loop0 instructions. | Colin LeMahieu | 2014-12-19 | 6 | -37/+138
  llvm-svn: 224556
* Reverting 224550, was not ready for commit. | Colin LeMahieu | 2014-12-18 | 6 | -134/+33
  llvm-svn: 224552
* [Hexagon] Adding loop0/1 sp0/1/2loop0 instructions. | Colin LeMahieu | 2014-12-18 | 6 | -33/+134
  llvm-svn: 224550
* [mips][microMIPS] Fix bugs related to atomic SC/LL instructions | Jozef Kolek | 2014-12-18 | 1 | -4/+8
  Fix bugs related to atomic microMIPS SC/LL instructions: While expanding
  atomic operations the mips32r2 encoding was emitted instead of microMIPS.
  Differential Revision: http://reviews.llvm.org/D6659
  llvm-svn: 224524
* ARM: fix an off-by-one in the register list access | Saleem Abdulrasool | 2014-12-18 | 1 | -2/+2
  Fix an off-by-one access introduced in 224502 for push.w and pop.w with
  single register operands. Add test cases for both scenarios.
  Thanks to Asiri Rathnayake for pointing out the failure!
  llvm-svn: 224521
* [AVX512] Enable FP arithmetic lowering for AVX512VL subsets. | Robert Khasanov | 2014-12-18 | 4 | -2/+105
  Added RegOp2MemOpTable4 to transform 4th operand from register to memory
  in merge-masked versions of instructions.
  Added lowering tests.
  llvm-svn: 224516
* ARM: improve instruction validation for thumb mode | Saleem Abdulrasool | 2014-12-18 | 1 | -15/+76
  The ARM Architecture Reference Manual states the following:

  LDM{,IA,DB}:
    The SP cannot be in the list.
    The PC can be in the list.
    If the PC is in the list:
      • the LR must not be in the list
      • the instruction must be either outside any IT block, or the last
        instruction in an IT block.

  POP:
    The PC can be in the list.
    If the PC is in the list:
      • the LR must not be in the list
      • the instruction must be either outside any IT block, or the last
        instruction in an IT block.

  PUSH:
    The SP and PC can be in the list in ARM instructions, but not in Thumb
    instructions.

  STM{,IA,DB}:
    The SP and PC can be in the list in ARM instructions, but not in Thumb
    instructions.
  llvm-svn: 224502
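The Thumb-mode rules above translate into a handful of set checks. The following is a minimal sketch, not the ARM assembler's validation code: the function name, its parameters and the error strings are all hypothetical, and mnemonics are matched by simple prefix.

```python
def thumb_reglist_errors(mnemonic, registers, in_it_block=False,
                         last_in_it_block=False):
    """Return rule violations for a Thumb LDM/POP/PUSH/STM register list."""
    name = mnemonic.lower()
    regs = {reg.lower() for reg in registers}
    errors = []
    if name.startswith("ldm") or name == "pop":
        # Only the LDM forms forbid SP outright; the quoted POP rule does not.
        if name.startswith("ldm") and "sp" in regs:
            errors.append("SP may not be in the register list")
        if "pc" in regs:
            if "lr" in regs:
                errors.append("LR may not be in the list when PC is")
            if in_it_block and not last_in_it_block:
                errors.append("must be outside an IT block or last in it")
    elif name.startswith("stm") or name == "push":
        if "sp" in regs:
            errors.append("SP may not be in the register list")
        if "pc" in regs:
            errors.append("PC may not be in the register list")
    return errors
```

A call such as `thumb_reglist_errors("pop", ["r4", "pc"])` returns an empty list, while adding `"lr"` to that list, or placing the instruction mid-way through an IT block, produces an error.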
* [PowerPC] Use MCPhysReg for tables of registers. Const-correct the tables. Only put the anonymous namespace around classes. NFC. | Craig Topper | 2014-12-18 | 1 | -12/+12
  llvm-svn: 224498
* [X86] Use correct opsize on indirect call and jump aliases. | Craig Topper | 2014-12-18 | 1 | -4/+4
  llvm-svn: 224497