path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [Thumb-1] Add optimized constant materialization for integers [256..512) | James Molloy | 2016-06-07 | 2 | -0/+13
| | | | | | We can materialize these integers using a MOV; ADDi8 pair. llvm-svn: 272007
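The MOV; ADDi8 pairing above can be illustrated with a small arithmetic sketch. This is not the LLVM pattern itself: `splitMovAdd` is a hypothetical helper, and the choice of 255 as the MOV immediate is an assumption. Note that two 8-bit immediates sum to at most 510, so this sketch rejects 511 even though the commit title gives the range as [256..512).

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper (not an LLVM API): split C into two 8-bit immediates,
// the first for a MOV and the second for an ADD, so that first + second == C.
std::optional<std::pair<uint8_t, uint8_t>> splitMovAdd(uint32_t C) {
  constexpr uint32_t MaxImm8 = 255;
  // Values up to 255 fit a single MOV; values above 510 cannot be formed
  // by adding two 8-bit immediates.
  if (C <= MaxImm8 || C > 2 * MaxImm8)
    return std::nullopt;
  return std::make_pair(static_cast<uint8_t>(MaxImm8),
                        static_cast<uint8_t>(C - MaxImm8));
}
```

For example, 300 would split into a MOV of 255 followed by an ADD of 45.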
* [AVX512] Fix load opcode for fast isel. | Igor Breger | 2016-06-07 | 1 | -1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D21067 llvm-svn: 272006
* [PowerPC] Support multiple return values with fast isel | Ulrich Weigand | 2016-06-07 | 1 | -1/+1
| | | | | | | | | | | | | | | Using an LLVM IR aggregate return value type containing three or more integer values causes an abort in the fast isel pass. This patch adds two more registers to RetCC_PPC64_ELF_FIS to allow returning up to four integers with fast isel, just the same as is currently supported with regular isel (RetCC_PPC). This is needed for Swift and (possibly) other non-clang frontends. Fixes PR26190. llvm-svn: 272005
* [X86][SSE] Improved blend+zero target shuffle combining to use combined shuffle mask directly | Simon Pilgrim | 2016-06-07 | 1 | -7/+11
| | | | | | | | | | We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened. This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases. llvm-svn: 272003
* [ARM] Shrink post-indexed LDR and STR to LDM/STM | James Molloy | 2016-06-07 | 1 | -0/+42
| | | | | | | | | | | | | | A Thumb-2 post-indexed LDR instruction such as: ldr.w r0, [r1], #4 Can be rewritten as: ldm.n r1!, {r0} LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode. llvm-svn: 272002
* [ARM] Transform LDMs into writeback form to save code size | James Molloy | 2016-06-07 | 1 | -3/+23
| | | | | | | | | | | | | | If we have an LDM that uses only low registers and doesn't write to its base register: ldm.w r0, {r1, r2, r3} And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding: ldm.n r0!, {r1, r2, r3} Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does that we don't. llvm-svn: 272000
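The conditions listed in this commit message can be sketched as a predicate. This is an illustration only, not the code from the patch: `canNarrowToWritebackLDM` and `isLowReg` are hypothetical names, registers are modelled as plain numbers with r0-r7 treated as low, and the minsize gating mentioned in the message is left out.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model: registers are numbered 0..15 and r0-r7 are "low".
bool isLowReg(unsigned Reg) { return Reg <= 7; }

// Returns true when "ldm.w Base, {Regs}" could become "ldm.n Base!, {Regs}"
// under the conditions described in the commit message.
bool canNarrowToWritebackLDM(unsigned Base, const std::vector<unsigned> &Regs,
                             bool BaseDeadAfter) {
  // The writeback clobbers Base, so Base must not be needed afterwards.
  if (!BaseDeadAfter || !isLowReg(Base))
    return false;
  // The LDM must not itself load into its base register.
  if (std::find(Regs.begin(), Regs.end(), Base) != Regs.end())
    return false;
  // The narrow encoding is assumed to accept only low registers.
  return std::all_of(Regs.begin(), Regs.end(), isLowReg);
}
```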
* [ARM] Incorrect relocation type for Thumb2 B<cond>.w | Peter Smith | 2016-06-07 | 1 | -0/+2
| | | | | | | | | | | | | | | | | The Thumb2 conditional branch B<cond>.W has a different encoding (T3) to the unconditional branch B.W (T4) as it needs to record <cond>. As the encoding is different the B<cond>.W is given a different relocation type. ELF for the ARM Architecture 4.6.1.6 (Table-13) states that R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the MC layer is using the R_ARM_THM_JUMP24 from B.W. This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the existing test that checks for R_ARM_THM_JUMP24 to expect R_ARM_THM_JUMP19. llvm-svn: 271997
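The mapping described in this message can be written as a minimal sketch. The enum and function names are hypothetical and do not correspond to the MC layer's actual types; only the two relocation names come from the commit message.

```cpp
#include <string_view>

// Hypothetical classification of the two wide Thumb2 branch forms.
enum class ThumbBranch { Unconditional, Conditional };

// B.W (encoding T4) keeps R_ARM_THM_JUMP24; B<cond>.W (encoding T3) has a
// different encoding and therefore uses R_ARM_THM_JUMP19.
constexpr std::string_view relocationFor(ThumbBranch Kind) {
  return Kind == ThumbBranch::Conditional ? "R_ARM_THM_JUMP19"
                                          : "R_ARM_THM_JUMP24";
}
```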
* [AVX512] Allow avx2 and sse41 nontemporal load intrinsics to select EVEX encoded instructions when VLX is enabled. | Craig Topper | 2016-06-07 | 2 | -11/+13
| | | | | | llvm-svn: 271988
* [AVX512] Remove unnecessary mayLoad, mayStore, hasSideEffects flags from instructions that have patterns that imply them. | Craig Topper | 2016-06-07 | 2 | -580/+488
| | | | | | Add the same set of flags to instructions that don't have patterns to imply them. llvm-svn: 271987
* [AVX512] Add NoVLX to a couple patterns that have VLX equivalents. | Craig Topper | 2016-06-07 | 1 | -1/+1
| | | | | | Ordering of the patterns in the .td file protects this, but it's better to be explicit. llvm-svn: 271986
* ARM: correct TLS access on WoA | Saleem Abdulrasool | 2016-06-07 | 5 | -4/+25
| | | | | | | | | | | | TLS access requires an offset from the TLS index. The index itself is the section-relative distance of the symbol. For ARM, the relevant relocation (IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may not be an immediate and must be lowered into a constant pool. This offset will not be base relocated. We were previously emitting the actual address of the symbol, which would be base relocated and would therefore be the value offset by the ImageBase + TLS Offset. llvm-svn: 271974
* ARM: clang-format a couple of switches, add comments | Saleem Abdulrasool | 2016-06-07 | 3 | -15/+25
| | | | | | | clang-format a couple of switches in preparation for a future change. Add some enumeration comments llvm-svn: 271973
* ARM: normalise space in the patterns | Saleem Abdulrasool | 2016-06-07 | 1 | -8/+7
| | | | | | Just adjust the whitespace for the selection patterns. NFC. llvm-svn: 271972
* AMDGPU: Add function for getting instruction size | Matt Arsenault | 2016-06-06 | 2 | -0/+51
| | | | llvm-svn: 271936
* AMDGPU: Fix constantexpr addrspacecasts | Matt Arsenault | 2016-06-06 | 2 | -4/+71
| | | | | | | | If we had a constant group address space cast the queue pointer wasn't enabled for the function, resulting in a crash on noreg later. llvm-svn: 271935
* [AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when src2 == VCC. | Artem Tamazov | 2016-06-06 | 2 | -14/+76
| | | | | | | | | | | | Another step toward unifying the llvm assembler/disassembler with sp3. In addition, CodeGen output is slightly improved, hence the changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 llvm-svn: 271900
* [KNL] Fix UMULO lowering. | Igor Breger | 2016-06-06 | 1 | -1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D21013 llvm-svn: 271891
* Remove dead function with incredibly broken assert. | Benjamin Kramer | 2016-06-06 | 1 | -6/+0
| | | | | | Found by clang-tidy's misc-assert-side-effect. llvm-svn: 271887
* [NFC] Silence gcc warning (-Wsign-compare) | Filipe Cabecinhas | 2016-06-06 | 1 | -1/+1
| | | | llvm-svn: 271882
* [AVX512] Add PALIGNR shuffle lowering for v32i16 and v16i32. | Craig Topper | 2016-06-06 | 1 | -0/+13
| | | | llvm-svn: 271870
* [X86][XOP] Added VPERMIL2PD/VPERMIL2PS raw mask decoding for target shuffle combines | Simon Pilgrim | 2016-06-05 | 3 | -0/+49
| | | | | | llvm-svn: 271834
* [X86][XOP] Added VPERMIL2PD/VPERMIL2PS as a target shuffle type | Simon Pilgrim | 2016-06-05 | 1 | -0/+16
| | | | llvm-svn: 271831
* [X86][XOP] Tidied up DecodeVPERMIL2PMask to more closely match DecodeVPERMILPMask. | Simon Pilgrim | 2016-06-05 | 1 | -3/+5
| | | | | | llvm-svn: 271830
* [AVX512] Add support for lowering PALIGNR for v64i8. | Craig Topper | 2016-06-05 | 1 | -0/+5
| | | | | | Could do this for other types too, but this is what's needed to replace the intrinsic with native IR in clang. llvm-svn: 271828
* [AVX512] Fix PANDN combining for v4i32/v8i32 when VLX is enabled. | Craig Topper | 2016-06-05 | 1 | -1/+2
| | | | | | v4i32/v8i32 ANDs aren't promoted to v2i64/v4i64 when VLX is enabled. llvm-svn: 271826
* [X86][XOP] Added VPERMIL2PD/VPERMIL2PS shuffle mask comment decoding | Simon Pilgrim | 2016-06-04 | 3 | -0/+109
| | | | llvm-svn: 271809
* [X86] Add the VR128L/H and VR256L/H to the list of vector register classes for inline asm constraints. | Craig Topper | 2016-06-04 | 1 | -1/+5
| | | | | | Also fix the comment on the function. llvm-svn: 271802
* X86: enable TLS on Windows itanium | Saleem Abdulrasool | 2016-06-04 | 1 | -0/+1
| | | | | | | Windows itanium is nearly identical to windows-msvc (MS ABI for C, itanium for C++). Enable the TLS support for the target similar to the MSVC model. llvm-svn: 271797
* [X86][AVX2] Fix v16i16 SHL lowering (PR27730) | Simon Pilgrim | 2016-06-04 | 1 | -2/+2
| | | | | | | | The AVX2 v16i16 shift lowering works by unpacking to 2 x v8i32, performing the shift and then truncating the result. The unpacking is used to place the values in the upper 16-bits so that we can correctly sign-extend for SRA shifts. Unfortunately we weren't ensuring that the lower 16-bits were zero to ensure that SHL correctly shifts in zero bits. llvm-svn: 271796
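A scalar model of a single lane may help show why the low 16 bits matter. This is a sketch under assumptions, not the lowering code: `shlViaUpperHalf` is a hypothetical name, `LowFill` stands for whatever the unpack leaves in the low half, and the shift amount is assumed to be below 16.

```cpp
#include <cstdint>

// Model one 16-bit lane: place X in the upper half of a 32-bit value,
// shift left, then keep the upper half as the truncated result.
uint16_t shlViaUpperHalf(uint16_t X, unsigned Amt, uint16_t LowFill) {
  uint32_t Wide = (static_cast<uint32_t>(X) << 16) | LowFill;
  Wide <<= Amt; // Amt < 16 is assumed
  return static_cast<uint16_t>(Wide >> 16);
}
```

With `LowFill` equal to zero the result matches a plain 16-bit shift left; with a nonzero `LowFill`, bits from the low half are shifted into the result, which is the kind of error the commit message describes.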
* [X86] Use smaller types to shrink the intrinsic lowering tables by about 12K. | Craig Topper | 2016-06-04 | 1 | -6/+6
| | | | llvm-svn: 271776
* [X86] Use X86ISD::ABS for lowering pabs SSSE3/AVX intrinsics to match AVX512. | Craig Topper | 2016-06-04 | 2 | -33/+36
| | | | | | Should allow those intrinsics to use the EVEX encoded instructions and get the extra registers when available. llvm-svn: 271775
* [AArch64] Spot SBFX-compatible code expressed with sign_extend. | Chad Rosier | 2016-06-03 | 1 | -0/+30
| | | | | | | This is very similar to r271677, but for extracts from i32 with the SIGN_EXTEND acting on an arithmetic shift. llvm-svn: 271717
* [WebAssembly] Emit type signatures for declared functions | Derek Schuff | 2016-06-03 | 3 | -10/+56
| | | | | | | | | | | | | | | | | | | | | Under emscripten, C code can take the address of a function implemented in JavaScript (which is exposed via an import in wasm). Because imports do not have linear memory addresses in wasm, we need to generate a thunk to be the target of the indirect call; it calls the import directly. To make this possible, LLVM needs to emit the type signatures for these functions, because they may not be called directly or referred to other than where the address is taken. This uses a new .s directive (.functype) which specifies the signature. Differential Revision: http://reviews.llvm.org/D20891 Re-apply r271599 but instead of bailing with an error when a declared function has multiple returns, replace it with a pointer argument. Also add the test case I forgot to 'git add' last time around. llvm-svn: 271703
* Code size optimisation: do not inline memcpy if this expansion results in more instructions than the library call. | Sjoerd Meijer | 2016-06-03 | 1 | -0/+6
| | | | | | | | Differential Revision: http://reviews.llvm.org/D20958 llvm-svn: 271678
* [AArch64] Spot SBFX-compatible code expressed with sign_extend_inreg. | Chad Rosier | 2016-06-03 | 1 | -0/+37
| | | | | | | | | | We were assuming all SBFX-like operations would have the shl/asr form, but often when the field being extracted is an i8 or i16, we end up with a SIGN_EXTEND_INREG acting on a shift instead. This is a port of r213754 from ARM to AArch64. llvm-svn: 271677
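The two shapes mentioned in this message can be compared with a scalar sketch for an 8-bit field. The function names are hypothetical, `Lsb + 8 <= 32` is assumed, and the signed conversions and right shift rely on two's-complement behaviour (guaranteed from C++20, and what mainstream compilers do earlier).

```cpp
#include <cstdint>

// shl/asr form: move the field to the top of the word, then shift it back
// down arithmetically so the field's top bit is replicated.
int32_t extractShlAsr8(uint32_t X, unsigned Lsb) {
  uint32_t Shifted = X << (32 - Lsb - 8);
  return static_cast<int32_t>(Shifted) >> 24;
}

// sign_extend_inreg form: shift the field down to bit 0, then sign-extend
// from 8 bits by converting through int8_t.
int32_t extractSextInReg8(uint32_t X, unsigned Lsb) {
  return static_cast<int8_t>(X >> Lsb);
}
```

Both functions compute the same signed 8-bit field, which is why either shape is a candidate for a single SBFX.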
* [test/AMDGPU] Square-braced-syntax for registers: add macro test/example. | Artem Tamazov | 2016-06-03 | 1 | -19/+45
| | | | | | | | | | Test added as per discussion in http://reviews.llvm.org/D20588. The macro is just a demonstration, useless in practice. Coding style fixes. Differential Revision: http://reviews.llvm.org/D20797 llvm-svn: 271675
* RAS extensions are part of ARMv8.2-A. This change enables them by introducing a new instruction to ARM and AArch64 targets and several system registers. | Sjoerd Meijer | 2016-06-03 | 13 | -24/+131
| | | | | | | | Patch by: Roger Ferrer Ibanez and Oliver Stannard Differential Revision: http://reviews.llvm.org/D20282 llvm-svn: 271670
* ARM target does not use printAliasInstr machinery which forces having special checks in ArmInstPrinter::printInstruction. This patch addresses this issue. | Sjoerd Meijer | 2016-06-03 | 7 | -137/+100
| | | | | | | | | | | | | | | | | | Not all special checks could be removed: either they involve elaborate conditions under which the alias is emitted (e.g. ldm/stm on sp may be pop/push, but only if the number of registers is >= 2), or the number of registers is multivalued (as again happens with ldm/stm) and they do not match the InstAlias pattern, which assumes single-valued operands. Patch by: Roger Ferrer Ibanez Differential Revision: http://reviews.llvm.org/D20237 llvm-svn: 271667
* [AMDGPU] Assembler: More tests for SDWA instructions. Fix for SDWA float modifiers. | Sam Kolton | 2016-06-03 | 2 | -16/+23
| | | | | | | | | | | | | | Summary: Depends on D20625 Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20674 llvm-svn: 271662
* [mips] EABI CodeGen is completely untested and seems to have bitrotted. Remove it. | Daniel Sanders | 2016-06-03 | 7 | -69/+3
| | | | | | | | | | | | | | | | | | Summary: There are no tests*, no EABI buildbots, and simple test cases do not work. * There is a single MIPS16 test using a mips*-gnueabi triple but this test doesn't test EABI and the triple doesn't cause EABI to be used. Reviewers: sdardis Subscribers: tberghammer, danalbert, srhines, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D20906 llvm-svn: 271658
* [AMDGPU] Assembler: Custom converters for SDWA instructions. Support for _dpp and _sdwa suffixes in mnemonics. | Sam Kolton | 2016-06-03 | 2 | -62/+162
| | | | | | | | | | | | | | | | Summary: Added custom converters for SDWA instructions to support optional operands and modifiers. Added support for _dpp and _sdwa suffixes, which force DPP or SDWA encoding for instructions. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20625 llvm-svn: 271655
* Remove bogus initialization of the PPC and Hexagon SelectionDAGISel subclasses. | Chandler Carruth | 2016-06-03 | 2 | -37/+2
| | | | | | | | | | | | | | | | | These are not passes proper. We don't support registering them, they can't be constructed with default arguments, and the ID is actually in a base class. Only these two targets even had any boilerplate to try to do this, and it had to be munged out of the INITIALIZE_PASS macros to work. What's worse, the boilerplate has rotted and the "name" of the pass is actually the description string now!!! =/ All of this is completely unnecessary. No other target bothers, and nothing breaks if you don't initialize them because CodeGen has an entirely separate initialization path that is somewhat more durable than relying on the implicit initialization the way the 'opt' tool does for registered passes. llvm-svn: 271650
* Use the standard INITIALIZE_PASS macro rather than hand rolling a (not entirely correct) version of its contents. | Chandler Carruth | 2016-06-03 | 1 | -9/+2
| | | | | | llvm-svn: 271649
* [mips] Implement 'la' macro in PIC mode for O32. | Daniel Sanders | 2016-06-03 | 4 | -35/+83
| | | | | | | | | | | | | | | | Summary: N32 support will follow in a later patch since the symbol version of 'la' incorrectly believes N32 to have 64-bit pointers and rejects it early. This fixes the three incorrectly expanded 'la' macros found in bionic. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D20820 llvm-svn: 271644
* [X86][XOP] Support for VPERMIL2PD/VPERMIL2PS 2-input shuffle instructions | Simon Pilgrim | 2016-06-03 | 5 | -17/+39
| | | | | | | | | | | | This patch begins adding support for lowering to the XOP VPERMIL2PD/VPERMIL2PS shuffle instructions - adding the X86ISD::VPERMIL2 opcode and cleaning up the usage. The internal llvm intrinsics were assuming the shuffle mask operand was the same type as the float/double input operands (I guess to simplify the intrinsic definitions in X86InstrXOP.td to a single value type). These needed changing to integer types (matching the clang builtin and the AMD intrinsics definitions), an auto upgrade path is added to convert old calls. Mask decoding/target shuffle support will be added in future patches. Differential Revision: http://reviews.llvm.org/D20049 llvm-svn: 271633
* [X86] Fix some isel patterns to remove an operand from some multiclasses. NFC | Craig Topper | 2016-06-03 | 1 | -65/+64
| | | | llvm-svn: 271631
* [AVX512] Ensure EVEX vpshufd, vpshuflw, and vpshufhw have isel priority over the VEX encoded ones. | Craig Topper | 2016-06-03 | 1 | -6/+8
| | | | | | llvm-svn: 271629
* [AVX512] Fix shuffle comment printing for EVEX encoded PSHUFD, PSHUFHW, and PSHUFLW. | Craig Topper | 2016-06-03 | 1 | -29/+17
| | | | | | llvm-svn: 271628
* [X86] Simplify a multiclass to remove a parameter. NFC | Craig Topper | 2016-06-03 | 1 | -31/+30
| | | | llvm-svn: 271627
* [X86] Remove unnecessary pattern predicates from the vector bit cast patterns. | Craig Topper | 2016-06-03 | 2 | -100/+94
| | | | | | The types have to be legal and there are no alternative patterns. Saves almost 200 bytes in isel table. llvm-svn: 271625