path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Fix indentation. NFCI. | Simon Pilgrim | 2017-02-13 | 1 | -1/+1
  llvm-svn: 294959
* [CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base | Sanne Wouda | 2017-02-13 | 1 | -0/+5
  Summary: The attached test case fails with "fatal error: error in backend: misaligned pc-relative fixup value" as the jump table is misaligned. The EmitAlignment existed already for ARM and Thumb-1 code, but was missing for Thumb-2. The test checks that the fatal error disappears when generating an obj file, as well as checking the align directive is there when producing an asm file.
  Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker
  Reviewed By: samparker
  Subscribers: samparker, aemerson, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29650
  llvm-svn: 294950
* [Thumb-1] TBB generation: spot redefinitions of index register | James Molloy | 2017-02-13 | 1 | -1/+17
  We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that a particular register in that sequence is killed (so it can be clobbered by the pseudo).
  We weren't noticing if an errant MOV or other instruction had infiltrated the sequence we were walking. If it had, and it defined the register we've already identified as killed, it makes it live across the tBR_JT and thus unclobberable. Notice this case and bail out.
  llvm-svn: 294949
* [ARM] Register ConstantIslands with the pass manager | James Molloy | 2017-02-13 | 3 | -1/+8
  This allows us to use -stop-before/-stop-after/-run-pass - we can now write .mir tests.
  llvm-svn: 294948
* [ARM] Use VCMP, not VCMPE, for floating point equality comparisons | James Molloy | 2017-02-13 | 5 | -29/+60
  When generating a floating point comparison we currently unconditionally generate VCMPE. This has the side effect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN.
  It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard:
    The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of the relationships less, greater, and equal is true. Relational operators may raise the "invalid" floating-point exception when argument values are NaNs.
  The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal).
  Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen.
  llvm-svn: 294945
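The behaviour this commit relies on can be observed from C++ through the floating-point environment. The following is a minimal sketch, not LLVM code: the helper names `equalityRaisesInvalid` and `quietNaNEqualsItself` are hypothetical, and it assumes a hosted implementation with working `<cfenv>` support. It checks that an equality comparison against a quiet NaN leaves the cumulative Invalid flag clear, which is what a quiet compare (VCMP on ARM) provides. Compilers without full FENV_ACCESS support may fold or reorder floating-point operations, so treat the flag test as indicative only.

```cpp
#include <cfenv>
#include <limits>

// Hypothetical helper (not LLVM code): reports whether evaluating `a == b`
// sets the cumulative FE_INVALID flag in the floating-point environment.
bool equalityRaisesInvalid(double a, double b) {
  // volatile discourages the compiler from folding the comparison away.
  volatile double x = a;
  volatile double y = b;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile bool equal = (x == y);
  (void)equal;
  return std::fetestexcept(FE_INVALID) != 0;
}

// Equality is fully defined on unordered inputs: a quiet NaN compares
// unequal to everything, including itself.
bool quietNaNEqualsItself() {
  volatile double source = std::numeric_limits<double>::quiet_NaN();
  double value = source;  // single volatile read, then a plain comparison
  return value == value;
}
```

A relational operator such as `<` is the case where the C standard permits Invalid to be raised; whether a given compiler and target actually do so varies, so the sketch deliberately does not assert on it.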
* [X86][SSE] Create matchVectorShuffleWithUNPCK helper function. | Simon Pilgrim | 2017-02-13 | 1 | -46/+42
  Currently only used by target shuffle combining - will use it for lowering as well in a future patch.
  llvm-svn: 294943
* [X86][AVX512] Fix operand classes for some AVX512 instructions to keep consistency between VEX/EVEX versions of the same instruction. | Ayman Musa | 2017-02-13 | 1 | -17/+20
  Differential Revision: https://reviews.llvm.org/D29873
  llvm-svn: 294937
* [X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors. | Craig Topper | 2017-02-13 | 1 | -21/+18
  We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.
  llvm-svn: 294931
* [X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR. | Craig Topper | 2017-02-12 | 1 | -5/+7
  This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG-combine-like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.
  llvm-svn: 294929
* [X86] Fix typo in function name. NFCI. | Simon Pilgrim | 2017-02-12 | 1 | -2/+2
  convertBitVectorToUnsiged -> convertBitVectorToUnsigned
  llvm-svn: 294914
* [AVX-512] Add various EVEX move instructions to load folding tables using the VEX equivalents as a guide. | Craig Topper | 2017-02-12 | 1 | -4/+10
  llvm-svn: 294908
* [AVX-512] Add VMOV64toSDZrm CodeGenOnly instruction based on the same instruction from AVX/SSE. | Craig Topper | 2017-02-12 | 1 | -0/+4
  I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables.
  llvm-svn: 294907
* [X86] Fix a couple instruction names to use 'mr' instead of 'rm' to indicate they are stores. | Craig Topper | 2017-02-12 | 1 | -2/+2
  AVX-512 version was already named with 'mr'.
  llvm-svn: 294906
* [AVX-512] Add VPEXTRD/Q to load folding tables. | Craig Topper | 2017-02-12 | 1 | -0/+2
  llvm-svn: 294905
* [X86][SSE] Update argument names to match function name. NFCI. | Simon Pilgrim | 2017-02-12 | 1 | -12/+13
  The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
  llvm-svn: 294900
* [X86][AVX2] Add support for combining target shuffles to VPMOVZX | Simon Pilgrim | 2017-02-12 | 1 | -6/+11
  Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
  llvm-svn: 294896
* AMDGPU::expandMemIntrinsicUses(): Fix an uninitialized variable. | NAKAMURA Takumi | 2017-02-12 | 1 | -1/+1
  This function returned true or undef.
  llvm-svn: 294895
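The class of bug described here is easy to reproduce in miniature. The following is a minimal sketch and not the AMDGPU code: the function `replaceNegatives` and its logic are hypothetical. A pass-style function tracks whether it changed anything; if the flag is declared without an initializer and no change occurs, the function returns an indeterminate value.

```cpp
#include <vector>

// Hypothetical pass-style helper (not LLVM code): clamps negative entries to
// zero and reports whether anything was modified.
bool replaceNegatives(std::vector<int> &Values) {
  // Writing `bool Changed;` here would leave the flag uninitialized, so the
  // function would return either true or an indeterminate value.
  bool Changed = false;
  for (int &V : Values) {
    if (V < 0) {
      V = 0;
      Changed = true;
    }
  }
  return Changed;
}
```

Initializing the flag at its declaration keeps the "nothing changed" path well defined without any other restructuring.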
* AVX-512: Fixed DWARF register numbers for XMM16-31 | Elena Demikhovsky | 2017-02-12 | 1 | -16/+16
  The reference is here: https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
  llvm-svn: 294890
* [X86] Move code for using blendi for insert_subvector out to an isel pattern. | Craig Topper | 2017-02-11 | 2 | -27/+53
  This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
  llvm-svn: 294876
* [X86][SSE] Use VSEXT/VZEXT constant folding for SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG | Simon Pilgrim | 2017-02-11 | 1 | -1/+6
  Preparatory step for PR31712
  llvm-svn: 294874
* [X86][SSE] Improve VSEXT/VZEXT constant folding. | Simon Pilgrim | 2017-02-11 | 1 | -11/+18
  Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
  llvm-svn: 294873
* [X86][SSE] Add early-out when trying to match blend shuffle. NFCI. | Simon Pilgrim | 2017-02-11 | 1 | -3/+4
  llvm-svn: 294864
* Fix indentation in X86ISelLowering. NFC | Amaury Sechet | 2017-02-11 | 1 | -8/+8
  llvm-svn: 294859
* [AVX-512] Add VPMINS/MINU/MAXS/MAXU instructions to load folding tables. | Craig Topper | 2017-02-11 | 1 | -0/+136
  llvm-svn: 294858
* [X86] Improve alphabetizing of load folding tables. NFC | Craig Topper | 2017-02-11 | 1 | -18/+18
  llvm-svn: 294857
* [X86][SSE] Convert getTargetShuffleMaskIndices to use getTargetConstantBitsFromNode. | Simon Pilgrim | 2017-02-11 | 1 | -75/+25
  Removes duplicate constant extraction code in getTargetShuffleMaskIndices.
  getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fails if the caller doesn't support undef bits.
  llvm-svn: 294856
* [X86] Merge repeated getScalarValueSizeInBits calls. NFCI. | Simon Pilgrim | 2017-02-11 | 1 | -7/+7
  llvm-svn: 294852
* [X86][3DNow!] Enable PFSUB<->PFSUBR commutation | Simon Pilgrim | 2017-02-11 | 2 | -2/+14
  llvm-svn: 294847
* [X86][3DNow!] Enable commutation for PFADD/PFMUL/PFCMPEQ/PAVGUSB/PMULHRW | Simon Pilgrim | 2017-02-11 | 1 | -8/+10
  All commutations confirmed to give identical results - note PFMAX/PFMIN do not.
  PFSUB<->PFSUBR should be commutable as well.
  llvm-svn: 294846
* Fix "left shift of negative value -1" introduced by r294805 | Vitaly Buka | 2017-02-11 | 1 | -1/+1
  llvm-svn: 294843
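The message quoted in the subject is the wording UBSan uses: before C++20, left-shifting a negative signed value such as `-1 << N` is undefined behavior. The change made in this commit is not shown in the log. One common way to avoid the diagnostic, sketched below with the hypothetical helper `highBitsMask`, is to perform the shift on an unsigned value.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM code): returns a 32-bit mask with the low
// `Shift` bits clear. Requires Shift < 32.
constexpr std::uint32_t highBitsMask(unsigned Shift) {
  // `-1 << Shift` shifts a negative int, which is undefined before C++20.
  // Shifting an all-ones unsigned value yields the same bit pattern safely.
  return ~std::uint32_t{0} << Shift;
}
```

For example, `highBitsMask(4)` evaluates to `0xFFFFFFF0`, the same bits a two's-complement `-1 << 4` would produce.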
* Move symbols from the global namespace into (anonymous) namespaces. NFC. | Benjamin Kramer | 2017-02-11 | 4 | -6/+7
  llvm-svn: 294837
* [AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables. | Craig Topper | 2017-02-11 | 1 | -0/+4
  llvm-svn: 294830
* [AVX-512] Fix apparent typo in instruction name VMOVSSDrr_REV->VMOVSDZrr_REV. | Craig Topper | 2017-02-11 | 1 | -1/+1
  llvm-svn: 294829
* [AVX-512] Add VPSADBW instructions to load folding tables. | Craig Topper | 2017-02-11 | 1 | -0/+3
  llvm-svn: 294827
* [X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available. | Craig Topper | 2017-02-11 | 1 | -4/+19
  Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available.
  This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp.
  Overall I think this produces better results in the modified test cases.
  llvm-svn: 294824
* [ARM] Make f16 interleaved accesses expensive. | Ahmed Bougacha | 2017-02-11 | 1 | -1/+2
  There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there.
  Teach the cost model to consider f16 interleaved operations as expensive. Otherwise, we are all but guaranteed to end up with a large block of scalarized vector code.
  llvm-svn: 294819
* [ARM] Don't lower f16 interleaved accesses. | Ahmed Bougacha | 2017-02-11 | 1 | -0/+14
  There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there.
  Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure.
  llvm-svn: 294818
* [WebAssembly] Remove old experimental disassembler code. | Dan Gohman | 2017-02-11 | 1 | -84/+2
  Remove support for disassembling an old experimental wasm binary format, which is no longer in use anywhere.
  llvm-svn: 294809
* [Hexagon] Introduce Hexagon V62 | Krzysztof Parzyszek | 2017-02-10 | 18 | -61/+4032
  llvm-svn: 294805
* [PPC] Silence warning in Release builds. | Benjamin Kramer | 2017-02-10 | 1 | -2/+1
  llvm-svn: 294791
* Fix a silly syntax error. | Tim Shen | 2017-02-10 | 1 | -2/+2
  llvm-svn: 294783
* [XRay] Implement powerpc64le xray. | Tim Shen | 2017-02-10 | 3 | -2/+101
  Summary: powerpc64 big-endian is not supported, but I believe that most logic can be shared, except for xray_powerpc64.cc. Also add a function InvalidateInstructionCache to xray_util.h, which is copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest, and I don't know how.
  Reviewers: dberris, echristo, iteratee, kbarton, hfinkel
  Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29742
  llvm-svn: 294781
* [Hexagon] Remove unused .td files | Krzysztof Parzyszek | 2017-02-10 | 7 | -2572/+0
  llvm-svn: 294775
* [X86] Bitcast subvector before broadcasting it. | Ahmed Bougacha | 2017-02-10 | 1 | -1/+10
  Since r274013, we've been looking through bitcasts on broadcast inputs. In the scalar-folding case (from a load, build_vector, or sc2vec), the input type didn't matter, as we'd simply bitcast the resulting scalar back.
  However, when broadcasting a 128-bit-lane-aligned element, we create an EXTRACT_SUBVECTOR. Use proper types, by creating an extract_subvector of the original input type.
  llvm-svn: 294774
* [ARM] Fix incorrect mask bits in MSR encoding for write_register intrinsic | John Brawn | 2017-02-10 | 1 | -10/+6
  In the encoding of system registers in the M-class MSR instruction the mask bits should be 2 for registers that don't take a _<bits> qualifier (the instruction is unpredictable otherwise), and should also be 2 if the register takes a _<bits> qualifier but it's not present as no _<bits> is an alias for _nzcvq.
  Differential Revision: https://reviews.llvm.org/D29828
  llvm-svn: 294762
* [Hexagon] Replace instruction definitions with auto-generated ones | Krzysztof Parzyszek | 2017-02-10 | 37 | -12829/+48409
  llvm-svn: 294753
* Move some error handling down to MCStreamer. | Rafael Espindola | 2017-02-10 | 2 | -2/+2
  This makes sure we get the same redefinition rules regardless of who is printing (asm parser, codegen) and to what (asm, obj).
  This fixes an unintentional regression in r293936.
  llvm-svn: 294752
* [X86][SSE] Use SDValue::getConstantOperandVal helper. NFCI. | Simon Pilgrim | 2017-02-10 | 1 | -11/+6
  Also reordered an if statement to test low cost comparisons first.
  llvm-svn: 294748
* [X86][SSE] Add support for extracting target constants from BUILD_VECTOR | Simon Pilgrim | 2017-02-10 | 1 | -0/+17
  In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet.
  Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode.
  llvm-svn: 294746
* [X86][SSE] Add missing comment describing combining to SHUFPS. NFCI | Simon Pilgrim | 2017-02-10 | 1 | -0/+2
  llvm-svn: 294745