path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Revert part of "AArch64: Do not test for CPUs, use SubtargetFeatures" | Evandro Menezes | 2016-09-20 | 2 | -6/+0
  This reverts part of commit 119e358d9635c8d1f3e7aee67e3ea3b8a62f8db6 by removing FeatureUseRSqrt et al per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).
  llvm-svn: 282001
* Revert "[AArch64] Use the reciprocal estimation machinery" | Evandro Menezes | 2016-09-20 | 5 | -101/+3
  This reverts commit b7d42b0048f65346e9fa37fb65defeea7ce8c337 per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).
  llvm-svn: 282000
* Revert "[AArch64] Properly validate the reciprocal estimation." | Evandro Menezes | 2016-09-20 | 1 | -6/+0
  This reverts commit ad8ca1528242e2a4cb363e3779309e70eb7a430e per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).
  llvm-svn: 281999
* X86: loosen an overly aggressive MachO assertion | Saleem Abdulrasool | 2016-09-20 | 1 | -2/+6
  We would assert that the FP setup CFI always used esp/rsp. This held up in practice when the code was generated from IR. However, with the integrated assembler, the input may be user-specified assembly. In such a case, we cannot assume that the function implementation has a compact unwind representation. Loosen the assertion into a check and bail if we cannot represent the frame pointer in the compact unwinding. Addresses PR30453!
  llvm-svn: 281986
* Remove more guts of TargetMachine::getNameWithPrefix and migrate one check to the TLOF mach-o version. | Eric Christopher | 2016-09-20 | 1 | -8/+1
  NFC intended.
  llvm-svn: 281983
* Remove a use of subtarget initialization in the X86 backend so we can get rid of the default subtarget. | Eric Christopher | 2016-09-20 | 1 | -1/+4
  NFC intended.
  llvm-svn: 281982
* Remove extra argument used once on TargetMachine::getNameWithPrefix and inline the result into the singular caller. | Eric Christopher | 2016-09-20 | 1 | -3/+2
  llvm-svn: 281981
* GlobalISel: split aggregates for PCS lowering | Tim Northover | 2016-09-20 | 2 | -41/+136
  This should match the existing behaviour for passing complicated struct and array types, in particular HFAs come through like that from Clang. For C & C++ we still need to somehow support all the weird ABI flags, or at least those that are present in the IR (signext, byval, ...), and stack-based parameter passing.
  llvm-svn: 281977
* AVX-512: Fixed a bug in lowering saturated operations on KNL. | Elena Demikhovsky | 2016-09-20 | 1 | -2/+8
  The generated code is still not optimal.
  Differential Revision: https://reviews.llvm.org/D24723
  llvm-svn: 281966
* [AMDGPU] Refactor VOP3 instruction TD definitions | Valery Pykhtin | 2016-09-20 | 6 | -373/+448
  Differential revision: https://reviews.llvm.org/D24664
  llvm-svn: 281965
* [AVX-512] Teach X86InstrInfo::copyPhysReg to use a 512-bit move if XMM16-XMM31 or YMM16-YMM31 are the source or dest of the copy and VLX is not supported. | Craig Topper | 2016-09-20 | 3 | -5/+38
  This can happen with SUBREG_TO_REG of ZMM16-ZMM31. Fixes PR30430.
  llvm-svn: 281959
* [AVX-512] Use 512-bit vcvtps2ph/vcvtph2ps to implement fp_to_f16/f16_to_fp when F16C and VLX are not supported. | Craig Topper | 2016-09-20 | 3 | -2/+32
  Fixes PR23941.
  llvm-svn: 281958
* [x86] fix variable names; NFC | Sanjay Patel | 2016-09-20 | 1 | -22/+23
  llvm-svn: 281953
* [x86] use getSignBit() to simplify code; NFCI | Sanjay Patel | 2016-09-19 | 1 | -4/+3
  llvm-svn: 281944
* [AMDGPU] Refactor VOPC instruction TD definitions | Valery Pykhtin | 2016-09-19 | 6 | -648/+1118
  Differential Revision: https://reviews.llvm.org/D24546
  llvm-svn: 281903
* [AArch64] Fix encoding for lsl #12 in add/sub immediates | Diana Picus | 2016-09-19 | 1 | -2/+2
  Whenever an add/sub immediate needs a fixup, we set that immediate field to zero, which is correct, but we also set the shift bits to zero, which is wrong for instructions that use lsl #12. This patch makes sure that if lsl #12 was used, it will appear in the encoding of the instruction.
  Differential Revision: https://reviews.llvm.org/D23930
  llvm-svn: 281898
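The encoding problem described in the commit above can be sketched in a few lines. This is only an illustration: it assumes the A64 ADD/SUB (immediate) layout with imm12 in bits 21:10 and a shift field in bits 23:22, where 01 selects lsl #12. The helper name `encodeWithPendingFixup` is hypothetical and is not the function changed by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed A64 ADD/SUB (immediate) field positions.
constexpr uint32_t Imm12Mask = 0xFFFu << 10; // imm12 occupies bits 21:10
constexpr uint32_t ShiftMask = 0x3u << 22;   // shift field occupies bits 23:22
constexpr uint32_t Lsl12Bits = 0x1u << 22;   // shift == 01 means lsl #12

// Hypothetical helper: zero the immediate so a later fixup can fill it in,
// but keep the lsl #12 marker if the instruction used it.
uint32_t encodeWithPendingFixup(uint32_t Insn, bool UsesLsl12) {
  Insn &= ~Imm12Mask; // the immediate is supplied by the fixup
  Insn &= ~ShiftMask; // start from a clean shift field
  if (UsesLsl12)
    Insn |= Lsl12Bits; // the shift must survive in the encoding
  return Insn;
}
```

With this layout, clearing both fields unconditionally would turn an `lsl #12` form into a plain immediate form, which is the behaviour the commit message describes as incorrect.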
* [AMDGPU] Fix s_branch with -1 offset | Sam Kolton | 2016-09-19 | 1 | -5/+2
  Summary: When the target of an s_branch instruction is the instruction itself, the backend should emit an offset of -1, but instead it emits 0.
  '''
  label:
  s_branch label // should emit [0xff,0xff,0x82,0xbf]
  '''
  Tom, Matt: why are we adjusting fixup values in the applyFixup() method instead of processFixup()? processFixup() calls adjustFixupValue() but does nothing with its result.
  Reviewers: vpykhtin, artem.tamazov, tstellarAMD
  Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl
  Differential Revision: https://reviews.llvm.org/D24671
  llvm-svn: 281896
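The expected bytes in the commit above can be reproduced with a small sketch. It assumes the SOPP instruction layout (fixed high bits 0xBF800000, opcode in bits 22:16, simm16 in bits 15:0), an s_branch opcode of 2, and an offset counted in dwords from the instruction after the branch. The function names are hypothetical and do not correspond to LLVM's fixup code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed SOPP encoding constants.
constexpr uint32_t SOPPBase = 0xBF800000u;
constexpr uint32_t SBranchOpcode = 2;

// Offset in dwords, relative to the instruction that follows the branch.
int16_t branchOffset(int64_t BranchAddr, int64_t TargetAddr) {
  return static_cast<int16_t>((TargetAddr - BranchAddr - 4) / 4);
}

// Encode s_branch and return the instruction as little-endian bytes.
std::array<uint8_t, 4> encodeSBranch(int16_t SImm16) {
  uint32_t Word = SOPPBase | (SBranchOpcode << 16) |
                  static_cast<uint16_t>(SImm16);
  return {uint8_t(Word), uint8_t(Word >> 8), uint8_t(Word >> 16),
          uint8_t(Word >> 24)};
}
```

A branch whose target is its own address gives `branchOffset(A, A) == -1`, and `encodeSBranch(-1)` yields the bytes quoted in the summary.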
* [Thumb] Set correct initial mapping symbol for big-endian thumb | Oliver Stannard | 2016-09-19 | 1 | -1/+2
  The initial mapping symbol state is set from the triple, but we only checked for the little-endian thumb triple, so could end up with an ARM mapping symbol for big-endian thumb.
  Differential Revision: https://reviews.llvm.org/D24553
  llvm-svn: 281894
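As a rough illustration of the check described above, the sketch below picks the initial mapping symbol from an architecture kind. The `Arch` enum only mirrors the idea of separate little-endian and big-endian Thumb architectures; it and `initialMappingSymbol` are hypothetical stand-ins, not the code in the ARM backend.

```cpp
#include <cassert>

// Hypothetical stand-in for the architecture component of a triple.
enum class Arch { arm, armeb, thumb, thumbeb };

enum class MappingSymbol { ARM, Thumb };

// Both Thumb variants must start in the Thumb state; checking only
// Arch::thumb is the kind of omission the commit describes.
MappingSymbol initialMappingSymbol(Arch A) {
  bool IsThumb = A == Arch::thumb || A == Arch::thumbeb;
  return IsThumb ? MappingSymbol::Thumb : MappingSymbol::ARM;
}
```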
* ARM: check alignment before transforming ldr -> ldm (or similar). | Tim Northover | 2016-09-19 | 2 | -8/+28
  ldm and stm instructions always require 4-byte alignment on the pointer, but we weren't checking this before trying to reduce code-size by replacing a post-indexed load/store with them. Unfortunately, we were also dropping this information in DAG ISel too, but that's easy enough to fix.
  llvm-svn: 281893
* [X86,AVX-512] Use INSERT_SUBREG instead of SUBREG_TO_REG when the input is not the output of an instruction. | Craig Topper | 2016-09-19 | 2 | -32/+44
  SUBREG_TO_REG is supposed to indicate that the super register has been zeroed, but we can't prove that if we don't know where it came from.
  llvm-svn: 281885
* [AVX-512] Add support for lowering fp_to_f16 and f16_to_fp when VLX is supported regardless of whether F16C is also supported. | Craig Topper | 2016-09-19 | 3 | -2/+23
  Still need to add support for lowering using AVX512F when neither VLX nor F16C is supported.
  llvm-svn: 281884
* [XRay] ARM 32-bit no-Thumb support in LLVM | Dean Michael Berris | 2016-09-19 | 9 | -33/+128
  This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are:
  https://reviews.llvm.org/D23932 (Clang test)
  https://reviews.llvm.org/D23933 (compiler-rt)
  Differential Revision: https://reviews.llvm.org/D23931
  llvm-svn: 281878
* [AVX-512] Don't lower CVTPD2PS intrinsics to ISD::FP_ROUND with an X86 rounding mode encoding in the second operand. | Craig Topper | 2016-09-18 | 2 | -3/+28
  This immediate should only be 0 or 1 and indicates if the truncation loses precision. Also enhance an assert in SelectionDAG::getNode to flag this sort of problem in the future.
  llvm-svn: 281868
* [AVX-512] Stop lowering avx512_mask_sqrt intrinsics to ISD::FSQRT with a second operand containing an X86 specific rounding mode encoding that doesn't belong. | Craig Topper | 2016-09-18 | 1 | -2/+2
  llvm-svn: 281867
* [X86] Fix typo in comment. NFC | Craig Topper | 2016-09-18 | 1 | -1/+1
  llvm-svn: 281862
* [AVX-512] Add memory load patterns for the legacy SSE scalar fp to integer conversion intrinsics to be consistent across all instruction sets. | Craig Topper | 2016-09-18 | 1 | -1/+16
  llvm-svn: 281861
* [AVX-512] Remove COPY_TO_REGCLASS from a few patterns that already had the correct register class. | Craig Topper | 2016-09-18 | 1 | -8/+8
  llvm-svn: 281860
* [X86][SSE] Improve recognition of uitofp conversions that can be performed as sitofp | Simon Pilgrim | 2016-09-18 | 1 | -3/+9
  With D24253 we can now use SelectionDAG::SignBitIsZero with vector operations. This patch uses SelectionDAG::SignBitIsZero to recognise that a zero sign bit means that we can use a sitofp instead of a uitofp (which is not directly supported on pre-AVX512 hardware). While AVX512 does provide support for uitofp, the conversion to sitofp should not cause any regressions.
  Differential Revision: https://reviews.llvm.org/D24343
  llvm-svn: 281852
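The reasoning behind the commit above is that an unsigned value whose top bit is clear has the same numeric value when reinterpreted as signed, so the signed conversion gives the same result. The scalar sketch below illustrates this at run time. It is an analogy only: the real transform works on vector DAG nodes and uses compile-time known-bits information, and `signBitIsZero` here is a hypothetical helper, not SelectionDAG::SignBitIsZero.

```cpp
#include <cassert>
#include <cstdint>

// Run-time analogue of "the sign bit is known to be zero".
bool signBitIsZero(uint32_t X) { return (X >> 31) == 0; }

// Convert an unsigned value to float, taking the signed path when it is safe.
float convertUnsigned(uint32_t X) {
  if (signBitIsZero(X))
    return static_cast<float>(static_cast<int32_t>(X)); // sitofp-style path
  return static_cast<float>(X);                         // uitofp-style path
}
```

For inputs with the top bit set, the signed path would produce a negative number, which is why the sign-bit check is required before switching conversions.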
* [X86][SSE] Improve target shuffle mask extraction | Simon Pilgrim | 2016-09-17 | 1 | -10/+14
  Add ability to extract vXi64 'vzext_movl' masks on 32-bit targets
  llvm-svn: 281834
* [Hexagon] segv while processing SUnit with nullNodePtr | Ron Lieberman | 2016-09-17 | 1 | -0/+4
  Added BoundaryNode check to isBestZeroLatency function.
  llvm-svn: 281825
* AMDGPU: Fix broken FrameIndex handling | Matt Arsenault | 2016-09-17 | 5 | -99/+19
  We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index.
  The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory.
  llvm-svn: 281824
* AMDGPU: Rename spill operands to match real instruction | Matt Arsenault | 2016-09-17 | 2 | -13/+13
  llvm-svn: 281823
* AMDGPU: Push bitcasts through build_vector | Matt Arsenault | 2016-09-17 | 1 | -0/+27
  This reduces the number of copies and reg_sequences when using fp constant vectors. This significantly reduces the code size in local-stack-alloc-bug.ll
  llvm-svn: 281822
* AMDGPU: Use i64 scalar compare instructions | Matt Arsenault | 2016-09-17 | 4 | -12/+45
  VI added eq/ne for i64, so use them.
  llvm-svn: 281800
* AMDGPU/SI: Fix kernel argument ABI for HSA | Tom Stellard | 2016-09-16 | 1 | -1/+2
  Summary: i8, i16, and f16 values are not extended to 32-bit in the HSA kernel ABI.
  Reviewers: arsenm
  Subscribers: arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl
  Differential Revision: https://reviews.llvm.org/D24621
  llvm-svn: 281789
* AMDGPU: Allow some control flow intrinsics to be CSEd | Matt Arsenault | 2016-09-16 | 5 | -27/+80
  These clean up some unnecessary 'or' instructions in cases with complex loops. In the original testcase where I noticed this, the same 'or' with exec was repeated 5 or 6 times in a row. With this, only one is emitted, or sometimes a copy.
  llvm-svn: 281786
* AMDGPU: Refactor kernel argument lowering | Tom Stellard | 2016-09-16 | 4 | -52/+109
  Summary: The main challenge in lowering kernel arguments for AMDGPU is determining the memory type of the argument. The generic calling convention code assumes that only legal register types can be stored in memory, but this is not the case for AMDGPU.
  This consolidates all the logic AMDGPU uses for deducing memory types into a single function. This will make it much easier to support different ABIs in the future.
  Reviewers: arsenm
  Subscribers: arsenm, wdng, nhaehnle, llvm-commits, yaxunl
  Differential Revision: https://reviews.llvm.org/D24614
  llvm-svn: 281781
* AMDGPU: Use SOPK compare instructions | Matt Arsenault | 2016-09-16 | 7 | -51/+152
  llvm-svn: 281780
* AMDGPU/SI: Add support for triples with the mesa3d operating system | Tom Stellard | 2016-09-16 | 7 | -13/+22
  Summary: mesa3d will use the same kernel calling convention as amdhsa, but it will handle everything else like the default 'unknown' OS type.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits, kzhuravl
  Differential Revision: https://reviews.llvm.org/D22783
  llvm-svn: 281779
* Defer asm errors to post-statement failure | Nirav Dave | 2016-09-16 | 7 | -223/+107
  Recommitting after fixing AsmParser initialization and X86 inline asm error cleanup.
  Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller, which has better context.
  As part of this, many minor cleanups to the Parser:
    * Unify parser cleanup on error
    * Add workaround for incorrect return values in ParseDirective instances
    * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes.
    * Fix AArch64 test cases checking for spurious error messages that are now fixed.
  These changes should be backwards compatible with current Target Parsers so long as the error statuses are correctly returned in appropriate functions.
  Reviewers: rnk, majnemer
  Subscribers: aemerson, jyknight, llvm-commits
  Differential Revision: https://reviews.llvm.org/D24047
  llvm-svn: 281762
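The deferred-error idea in the commit above can be illustrated with a toy parser: helpers record an error and return a failure status, the caller may add context, and the pending errors are emitted once the statement is finished. Every name below (`ToyParser`, `error`, `addContext`, `parseStatement`) is hypothetical and is not the MCAsmParser API. The convention that `true` signals failure is an assumption made for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

class ToyParser {
  std::vector<std::string> Pending; // errors recorded but not yet reported
  std::vector<std::string> Emitted; // errors reported after the statement

public:
  // Record an error and return true to signal failure to the caller.
  bool error(const std::string &Msg) {
    Pending.push_back(Msg);
    return true;
  }

  // Let a caller with better context refine the most recent pending error.
  void addContext(const std::string &Ctx) {
    if (!Pending.empty())
      Pending.back() = Ctx + ": " + Pending.back();
  }

  // Helper that knows nothing about the enclosing statement.
  bool parseOperand(const std::string &Tok) {
    return Tok.empty() ? error("expected operand") : false;
  }

  // Parse one statement, then flush all deferred errors in one place.
  bool parseStatement(const std::string &Mnemonic, const std::string &Op) {
    bool Failed = parseOperand(Op);
    if (Failed)
      addContext("in '" + Mnemonic + "'");
    for (const std::string &E : Pending)
      Emitted.push_back(E);
    Pending.clear();
    return Failed;
  }

  const std::vector<std::string> &emitted() const { return Emitted; }
};
```

Keeping the flush in a single place is what makes it important that every helper returns its error status consistently, as the commit message notes.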
* Actually remove the Mangler from the AsmPrinter and clean up the places it was "used" but not used. | Eric Christopher | 2016-09-16 | 7 | -15/+7
  llvm-svn: 281749
* Fix a hidden use of grabbing the Mangler from the AsmPrinter and update accordingly. | Eric Christopher | 2016-09-16 | 1 | -4/+4
  llvm-svn: 281748
* [AArch64][GlobalISel] Add default regbank mapping for int<>FP. | Ahmed Bougacha | 2016-09-16 | 1 | -0/+10
  llvm-svn: 281739
* [AArch64][GlobalISel] Add default regbank mapping for G_FCMP. | Ahmed Bougacha | 2016-09-16 | 1 | -0/+10
  llvm-svn: 281738
* [AArch64][GlobalISel] Add default regbank mapping for FP ops. | Ahmed Bougacha | 2016-09-16 | 1 | -1/+18
  These should have all their operands - even scalars - go on FPR.
  llvm-svn: 281737
* [AArch64][GlobalISel] Add default regbank mappings for mixed-type ops. | Ahmed Bougacha | 2016-09-16 | 1 | -18/+36
  We used to only support instructions with same-type operands. Instead, use the per-register type information to map each operand more accurately.
  llvm-svn: 281734
* [mips] Fix previous revert r281726. | Simon Dardis | 2016-09-16 | 1 | -36/+0
  llvm-svn: 281729
* Place the lowered phi instruction(s) before the DEBUG_VALUE entry | Keith Walker | 2016-09-16 | 1 | -1/+1
  When a phi node is finally lowered to a machine instruction it is important that the lowered "load" instruction is placed before the associated DEBUG_VALUE entry describing the value loaded.
  Renamed the existing SkipPHIsAndLabels to SkipPHIsLabelsAndDebug to more fully describe that it also skips debug entries. Then used the "new" function SkipPHIsAndLabels when the debug information should not be skipped when placing the lowered "load" instructions so that it is placed before the debug entries.
  Differential Revision: https://reviews.llvm.org/D23760
  llvm-svn: 281727
* Revert "[mips] Fix aui/daui/dahi/dati for MIPSR6" | Simon Dardis | 2016-09-16 | 7 | -39/+47
  This reverts r281724. Still need dsanders to accept this.
  llvm-svn: 281726
* [mips] Fix aui/daui/dahi/dati for MIPSR6 | Simon Dardis | 2016-09-16 | 7 | -11/+75
  For compatibility with binutils, define these instructions to take two registers with a 16-bit unsigned immediate. Both of the registers have to be the same for dahi and dati.
  Reviewers: vkalintiris, dsanders, zoran.jovanovic
  Differential Review: https://reviews.llvm.org/D21473
  llvm-svn: 281724