path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [Hexagon] Assertion failure in HexagonSubtarget.cpp | Krzysztof Parzyszek | 2018-03-26 | 1 | -7/+7
  In restoreLatency, replace range-for loop with std::find. Patch by Jyotsna Verma.
  llvm-svn: 328574
* [X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs | Simon Pilgrim | 2018-03-26 | 1 | -0/+10
  Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)
  llvm-svn: 328573
* [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32 | Reid Kleckner | 2018-03-26 | 2 | -6/+16
  Summary: Re-lands r328386 and r328443, reverting r328482. Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small parameters in i8 and i16 do not end up in the SysV register parameters (EDI, ESI, etc).
  I added tests for how we receive small parameters, since that is the important part. It's always safe to store more bytes than will be read, but the assumptions you make when loading them are what really matter.
  I also tested this by self-hosting clang and it passed tests on win64.
  Reviewers: mstorsjo, hans
  Subscribers: hiraditya, mstorsjo, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44900
  llvm-svn: 328570
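  The point about loads versus stores can be sketched in plain C++ rather than LLVM IR. This is only an illustration under assumptions: the 32-bit "slot" models a parameter register or stack slot, and the names `readBoolAsI8` and `readBoolAsI32` are hypothetical, not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit argument slot that carries a bool.
// A caller following the i8 convention only guarantees the low byte;
// the upper 24 bits may hold leftover data.

// Callee that trusts only the low byte (the i8 convention).
inline bool readBoolAsI8(std::uint32_t Slot) {
  return static_cast<std::uint8_t>(Slot) != 0;
}

// Callee that assumes the whole slot was zero-extended (the i32 assumption).
inline bool readBoolAsI32(std::uint32_t Slot) {
  return Slot != 0;
}
```

  With a slot such as 0xABCD0000 (low byte zero, upper bits not cleared), the i8 reader sees false while the i32 reader sees true, which is the kind of mismatch that the load-side assumption controls.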
* [X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881) | Simon Pilgrim | 2018-03-26 | 11 | -125/+93
  Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.
  These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).
  Differential Revision: https://reviews.llvm.org/D44879
  llvm-svn: 328566
* [XCore] Change std::sort to llvm::sort in response to r327219 | Mandeep Singh Grang | 2018-03-26 | 2 | -3/+3
  Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key.
  To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
  Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches.
  Reviewers: dblaikie, RKSimon, robertlytton
  Reviewed By: robertlytton
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44875
  llvm-svn: 328564
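  The shuffle-before-sort idea can be sketched as follows. This is a stand-in written for illustration, not LLVM's actual wrapper: `shuffledSort`, `Entry`, and both comparators are hypothetical names, and the seed parameter is an assumption made so the sketch is reproducible.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>; // (key, payload)

// Hypothetical stand-in for the wrapper idea: shuffle first, then sort, so
// that any reliance on the relative order of equal-key elements is exposed.
template <typename Compare>
void shuffledSort(std::vector<Entry> &V, Compare Comp, unsigned Seed) {
  std::mt19937 Gen(Seed);
  std::shuffle(V.begin(), V.end(), Gen);
  std::sort(V.begin(), V.end(), Comp);
}

// Compares keys only: entries with equal keys have no defined relative order,
// so the result may differ from one shuffle to the next.
inline bool byKeyOnly(const Entry &A, const Entry &B) {
  return A.first < B.first;
}

// Total order: ties on the key are broken by the payload, so the result is
// the same regardless of the input order.
inline bool byKeyThenPayload(const Entry &A, const Entry &B) {
  return A < B;
}
```

  Sorting the same data with `byKeyThenPayload` under two different seeds gives identical results, whereas `byKeyOnly` only guarantees that keys are non-decreasing.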
* [Power9] Legalize and emit code for quad-precision convert from double-precision | Lei Huang | 2018-03-26 | 2 | -5/+13
  Legalize and emit code for quad-precision floating point operation xscvdpqp and add option to guard the quad precision operation support.
  Differential Revision: https://reviews.llvm.org/D44746
  llvm-svn: 328558
* [PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place. | Stefan Pintilie | 2018-03-26 | 8 | -509/+621
  A new function getOpcodeForSpill should now be the only place to get the opcode for a given spilled register.
  Differential Revision: https://reviews.llvm.org/D43086
  llvm-svn: 328556
* [AMDGPU] Improve disassembler error handling | Tim Corringham | 2018-03-26 | 1 | -1/+4
  Summary: llvm-objdump now disassembles unrecognised opcodes as data, using the .long directive. We treat unrecognised opcodes as being 32 bit values, so move along 4 bytes rather than the single byte which previously resulted in a cascade of bogus disassembly following an unrecognised opcode.
  While no solution can always disassemble code that contains embedded data correctly this provides a significant improvement.
  The disassembler will now cope with an arbitrary length section as it no longer truncates it to a multiple of 4 bytes, and will use the .byte directive for trailing bytes.
  Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44685
  llvm-svn: 328553
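  The fallback behaviour described above can be sketched like this. It is a simplified model, not the AMDGPU disassembler: it assumes every recognised instruction is a single 32-bit little-endian word, `IsKnown` is a hypothetical stand-in for the real decoder, and the `inst` mnemonic is a placeholder.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: IsKnown stands in for a real decoder, and every
// recognised instruction is assumed to occupy exactly one 32-bit word.
std::vector<std::string>
disassemble(const std::vector<std::uint8_t> &Bytes,
            const std::function<bool(std::uint32_t)> &IsKnown) {
  std::vector<std::string> Out;
  char Buf[32];
  std::size_t I = 0;
  // Consume whole 32-bit words. An unrecognised word is emitted as data and
  // the cursor still advances by 4 bytes rather than 1.
  for (; I + 4 <= Bytes.size(); I += 4) {
    std::uint32_t Word = std::uint32_t(Bytes[I]) |
                         (std::uint32_t(Bytes[I + 1]) << 8) |
                         (std::uint32_t(Bytes[I + 2]) << 16) |
                         (std::uint32_t(Bytes[I + 3]) << 24);
    if (IsKnown(Word))
      std::snprintf(Buf, sizeof(Buf), "inst 0x%08x", (unsigned)Word);
    else
      std::snprintf(Buf, sizeof(Buf), ".long 0x%08x", (unsigned)Word);
    Out.push_back(Buf);
  }
  // Trailing bytes that do not fill a word are emitted one at a time.
  for (; I < Bytes.size(); ++I) {
    std::snprintf(Buf, sizeof(Buf), ".byte 0x%02x", (unsigned)Bytes[I]);
    Out.push_back(Buf);
  }
  return Out;
}
```

  Advancing by a full word after a failed decode keeps the cursor aligned with the following instructions, which is what avoids the cascade of bogus output that a one-byte step produces.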
* [X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs | Simon Pilgrim | 2018-03-26 | 1 | -4/+17
  We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....
  llvm-svn: 328551
* Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header | David Blaikie | 2018-03-26 | 1 | -1/+0
  llvm-svn: 328549
* Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header | David Blaikie | 2018-03-26 | 1 | -1/+0
  llvm-svn: 328548
* [Pipeliner] Use latency to compute RecMII | Krzysztof Parzyszek | 2018-03-26 | 2 | -15/+19
  The patch contains several changes needed to pipeline an example that was transformed so that a Phi with a subreg is converted to copies.
  The pipeliner wasn't working for a couple of reasons.
  - The RecMII was 3 instead of 2 due to the extra copies.
  - Copy instructions contained a latency of 1.
  - The node order algorithm was not choosing the best "bottom" node, which caused an instruction to be scheduled that had a predecessor and successor already scheduled.
  - Updated the Hexagon Machine Scheduler to check if the node is latency bound when adding the cost for a 0-latency dependence.
  The RecMII was 3 because the computation looks at the number of nodes in the recurrence. The extra copy is an extra node but it shouldn't increase the latency. The new RecMII computation looks at the latency of the instructions in the recurrence. We changed the latency of the dependence of a copy to 0. The latency computation for the copy also checks the use of the copy (similar to a reg_sequence).
  The node order algorithm was not choosing the last instruction in the recurrence for a bottom up traversal. This was when the last instruction is a copy. A check was added when choosing the instruction to check for NodeNum if the maxASAP is the same. This means that the scheduler will not end up with another node in the recurrence that has both a predecessor and successor already scheduled.
  The cost computation in Hexagon Machine Scheduler adds cost when an instruction can be packetized with a zero-latency instruction. We should only do this if the schedule is latency bound.
  Patch by Brendon Cahoon.
  llvm-svn: 328542
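  The difference between the two RecMII computations can be sketched as below. This is a simplified model under assumptions, not the MachinePipeliner code: `RecNode`, `recMIIByNodeCount`, and `recMIIByLatency` are hypothetical names, and the bound is taken as the usual ceiling of delay over loop-carried distance.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node of a recurrence (a dependence cycle in the loop body).
struct RecNode {
  unsigned Latency; // cycles before a dependent instruction may issue
};

// Ceiling division; Distance is the loop-carried distance in iterations.
inline unsigned ceilDiv(unsigned Delay, unsigned Distance) {
  assert(Distance > 0 && "dependence distance must be positive");
  return (Delay + Distance - 1) / Distance;
}

// Node-count bound: every node in the recurrence is charged one cycle,
// so an extra copy raises the bound even if it adds no latency.
inline unsigned recMIIByNodeCount(const std::vector<RecNode> &Rec,
                                  unsigned Distance) {
  return ceilDiv(static_cast<unsigned>(Rec.size()), Distance);
}

// Latency bound: sum the latencies, so a zero-latency copy is free.
inline unsigned recMIIByLatency(const std::vector<RecNode> &Rec,
                                unsigned Distance) {
  unsigned Delay = 0;
  for (const RecNode &N : Rec)
    Delay += N.Latency;
  return ceilDiv(Delay, Distance);
}
```

  For a recurrence of two unit-latency instructions plus one zero-latency copy at distance 1, the node-count bound is 3 and the latency bound is 2, matching the "3 instead of 2" case in the message.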
* [X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs | Simon Pilgrim | 2018-03-26 | 1 | -0/+14
  llvm-svn: 328541
* [X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write) | Simon Pilgrim | 2018-03-26 | 1 | -20/+11
  llvm-svn: 328536
* [Hexagon] Give priority to post-incrementing memory accesses in LSR | Krzysztof Parzyszek | 2018-03-26 | 2 | -1/+8
  llvm-svn: 328506
* [X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs | Simon Pilgrim | 2018-03-26 | 1 | -0/+12
  Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)
  This also adds missing vcvttss2si tests
  llvm-svn: 328505
* [X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs | Simon Pilgrim | 2018-03-26 | 1 | -4/+8
  These should match the YMM MOVDUP/PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults.
  llvm-svn: 328501
* [X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs | Simon Pilgrim | 2018-03-26 | 1 | -0/+16
  The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps
  llvm-svn: 328497
* AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes | Nicolai Haehnle | 2018-03-26 | 6 | -67/+56
  Differential revision: https://reviews.llvm.org/D44820
  Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
  llvm-svn: 328496
* [X86][Btver2] Double the AGU and schedule pipe resources for YMM | Simon Pilgrim | 2018-03-26 | 1 | -31/+31
  Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model.
  llvm-svn: 328491
* Revert r328386 "[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32" | Hans Wennborg | 2018-03-26 | 2 | -15/+5
  This broke Chromium (see crbug.com/825748). It looks like mstorsjo's follow-up patch at D44876 fixes this, but let's revert back to green for now until that's ready to land.
  (Also reverts r328443.)
  > Both GCC and MSVC only look at the low byte of a boolean when it is
  > passed.
  llvm-svn: 328482
* [ARM] Simplify constructing the ARMArchFeature string. NFC. | Martin Storsjo | 2018-03-26 | 1 | -12/+9
  Differential Revision: https://reviews.llvm.org/D44819
  llvm-svn: 328478
* [X86] Fix the SchedRW for intrinsic register form of SQRT/RCP/RSQRT. | Craig Topper | 2018-03-26 | 1 | -2/+2
  llvm-svn: 328474
* [X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models. | Craig Topper | 2018-03-26 | 5 | -141/+81
  I've used Agner's data as best I could to get the values to converge on.
  llvm-svn: 328473
* [X86] Add itinerary to intrinsic version of sqrtss, rcpss, and rsqrtss instructions. | Craig Topper | 2018-03-26 | 1 | -2/+2
  llvm-svn: 328472
* [X86] Correct the itineraries for the dot product instructions. | Craig Topper | 2018-03-26 | 1 | -2/+2
  llvm-svn: 328471
* [X86] Use the same itinerary for VCVTDQ2PD as the SSE version so that the generated scheduler classes will merge. | Craig Topper | 2018-03-26 | 1 | -8/+10
  llvm-svn: 328470
* [X86] Swap the itineraries on the memory and register forms of CVTDQ2PD. | Craig Topper | 2018-03-26 | 1 | -2/+2
  They were backwards.
  llvm-svn: 328469
* [X86] Give VMOVSX/ZX the same itinerary as the SSE version so they'll reuse the same generated scheduler class. | Craig Topper | 2018-03-26 | 1 | -11/+6
  llvm-svn: 328468
* [X86] Give vpmsadbw the same itinerary as the SSE version so they'll be able to share the same generated scheduler class. | Craig Topper | 2018-03-25 | 1 | -7/+2
  llvm-svn: 328466
* [X86] Move (v)movss to port 5 only for Skylake. Move (v)movups/d to port 015 for Skylake. | Craig Topper | 2018-03-25 | 2 | -11/+11
  This matches Agner's data and is consistent with what the EVEX instructions were doing on SKX.
  llvm-svn: 328465
* [X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI. | Simon Pilgrim | 2018-03-25 | 4 | -37/+10
  llvm-svn: 328460
* [X86][SkylakeClient] Fix missing comma | Simon Pilgrim | 2018-03-25 | 1 | -1/+1
  llvm-svn: 328458
* [ARM] Remove sched model instregex entries that don't match any instructions (D44687) | Simon Pilgrim | 2018-03-25 | 3 | -41/+39
  Reviewed by @javed.absar
  llvm-svn: 328457
* [X86] Add missing full stop to comment. NFCI. | Simon Pilgrim | 2018-03-25 | 1 | -1/+1
  llvm-svn: 328456
* [X86][SkylakeClient] Fix a set of regular expressions that were checking for optionally starting with 'Y' instead of 'V' | Craig Topper | 2018-03-25 | 1 | -18/+18
  These bad regexes were introduced by r328435
  llvm-svn: 328454
* [X86][MMX] MOVQ2DQ/MOVDQ2Q are better described as WriteVecMove than WriteMove | Simon Pilgrim | 2018-03-25 | 1 | -1/+1
  Not that it makes a difference to current cost values, but will when we try to better model GPR-SIMD transfer costs
  llvm-svn: 328453
* [X86][SkylakeServer] Merge multiple instregex. NFCI | Simon Pilgrim | 2018-03-25 | 1 | -7/+7
  llvm-svn: 328452
* [X86] Update cost model for Goldmont. Add fsqrt costs for Silvermont | Craig Topper | 2018-03-25 | 2 | -15/+49
  Add fdiv costs for Goldmont using table 16-17 of the Intel Optimization Manual. Also add overrides for FSQRT for Goldmont and Silvermont.
  Reviewers: RKSimon
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44644
  llvm-svn: 328451
* [X86] Add the ability to override memory folding latency to schedules and add 1uop for memory folds for Intel models | Simon Pilgrim | 2018-03-25 | 6 | -29/+35
  The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up.
  Differential Revision: https://reviews.llvm.org/D44840
  llvm-svn: 328446
* [X86] Consistently prefix all defs in X86ScheduleSLM.td with 'SLM'. | Craig Topper | 2018-03-25 | 1 | -79/+79
  llvm-svn: 328444
* [X86] Update a partially stale comment, since SVN r328386. NFC. | Martin Storsjo | 2018-03-24 | 1 | -1/+1
  llvm-svn: 328443
* [X86][SkylakeClient] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time | Simon Pilgrim | 2018-03-24 | 1 | -1095/+478
  llvm-svn: 328435
* [X86][Broadwell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time | Simon Pilgrim | 2018-03-24 | 1 | -1119/+489
  llvm-svn: 328434
* [RISCV] Use init_array instead of ctors for RISCV target, by default | Mandeep Singh Grang | 2018-03-24 | 4 | -1/+47
  Summary: LLVM defaults to the newer .init_array/.fini_array scheme for static constructors rather than the less desirable .ctors/.dtors (the UseCtors flag defaults to false). This wasn't being respected in the RISC-V backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the appropriate flag for UseInitArray.
  This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call InitializeELF(TM.Options.UseInitArray).
  Reviewers: asb, apazos
  Reviewed By: asb
  Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44750
  llvm-svn: 328433
* [X86][Haswell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time | Simon Pilgrim | 2018-03-24 | 1 | -318/+119
  llvm-svn: 328432
* [X86][SandyBridge] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time | Simon Pilgrim | 2018-03-24 | 1 | -158/+79
  llvm-svn: 328431
* [Hexagon] Change std::sort to llvm::sort in response to r327219 | Mandeep Singh Grang | 2018-03-24 | 7 | -9/+9
  Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key.
  To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
  Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches.
  Reviewers: kparzysz
  Reviewed By: kparzysz
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44857
  llvm-svn: 328430
* [AMDGPU] Change std::sort to llvm::sort in response to r327219 | Mandeep Singh Grang | 2018-03-24 | 1 | -1/+1
  Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key.
  To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
  Reviewers: tstellar, RKSimon, arsenm
  Reviewed By: arsenm
  Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
  Differential Revision: https://reviews.llvm.org/D44856
  llvm-svn: 328429
* [X86] Add a new disassembler opcode map for 3DNow. Stop treating 3DNow as an attribute. | Craig Topper | 2018-03-24 | 2 | -43/+18
  This reduces the size of llvm-mc by at least 150k since we no longer have to multiply the attribute across 7 tables.
  llvm-svn: 328416