summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [PostRAMachineSink] preserve CFGJun Bum Lim2018-03-281-2/+0
| | | | | | | | | | | | | | Summary: Mark CFG is preserved since this pass do not make any change in CFG. Reviewers: sebpop, mzolotukhin, mcrosier Reviewed By: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44845 llvm-svn: 328727
* [Hexagon] Add support for "new" circular buffer intrinsicsKrzysztof Parzyszek2018-03-281-0/+294
| | | | | | | | | | | | | | | | | | | These instructions have been around for a long time, but we haven't supported intrinsics for them. The "new" versions use the CSx register for the start of the buffer instead of the K field in the Mx register. We need to use pseudo instructions for these instructions until after register allocation. The problem is that these instructions allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for the CSx set-up until after register allocation when the Mx register has been fixed for the instruction. There is a related clang patch. Patch by Brendon Cahoon. llvm-svn: 328724
* [MachineOutliner] Simplify call outlining + require valid callee save info ↵Jessica Paquette2018-03-282-2/+2
| | | | | | | | | | for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719
* [X86][AVX2] Add shuffle test case from PR36933Simon Pilgrim2018-03-281-5/+43
| | | | llvm-svn: 328714
* Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""Alexander Potapenko2018-03-282-34/+40
| | | | | | | | | | | | | | | This reverts commit r328676. Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang: $ cat t.c void foo() {} $ clang -no-integrated-as -c t.c -g /tmp/t-dcdec5.s: Assembler messages: /tmp/t-dcdec5.s:8: Error: file number less than one clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation) llvm-svn: 328699
* Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"Tim Renouf2018-03-281-29/+0
| | | | | | | | | | | This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981. It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a release-with-asserts build. I will resubmit the change when I have fixed that. Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a llvm-svn: 328695
* [ARM] Support float literals under XOChristof Douma2018-03-281-83/+26
| | | | | | | | | | Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691
* [RegisterCoalescing] Don't move COPY if it would interfere with another valueMikael Holmen2018-03-281-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: RegisterCoalescer::removePartialRedundancy tries to hoist B = A from BB0/BB2 to BB1: BB1: ... BB0/BB2: ---- B = A; | ... | A = B; | |------- | It does so if a number of conditions are fulfilled. However, it failed to check if B was used by any of the terminators in BB1. Since we must insert B = A before the terminators (since it's not a terminator itself), this means that we could erroneously insert a new definition of B before a use of it. Reviewers: wmi, qcolombet Reviewed By: wmi Subscribers: MatzeB, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D44918 llvm-svn: 328689
* [AArch64] add ftrunc tests; NFCSanjay Patel2018-03-281-0/+47
| | | | | | As suggested in D44909. llvm-svn: 328683
* [PowerPC] add ftrunc vector tests; NFCSanjay Patel2018-03-281-0/+47
| | | | | | Baseline tests for vectors as suggested in D44909. llvm-svn: 328682
* Reapply "[DWARFv5] Emit file 0 to the line table."Paul Robinson2018-03-272-40/+34
| | | | | | | | | | | | | | | | DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328676
* [MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operandsJessica Paquette2018-03-271-0/+41
| | | | | | | | | If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674
* [AMDGPU] For OS type AMDPAL, fixed scratch on compute shaderTim Renouf2018-03-271-0/+29
| | | | | | | | | | | | | | | | | | Summary: For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders). This commit fixes that to use offset 0x10 instead of offset 0 for a compute shader, per the PAL ABI spec. Reviewers: kzhuravl, nhaehnle, timcorringham Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D44468 Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f llvm-svn: 328673
* [DWARF] Suppress split line tables more carefully.Paul Robinson2018-03-272-0/+97
| | | | | | | | | | | | | | | | If a given split type unit does not have source locations, don't have it refer to the split line table. If no split type unit refers to the split line table, don't emit the line table at all. This will save a little space on rare occasions, but also refactors things a bit to improve which class is responsible for what. Responding to review comments on r326395. Differential Revision: https://reviews.llvm.org/D44220 llvm-svn: 328670
* [CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo ↵Tim Renouf2018-03-271-0/+17
| | | | | | | | | | | | | | | | | | source value Summary: Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing" broke -print-machineinstrs for us on AMDGPU, because we have custom pseudo source values, and MIR serialization does not implement that. This commit at least restores the functionality of -print-machineinstrs, even if it does not properly implement the missing MIR serialization functionality. Differential Revision: https://reviews.llvm.org/D44871 Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66 llvm-svn: 328668
* [X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classesSimon Pilgrim2018-03-274-6/+6
| | | | | | | | Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664
* AMDGPU: Fix not preserving CSR VGPR if used for SGPR spillsMatt Arsenault2018-03-271-0/+31
| | | | | | | | Before this was not done if the function had no calls in it. This is still a possible issue with any callable function, regardless of calls present. llvm-svn: 328659
* AMDGPU: Fix crash when MachinePointerInfo invalidMatt Arsenault2018-03-271-0/+82
| | | | | | | | The combine on a select of a load only triggers for addrspace 0, and discards the MachinePointerInfo. The conservative default needs to be used for this. llvm-svn: 328652
* AMDGPU: Fix register name format in testsMatt Arsenault2018-03-272-4/+4
| | | | | | | These were changed to match the asm output name a long time ago, although I think the old tablegenerated names still work. llvm-svn: 328651
* AMDGPU: Fix FP restore from being reordered with stack opsMatt Arsenault2018-03-271-1/+1
| | | | | | | | | | | | | | | | | In a function, s5 is used as the frame base SGPR. If a function is calling another function, during the call sequence it is copied to a preserved SGPR and restored. Before it was possible for the scheduler to move stack operations before the restore of s5, since there's nothing to associate a frame index access with the restore. Add an implicit use of s5 to the adjcallstack pseudo which ends the call sequence to preven this from happening. I'm not 100% satisfied with this solution, but I'm not sure what else would be better. llvm-svn: 328650
* [Hexagon] Implement TTI::shouldMaximizeVectorBandwidthKrzysztof Parzyszek2018-03-271-0/+47
| | | | llvm-svn: 328648
* [Power9] Fix the resource list for the COPY instruction.Stefan Pintilie2018-03-272-156/+112
| | | | | | | The COPY instruction was listed as a 4 cycle instruction. It is now listed correctly as a 2 cycle ALU instruction. llvm-svn: 328647
* Fix a reoccuring typo in load-combine testsArtur Pilipenko2018-03-273-5/+5
| | | | | | | | | | | %tmp = bitcast i32* %arg to i8* %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0 - %tmp2 = load i8, i8* %tmp, align 1 + %tmp2 = load i8, i8* %tmp1, align 1 This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended. llvm-svn: 328642
* Use .set instead of = when printing assignment in assembly outputKrzysztof Parzyszek2018-03-2730-99/+99
| | | | | | | | | On Hexagon "x = y" is a syntax used in most instructions, and is not treated as a directive. Differential Revision: https://reviews.llvm.org/D44256 llvm-svn: 328635
* [X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler classSimon Pilgrim2018-03-271-1/+1
| | | | llvm-svn: 328620
* [PowerPC] Secure PLT supportStrahinja Petrovic2018-03-271-0/+4
| | | | | | | | This patch supports secure PLT mode for PowerPC 32 architecture. Differential Revision: https://reviews.llvm.org/D42112 llvm-svn: 328617
* [x86] add RUN for target before roundss; NFCSanjay Patel2018-03-271-30/+232
| | | | llvm-svn: 328601
* [x86] add tests for ftrunc; NFCSanjay Patel2018-03-262-12/+404
| | | | llvm-svn: 328592
* Fix newlines. NFCI.Simon Pilgrim2018-03-264-18305/+18305
| | | | llvm-svn: 328583
* [X86] Add WriteCRC32 scheduler classSimon Pilgrim2018-03-261-6/+6
| | | | | | | | Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582
* Use local symbols for creating .stack-size.Rafael Espindola2018-03-263-7/+14
| | | | llvm-svn: 328581
* [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32Reid Kleckner2018-03-268-2/+143
| | | | | | | | | | | | | | | | | | | | | | | Summary: Re-lands r328386 and r328443, reverting r328482. Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small parameters in i8 and i16 do not end up in the SysV register parameters (EDI, ESI, etc). I added tests for how we receive small parameters, since that is the important part. It's always safe to store more bytes than will be read, but the assumptions you make when loading them are what really matter. I also tested this by self-hosting clang and it passed tests on win64. Reviewers: mstorsjo, hans Subscribers: hiraditya, mstorsjo, llvm-commits Differential Revision: https://reviews.llvm.org/D44900 llvm-svn: 328570
* [X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes ↵Simon Pilgrim2018-03-264-18305/+18305
| | | | | | | | | | | | (PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566
* [Hexagon] Add more lit testsKrzysztof Parzyszek2018-03-2612-0/+1367
| | | | llvm-svn: 328561
* [Power9]Legalize and emit code for quad-precision convert from double-precisionLei Huang2018-03-261-1/+29
| | | | | | | | | Legalize and emit code for quad-precision floating point operation xscvdpqp and add option to guard the quad precision operation support. Differential Revision: https://reviews.llvm.org/D44746 llvm-svn: 328558
* [PowerPC] Infrastructure work. Implement getting the opcode for a spill in ↵Stefan Pintilie2018-03-261-8/+8
| | | | | | | | | | | one place. A new function getOpcodeForSpill should now be the only place to get the opcode for a given spilled register. Differential Revision: https://reviews.llvm.org/D43086 llvm-svn: 328556
* [X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costsSimon Pilgrim2018-03-262-16/+16
| | | | | | We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR..... llvm-svn: 328551
* [Pipeliner] Add missing loop carried dependencesKrzysztof Parzyszek2018-03-261-0/+54
| | | | | | | | | | | | | | | | | | | | | | The pipeliner is not adding a dependence edge for a loop carried dependence, and ends up scheduling a load from iteration n prior to an aliased store in iteration n-1. The code that adds the loop carried dependences in the pipeliner doesn't check if the memory objects for loads and stores are "identified" (i.e., distinct) objects. If they are not, then the code that adds the dependences needs to be conservative. The objects can be used to check dependences only when they are distinct objects. The code that checks for loop carried dependences has been updated to classify loads and stores that are not identified as "unknown" values. A store with an "unknown" value can potentially create a loop carried dependence with any pending load. Patch by Brendon Cahoon. llvm-svn: 328547
* [Pipeliner] Use latency to compute RecMIIKrzysztof Parzyszek2018-03-262-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch contains severals changes needed to pipeline an example that was transformed so that a Phi with a subreg is converted to copies. The pipeliner wasn't working for a couple of reasons. - The RecMII was 3 instead of 2 due to the extra copies. - Copy instructions contained a latency of 1. - The node order algorithm was not choosing the best "bottom" node, which caused an instruction to be scheduled that had a predecessor and successor already scheduled. - Updated the Hexagon Machine Scheduler to check if the node is latency bound when adding the cost for a 0-latency dependence. The RecMII was 3 because the computation looks at the number of nodes in the recurrence. The extra copy is an extra node but it shouldn't increase the latency. The new RecMII computation looks at the latency of the instructions in the recurrence. We changed the latency of the dependence of a copy to 0. The latency computation for the copy also checks the use of the copy (similar to a reg_sequence). The node order algorithm was not choosing the last instruction in the recurrence for a bottom up traversal. This was when the last instruction is a copy. A check was added when choosing the instruction to check for NodeNum if the maxASAP is the same. This means that the scheduler will not end up with another node in the recurrence that has both a predecessor and successor already scheduled. The cost computation in Hexagon Machine Scheduler adds cost when an instruction can be packetized with a zero-latency instruction. We should only do this if the schedule is latency bound. Patch by Brendon Cahoon. llvm-svn: 328542
* [X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costsSimon Pilgrim2018-03-261-8/+8
| | | | llvm-svn: 328541
* [Pipeliner] Fix assert caused by pipeliner serializationKrzysztof Parzyszek2018-03-261-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The pipeliner is asserting because the serialization step that occurs at the end is deleting an instruction. The assert occurs later on because there is a use without a definition. The problem occurs when an instruction defines a value used by a REQ_SEQUENCE and that value is used by a COPY instruction. The latencies between these instructions are zero, so they are put in to the same packet. The serialization code is unable to handle this correctly, and ends up putting the REG_SEQUENCE before its definition. There is special code in the serialization step that attempts to handle zero-cost instructions (phis, copy, reg_sequence) differently than regular instructions. Unfortunately, this means the order does not come out correct. This patch simplifies the code by changing the seperate steps for handling zero-cost and regular instructions. Only phis are handled separate now, since they should occurs first. Then, this patch adds checks to make use the MoveUse is set to the smallest value if there are multiple uses in a cycle. Patch by Brendon Cahoon. llvm-svn: 328540
* [Pipeliner] Fix check for order dependences when finalizing instructionsKrzysztof Parzyszek2018-03-261-0/+51
| | | | | | | | | | | | | | | | | The code in orderDepdences that looks at the order dependences between instructions was processing all the successor and predecessor order dependences. However, we really only want to check for an order dependence for instructions scheduled in the same cycle. Also, fixed how the pipeliner handles output dependences. An output dependence is also a potential loop carried dependence. The pipeliner didn't handle this case properly so an invalid schedule could be created that allowed an output dependence to be scheduled in the next iteration at the same cycle. Patch by Brendon Cahoon. llvm-svn: 328516
* [Pipeliner] Fix in the pipeliner phi reuse codeKrzysztof Parzyszek2018-03-261-0/+44
| | | | | | | | | | | When the definition of a phi is used by a phi in the next iteration, the pipeliner was assuming that the definition is processed first. Because of the assumption, an incorrect phi name was used. This patch has a check to see if the phi definition has been processed already. Patch by Brendon Cahoon. llvm-svn: 328510
* [Pipeliner] Correctly update memoperands in the epilogKrzysztof Parzyszek2018-03-261-0/+102
| | | | | | | | | | | | | | | | The pipeliner needs to be conservative when updating the memoperands of instructions in the epilog. Previously, the pipeliner was changing the offset of the memoperand based upon the scheduling stage. However, that is incorrect when control flow branches around the kernel code. The bug enabled a load and store to the same stack offset to be swapped. This patch fixes the bug by updating the size of the memoperands to be UINT_MAX. This conservative value means that dependences will be created between other loads and stores. Patch by Brendon Cahoon. llvm-svn: 328508
* [Hexagon] Give priority to post-incremementing memory accesses in LSRKrzysztof Parzyszek2018-03-264-43/+47
| | | | llvm-svn: 328506
* [X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costsSimon Pilgrim2018-03-262-32/+32
| | | | | | | | Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) This also adds missing vcvttss2si tests llvm-svn: 328505
* [X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costsSimon Pilgrim2018-03-261-12/+12
| | | | | | These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults. llvm-svn: 328501
* [X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costsSimon Pilgrim2018-03-261-8/+8
| | | | | | The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps llvm-svn: 328497
* [X86][Btver2] Double the AGU and schedule pipe resources for YMMSimon Pilgrim2018-03-261-15/+15
| | | | | | Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model. llvm-svn: 328491
* Revert r328386 "[X86] Fix Windows `i1 zeroext` conventions to use i8 instead ↵Hans Wennborg2018-03-267-76/+2
| | | | | | | | | | | | | | | of i32" This broke Chromium (see crbug.com/825748). It looks like mstorsjo's follow-up patch at D44876 fixes this, but let's revert back to green for now until that's ready to land. (Also reverts r328443.) > Both GCC and MSVC only look at the low byte of a boolean when it is > passed. llvm-svn: 328482
OpenPOWER on IntegriCloud