path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* [CodeGen] Add dependency printer | Evandro Menezes | 2017-07-12 | 12 | -52/+52
  Add SDep printer to make debugging sessions more productive.
  Differential revision: https://reviews.llvm.org/D35144
  llvm-svn: 307799
* [X86/FastIsel] Fall-back to SelectionDAG when lowering soft-floats. | Davide Italiano | 2017-07-12 | 1 | -0/+15
  FastIsel can't handle them, so we would end up crashing during register class selection.
  Fixes PR26522.
  Differential Revision: https://reviews.llvm.org/D35272
  llvm-svn: 307797
* Add element atomic memmove intrinsic | Daniel Neilson | 2017-07-12 | 1 | -0/+63
  Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size.
  Reviewers: eli.friedman, reames, mkazantsev, skatkov
  Reviewed By: reames
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34884
  llvm-svn: 307796
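The copy semantics described above can be modelled with a minimal Python sketch. It only illustrates the element-by-element, overlap-safe copy; Python cannot express unordered-atomic accesses, and the function name and signature are hypothetical, not the intrinsic's actual interface.

```python
def element_memmove(buf: bytearray, dst: int, src: int, length: int, elem_size: int) -> None:
    """Copy `length` bytes inside `buf` from offset `src` to offset `dst`,
    one element of `elem_size` bytes at a time (illustrative model only)."""
    if elem_size <= 0 or length % elem_size != 0:
        raise ValueError("length must be a multiple of the element size")
    count = length // elem_size
    # Pick the copy direction so overlapping regions never overwrite
    # an element before it has been read.
    order = range(count) if dst <= src else range(count - 1, -1, -1)
    for i in order:
        s = src + i * elem_size
        d = dst + i * elem_size
        # Each slice assignment stands in for one element-sized load/store pair.
        buf[d:d + elem_size] = buf[s:s + elem_size]


data = bytearray(range(8))
element_memmove(data, dst=2, src=0, length=4, elem_size=2)
```

Copying backwards when the destination lies above the source is what distinguishes memmove from memcpy on overlapping ranges.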
* [X86][SSE] Fix file check prefix warning breaking buildbots | Simon Pilgrim | 2017-07-12 | 2 | -4/+4
  llvm-svn: 307790
* Make shell redirection construct portable | Kamil Rytarowski | 2017-07-12 | 5 | -5/+5
  Summary: NetBSD shell sh(1) does not support the ">& /dev/null" construct. This is a bashism. The portable and POSIX solution is to use: "> /dev/null 2>&1". This change fixes 22 Unexpected Failures on NetBSD/amd64 for the "check-llvm" target.
  Sponsored by <The NetBSD Foundation>
  Reviewers: joerg, dim, rnk
  Reviewed By: joerg, rnk
  Subscribers: rnk, davide, llvm-commits
  Differential Revision: https://reviews.llvm.org/D35277
  llvm-svn: 307789
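As an illustration of the rewrite this commit describes, here is a small Python sketch that converts the bash-only redirection into the POSIX form. The helper name and the sample RUN line are hypothetical; the commit itself edited the test files directly.

```python
import re

# Matches the bash-only ">& /dev/null" redirection (optional whitespace after ">&").
_BASHISM = re.compile(r">&\s*/dev/null")


def make_redirect_portable(line: str) -> str:
    """Replace '>& /dev/null' with the POSIX spelling '> /dev/null 2>&1'."""
    return _BASHISM.sub("> /dev/null 2>&1", line)


# Hypothetical RUN line, used only to show the transformation.
print(make_redirect_portable("; RUN: not llc < %s >& /dev/null"))
```

The pattern does not match the already-portable form, so applying the helper twice leaves a line unchanged.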
* [ARM] Adjust ifcvt heuristic for the diamond ifcvt case | John Brawn | 2017-07-12 | 1 | -9/+13
  When we have a diamond ifcvt the fallthrough block will have a branch at the end of it that disappears when predicated, so discount it from the predication cost.
  Differential Revision: https://reviews.llvm.org/D34952
  llvm-svn: 307788
* [X86][SSE] Add 512-bit (iX bitcast(vXi1)) test cases | Simon Pilgrim | 2017-07-12 | 2 | -0/+3245
  Improves test coverage for pre-AVX512 targets as well
  llvm-svn: 307783
* [ARM] GlobalISel: Select s64 G_FCMP | Diana Picus | 2017-07-12 | 1 | -0/+605
  Very similar to how we select s32 G_FCMP, the only thing that is different is the exact opcodes that we use.
  llvm-svn: 307763
* [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess. | Michael Zuckerman | 2017-07-12 | 1 | -83/+143
  Adding base test for AVX512
  llvm-svn: 307761
* Specify complete target triple in test | Matthias Braun | 2017-07-12 | 1 | -2/+2
  This should fix the problems on the greendragon build.
  llvm-svn: 307747
* Enhance synchscope representation | Konstantin Zhuravlyov | 2017-07-11 | 7 | -272/+389
  OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance.

  This change extends the existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces the *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
  - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context;
  - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings;
  - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode.

  Differential Revision: https://reviews.llvm.org/D21723
  llvm-svn: 307722
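The name-to-id mapping in the implementation details can be sketched as a small interning table. This is a Python model under stated assumptions, not LLVM's code: the class name, the numeric ids, the example scope names, and the use of the empty string for the default System scope are all chosen for illustration.

```python
class SyncScopeRegistry:
    """Illustrative per-context table mapping scope names to small integer ids."""

    # Assumed ids for the two pre-defined scopes, so that known scopes
    # can be tested with an integer comparison instead of a string compare.
    SINGLE_THREAD = 0
    SYSTEM = 1

    def __init__(self) -> None:
        # Assumption: the default (unspecified) scope is keyed by the empty string.
        self._ids = {"singlethread": self.SINGLE_THREAD, "": self.SYSTEM}

    def get_or_insert(self, name: str) -> int:
        """Return the id for `name`, allocating the next free id on first use."""
        return self._ids.setdefault(name, len(self._ids))


registry = SyncScopeRegistry()
agent = registry.get_or_insert("agent")          # hypothetical target-specific scope
workgroup = registry.get_or_insert("workgroup")  # hypothetical target-specific scope
```

Interning the strings once lets later passes compare scopes by id, which is the efficiency point the commit message makes about the pre-defined scopes.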
* [x86] auto-generate full checks; NFC | Sanjay Patel | 2017-07-11 | 1 | -79/+126
  llvm-svn: 307718
* reverting 307677. | Michael Zuckerman | 2017-07-11 | 1 | -144/+67
  llvm-svn: 307698
* [PPC] Fix one test case regression for patch https://reviews.llvm.org/D34337. | Tony Jiang | 2017-07-11 | 1 | -1/+0
  llvm-svn: 307691
* [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess. | Michael Zuckerman | 2017-07-11 | 1 | -67/+144
  Base test for AVX512: add a new base test to trunk before committing the change to the test.
  llvm-svn: 307677
* [Hexagon] Do not rely on callee-saved info in hasFP | Krzysztof Parzyszek | 2017-07-11 | 2 | -0/+165
  llvm-svn: 307675
* [PPC] Fix two bugs in frame lowering. | Tony Jiang | 2017-07-11 | 3 | -4/+40
  1. The available program storage region of the red zone to compilers is 288 bytes rather than 244 bytes.
  2. The formula for negative number alignment calculation should be y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1).
  Differential Revision: https://reviews.llvm.org/D34337
  llvm-svn: 307672
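The difference between the two formulas in item 2 shows up as soon as x is negative. The sketch below evaluates both in Python, whose integers behave like infinite-width two's complement under `&` and `~`; the function names and the sample offset are illustrative only.

```python
def align_toward_negative(x: int, n: int) -> int:
    """y = x & ~(n-1): rounds x down to a multiple of n (n a power of two)."""
    return x & ~(n - 1)


def align_toward_positive(x: int, n: int) -> int:
    """y = (x + (n-1)) & ~(n-1): rounds x up to a multiple of n (n a power of two)."""
    return (x + (n - 1)) & ~(n - 1)


offset = -20  # a negative offset, as used for stack slots below a frame base
print(align_toward_negative(offset, 16))  # -32
print(align_toward_positive(offset, 16))  # -16
```

For a negative offset, rounding up moves it toward zero, so the aligned value no longer covers the full 20 bytes; rounding down to -32 does.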
* [Hexagon] Add support for nontemporal loads and stores on HVX | Krzysztof Parzyszek | 2017-07-11 | 1 | -0/+28
  Patch by Michael Wu.
  Differential Revision: https://reviews.llvm.org/D35104
  llvm-svn: 307671
* [ARM] GlobalISel: Tighten G_FCMP selection test. NFC | Diana Picus | 2017-07-11 | 1 | -68/+68
  Use CHECK-NEXT for the comparison sequence, to make sure we don't get any unexpected instructions in the middle of our flag manipulation efforts.
  llvm-svn: 307656
* [X86][AVX512] regenerate avx512-insert-extract.ll | Guy Blank | 2017-07-11 | 1 | -3/+3
  llvm-svn: 307654
* [ARM] GlobalISel: Add reg mapping for s64 G_FCMP | Diana Picus | 2017-07-11 | 1 | -0/+29
  Map the result into GPR and the operands into FPR.
  llvm-svn: 307653
* [ARM] GlobalISel: Tighten legalizer tests. NFC | Diana Picus | 2017-07-11 | 3 | -0/+110
  Make sure that all the legalizer tests where the original instruction needs to be removed check for the removal. We do this by adding CHECK-NOT lines before and after the replacement sequence. This won't catch pathological cases where the instruction remains somewhere in the middle of the instruction sequence that's supposed to replace it, but hopefully that won't occur in practice (since ideally we'd be setting the insert point for the new instruction sequence either before or after the original instruction and not fiddle with it while building the sequence).
  llvm-svn: 307647
* [ARM] GlobalISel: Fix oversight in G_FCMP legalization | Diana Picus | 2017-07-11 | 1 | -0/+8
  We used to forget to erase the original instruction when replacing a G_FCMP true/false. Fix this bug and make sure the tests check for it.
  llvm-svn: 307639
* [globalisel][tablegen] Correct matching of intrinsic ID's. | Daniel Sanders | 2017-07-11 | 1 | -0/+38
  TreePatternNode considers them to be plain integers but MachineInstr considers them to be a distinct kind of operand.
  The tweak to AArch64InstrInfo.td to produce a simple test case is a NFC for everything except GlobalISelEmitter (confirmed by diffing the tablegenerated files). GlobalISelEmitter is currently unable to infer the type of operands in the Dst pattern from the operands in the Src pattern.
  llvm-svn: 307634
* [ARM] GlobalISel: Legalize s64 G_FCMP | Diana Picus | 2017-07-11 | 1 | -0/+878
  Same as the s32 version, for both hard and soft float.
  llvm-svn: 307633
* Revert Revert [MBP] do not rotate loop if it creates extra branch | Serguei Katkov | 2017-07-11 | 2 | -4/+102
  This is a second attempt to land this patch. The first one resulted in a crash of the clang sanitizer buildbot. The fix is here and a regression test is added.

  This is the last fix for the corner case of PR32214. Actually this is not really a corner case in general. We should not do a loop rotation if we create an additional branch due to it.

  Consider the case where we have a loop chain H, M, B, C, where
    H - header with a viable fallthrough from the pre-header and an exit from the loop
    M - some middle block
    B - backedge to the header, but with an exit from the loop also
    C - some cold block of the loop.

  Suppose H is determined to be the best exit. If we do a loop rotation M, B, C, H we can introduce an extra branch. Let's compute the change in the number of branches:
    +1 branch from pre-header to header
    -1 branch from header to exit
    +1 branch from header to middle block if there is such
    -1 branch from cold block to header if there is one

  So if C is not a predecessor of H then we introduce an extra branch.

  This change actually prohibits rotation of the loop if both of the following are true:
    - Best Exit has the next element in the chain as a successor.
    - The last element in the chain is not a predecessor of the first element of the chain.

  Reviewers: iteratee, xur, sammccall, chandlerc
  Reviewed By: iteratee
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34745
  llvm-svn: 307631
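The two-part condition at the end of this message can be written as a predicate over a toy CFG. The following Python sketch is a simplified model, not the MachineBlockPlacement code: the function name, the dictionary-based successor map, and the edges chosen for M and C are assumptions made for the example.

```python
from typing import Dict, List, Set


def rotation_adds_branch(chain: List[str], succs: Dict[str, Set[str]], best_exit: str) -> bool:
    """Return True when rotating `chain` to end at `best_exit` would add a branch,
    i.e. when both conditions from the commit message hold."""
    idx = chain.index(best_exit)
    if idx + 1 >= len(chain):
        return False  # best exit is already last; nothing would be rotated
    # Condition 1: the best exit currently falls through to the next block in the chain.
    exit_falls_through = chain[idx + 1] in succs.get(best_exit, set())
    # Condition 2: the last block does not branch to the first, so rotation
    # removes no existing branch to compensate.
    last_reaches_first = chain[0] in succs.get(chain[-1], set())
    return exit_falls_through and not last_reaches_first


# The H, M, B, C chain from the message; M's and C's edges are illustrative.
succs = {"H": {"M", "exit"}, "M": {"B", "C"}, "B": {"H", "exit"}, "C": {"B"}}
print(rotation_adds_branch(["H", "M", "B", "C"], succs, "H"))  # True
```

If C did branch back to H, the second condition would fail and the predicate would allow the rotation, matching the "-1 branch from cold block to header" term in the count.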
* [GlobalISel][X86] Use correct AND instructions. | Igor Breger | 2017-07-11 | 1 | -1/+1
  AND8ri8 not supported in 64bit.
  llvm-svn: 307630
* [CGP] Relax a bit restriction for optimizeMemoryInst to extend scope | Serguei Katkov | 2017-07-11 | 1 | -0/+25
  CodeGenPrepare::optimizeMemoryInst contains a check that we do nothing if all instructions computing the address for a memory instruction are in the same block as the memory instruction itself. However, if any of these instructions are placed after the memory instruction, then the address calculation will not be folded into the memory instruction. The added test case shows an example.
  Reviewers: loladiro, spatel, efriedma
  Reviewed By: efriedma
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34862
  llvm-svn: 307628
* [AVR] Use the generic branch relaxer | Dylan McKay | 2017-07-11 | 4 | -7/+104
  llvm-svn: 307617
* Revert "[DAG] Improve Aliasing of operations to static alloca" | Matthias Braun | 2017-07-10 | 23 | -152/+119
  Reverting as it breaks tramp3d-v4 in the llvm test-suite. I added some comments to https://reviews.llvm.org/D33345 about it.
  This reverts commit r307546.
  llvm-svn: 307589
* AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes | Matt Arsenault | 2017-07-10 | 2 | -4/+163
  llvm-svn: 307576
* AMDGPU: Allow SIShrinkInstructions to work in non-SSA | Matt Arsenault | 2017-07-10 | 68 | -430/+470
  Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate.
  llvm-svn: 307575
* [Hexagon] Fix check for HMOTF_ConstExtend operand flag | Krzysztof Parzyszek | 2017-07-10 | 1 | -0/+24
  This fixes https://llvm.org/PR33718.
  llvm-svn: 307566
* [Hexagon] Handle Hexagon-specific machine operand target flags in MIR | Krzysztof Parzyszek | 2017-07-10 | 1 | -0/+36
  llvm-svn: 307564
* [PPC CodeGen] Expand the bitreverse.i64 intrinsic. | Tony Jiang | 2017-07-10 | 2 | -0/+161
  Differential Revision: https://reviews.llvm.org/D34908
  Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093
  llvm-svn: 307563
* [PowerPC] Reduce register pressure by not materializing a constant just for use as an index register for X-Form loads/stores. | Lei Huang | 2017-07-10 | 4 | -12/+49
  For this example:
      float test (int *arr) {
        return arr[2];
      }

  We currently generate the following code:
      li r4, 8
      lxsiwax f0, r3, r4
      xscvsxdsp f1, f0

  With this patch, we will now generate:
      addi r3, r3, 8
      lxsiwax f0, 0, r3
      xscvsxdsp f1, f0

  Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204
  Differential Revision: https://reviews.llvm.org/D35027
  llvm-svn: 307553
* [X86] Model 256-bit AVX instructions in the AMD Jaguar scheduler Part-1 (PR28573). | Andrew V. Tischenko | 2017-07-10 | 3 | -161/+161
  The new version of the model is definitely faster.
  Differential Revision: https://reviews.llvm.org/D35198
  llvm-svn: 307552
* [DAG] Improve Aliasing of operations to static alloca | Nirav Dave | 2017-07-10 | 23 | -119/+152
  Memory accesses offset from frame indices may alias, e.g., we may merge writes from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer.

  Static allocas are realized as stack objects in SelectionDAG, but their offsets are not set until post-DAG, causing DAGCombiner's alias check to consider accesses to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing.

  Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll, which has a minor degradation even though the pre-legalized DAG is improved.

  Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand
  Reviewed By: rnk
  Subscribers: sdardis, nemanjai, javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D33345
  llvm-svn: 307546
* This patch completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target. | Gadi Haber | 2017-07-10 | 12 | -896/+896
  The SandyBridge architects have provided us with more accurate information about each instruction's latency, number of uOPs and used ports, and I used it to replace the existing estimated SNB instruction scheduling and to add missing scheduling information.

  Please note that the patch extensively affects the X86 MC instr scheduling for SNB. Also note that this patch will be followed by additional patches for the remaining target architectures HSW, IVB, BDW, SKL and SKX.

  The updated and extended information about each instruction includes the following details:
  • static latency of the instruction
  • number of uOps the instruction consists of
  • all ports used by the instruction's uOPs

  For example, the following code dictates that instructions ADC64mr, ADC8mr, SBB64mr and SBB8mr have a static latency of 9 cycles. Each of these instructions is decoded into 6 micro operations which use port 4, ports 2 or 3, port 0, and ports 0 or 1 or 5:

      def SBWriteResGroup94 : SchedWriteRes<[SBPort4,SBPort23,SBPort0,SBPort015]> {
        let Latency = 9;
        let NumMicroOps = 6;
        let ResourceCycles = [1,2,2,1];
      }
      def: InstRW<[SBWriteResGroup94], (instregex "ADC64mr")>;
      def: InstRW<[SBWriteResGroup94], (instregex "ADC8mr")>;
      def: InstRW<[SBWriteResGroup94], (instregex "SBB64mr")>;
      def: InstRW<[SBWriteResGroup94], (instregex "SBB8mr")>;

  Note that apart from the header, most of the X86SchedSandyBridge.td file was generated by a script.

  Reviewers: zvi, chandlerc, RKSimon, m_zuckerman, craig.topper, igorb
  Differential Revision: https://reviews.llvm.org/D35019#inline-304691
  llvm-svn: 307529
* [GlobalISel][X86] Support G_LOAD/G_STORE i1. | Igor Breger | 2017-07-10 | 3 | -0/+52
  Summary: Support G_LOAD/G_STORE i1.
  Reviewers: zvi, guyblank
  Reviewed By: guyblank
  Subscribers: rovka, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D35178
  llvm-svn: 307527
* [GlobalISel][X86] extend G_ZEXT support. | Igor Breger | 2017-07-10 | 3 | -1/+270
  Summary: Mark G_ZEXT/G_SEXT i1 to i8/i16, i8 to i16 as legal. Support G_ZEXT i1 to i8/i16 instruction selection (C++ code). This patch is required to support G_LOAD/G_STORE i1.
  Reviewers: zvi, guyblank
  Reviewed By: guyblank
  Subscribers: rovka, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D35177
  llvm-svn: 307526
* [X86] Relax an assertion when legalizing vector types. | Davide Italiano | 2017-07-09 | 1 | -0/+16
  WidenVSELECTAndMask can fold (and it folds in this case), so we get a BUILD_VECTOR of constants as the mask. convertMask() seems to work fine when the input is a vector of constants, and we still need to call it to extend/add elements at the end, but the current code just asserts on anything but a SETCC or AND/OR/XOR of 2xSETCC. This change was discussed briefly with Simon Pilgrim, who also suggests we might consider dropping this assertion in the future.
  Fixes PR33715.
  llvm-svn: 307508
* [AVR] Fix test errors due to tied operands not matching | Dylan McKay | 2017-07-09 | 5 | -7/+7
  Broken due to r307259.
  llvm-svn: 307503
* Handle ConstantExpr correctly in SelectionDAGBuilder | Simon Pilgrim | 2017-07-09 | 1 | -0/+18
  This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue.
  Fixes PR#33094.
  Submitted on behalf of @Praetonus (Benoit Vey)
  Differential Revision: https://reviews.llvm.org/D34538
  llvm-svn: 307502
* [X86][AVX512] Regenerate AVX512VL comparison tests. | Simon Pilgrim | 2017-07-09 | 2 | -4686/+47145
  Show poor codegen on KNL targets as mentioned on D35179
  llvm-svn: 307500
* [GlobalISel][X86] Add legalizer tests for G_LOAD/G_STORE operations. NFC. | Igor Breger | 2017-07-09 | 1 | -0/+100
  llvm-svn: 307494
* [FastISel] fix a fallback diagnostic. | Igor Breger | 2017-07-09 | 1 | -1/+18
  Summary: FastISel was marked as failed in case instruction selection succeeded.
  Reviewers: qcolombet, zvi, rovka, ab
  Reviewed By: zvi
  Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits
  Differential Revision: https://reviews.llvm.org/D34438
  llvm-svn: 307489
* fix trivial typos; NFC | Hiroshi Inoue | 2017-07-09 | 1 | -1/+1
  sucessor -> successor
  llvm-svn: 307488
* [x86] add SBB optimization for SETBE (ule) condition code | Sanjay Patel | 2017-07-08 | 1 | -4/+2
  x86 scalar select-of-constants (Cond ? C1 : C2) combining/lowering is a mess with missing optimizations. We handle some patterns, but miss logical variants.

  To clean that up, we should convert all select-of-constants to logic/math and enhance the combining for the expected patterns from that. Selecting 0 or -1 needs extra attention to produce the optimal code as shown here.

  Attempt to verify that all of these IR forms are logically equivalent: http://rise4fun.com/Alive/plxs

  Earlier steps in this series:
    rL306040
    rL306072
    rL307404 (D34652)

  As acknowledged in the earlier review, there's a possibility that some Intel uarch would prefer to produce an xor to clear the fake register operand with sbb %eax, %eax. This will likely need to be addressed in a separate pass.
  llvm-svn: 307471
* [RegAllocFast] Don't insert kill flags of super-register for partial kill | Quentin Colombet | 2017-07-07 | 1 | -0/+34
  When reusing a register for a new definition, the fast register allocator used to insert a kill flag at the previous last use of that register to inform later passes that this register is free between the redef and the last use. However, this may be wrong when subregisters are involved. Indeed, a partial redef would have triggered a kill of the full super register, potentially wrongly marking all the other subregisters as free. Given we don't track which lanes are still live, we cannot set the kill flag in such a case.
  Note: This bug has been latent for about 7 years (r104056).
  llvmg.org/PR33677
  llvm-svn: 307428