summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [XRay] Reduce synthetic references emitted by XRayDean Michael Berris2017-06-218-47/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When we're building with XRay instrumentation, we use a trick that preserves references from the function to a function sled index. This index table lives in a separate section, and without this trick the linker is free to garbage-collect this section and all the segments it refers to. Until we're able to tell the linkers to preserve these sections, we use this reference trick to keep around both the index and the entries in the instrumentation map. Before this change we emitted both a synthetic reference to the label in the instrumentation map, and to the entry in the function map index. This change removes the first synthetic reference and only emits one synthetic reference to the index -- the index entry has the references to the labels in the instrumentation map, so the linker will still preserve those if the function itself is preserved. This reduces the amount of synthetic references we emit from 16 bytes to just 8 bytes in x86_64, and similarly to other platforms. Reviewers: dblaikie Subscribers: javed.absar, kpw, pelikan, llvm-commits Differential Revision: https://reviews.llvm.org/D34340 llvm-svn: 305880
* [ImplicitNullChecks] Uphold an invariant in areMemoryOpsAliasedSerguei Katkov2017-06-211-0/+293
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now areMemoryOpsAliased has an assertion justified as: MMO1 should have a value due it comes from operation we'd like to use as implicit null check. assert(MMO1->getValue() && "MMO1 should have a Value!"); However, it is possible for that invariant to not be upheld in the following situation (conceptually): Null check %RAX NotNullSucc: %RAX = LEA %RSP, 16 // I0 %RDX = MOV64rm %RAX // I1 With the current code, we will have an early exit from ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable. However, I1 will look plausible (since it loads from %RAX) and will go ahead and call areMemoryOpsAliased(I1, I0). This will cause us to fail the assert mentioned above since I1 does not load from an IR level value and thus is allowed to have a non-Value base address. The fix is to bail out earlier whenever we see an unsuitable instruction overwrite PointerReg. This would guarantee that when we call areMemoryOpsAliased, we're guaranteed to be looking at an instruction that loads from or stores to an IR level value. Original Patch Author: sanjoy Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34385 llvm-svn: 305879
* [AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32Stanislav Mekhanoshin2017-06-201-0/+101
| | | | | | | | | If there is an immediate operand we shall not shrink V_SUBB_U32 and V_ADDC_U32, it does not fit e32 encoding. Differential Revison: https://reviews.llvm.org/D34291 llvm-svn: 305840
* [GISel]: Add G_FMA opcode for fused multiply addsAditya Nandakumar2017-06-201-0/+12
| | | | | | | | https://reviews.llvm.org/D34372 Reviewed by dsanders llvm-svn: 305824
* AMDGPU: Do operand folding in program orderMatt Arsenault2017-06-201-0/+47
| | | | | | | | | Before it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so on the second attempt it will fail expecting to see a register to fold. llvm-svn: 305821
* RegisterScavenging: Followup to r305625Matthias Braun2017-06-206-45/+45
| | | | | | | | | | | | | This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitset and just need 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. llvm-svn: 305817
* AMDGPU: Preserve undef when folding register operandsMatt Arsenault2017-06-201-0/+6
| | | | | | | | If the source was a copy of an undef register, this would produce a read of an undefined register which is a verifier error. llvm-svn: 305816
* [AMDGPU] Eliminate SGPR to VGPR copy when possibleStanislav Mekhanoshin2017-06-206-8/+349
| | | | | | | | SGPRs are generally cheaper, so try to use them over VGPRs. Differential Revision: https://reviews.llvm.org/D34130 llvm-svn: 305815
* AMDGPU: Fix crash with undef vreg input operandMatt Arsenault2017-06-201-0/+21
| | | | llvm-svn: 305814
* [x86] enable CGP memcmp() expansion for 2/4/8 byte sizesSanjay Patel2017-06-201-23/+85
| | | | | | | | | There are a couple of potential improvements as seen in the IR and asm: 1. We're unnecessarily extending to a larger type to compare values. 2. The codegen for (select cond, 1, -1) could avoid a cmov. (or we could change the order of the compares, so we have a select with 0 operand) llvm-svn: 305802
* [X86][SSE] Relax 0/-1 vector element insertion to work for any vector with ↵Simon Pilgrim2017-06-202-31/+7
| | | | | | | | >=16bit elements Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this llvm-svn: 305801
* DAG: correctly legalize UMULO.Tim Northover2017-06-201-0/+16
| | | | | | | | | We were incorrectly sign extending into the high word (as you would for SMULO) when legalizing UMULO in terms of a wider full multiplication. Patch by James Duley. llvm-svn: 305800
* [globalisel][tablegen] Add support for COPY_TO_REGCLASS.Daniel Sanders2017-06-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: As part of this * Emitted instructions now have named MachineInstr variables associated with them. This isn't particularly important yet but it's a small step towards multiple-insn emission. * constrainSelectedInstRegOperands() is no longer hardcoded. It's now added as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses an alternate constraint mechanism ConstrainOperandToRegClassAction() which supports arbitrary constraints such as that defined by COPY_TO_REGCLASS. Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: ab Subscribers: javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33590 llvm-svn: 305791
* Fixed test name. NFCI.Simon Pilgrim2017-06-201-7/+7
| | | | llvm-svn: 305787
* [GlobalISel][X86] Get correct RegClass for given RegBank.Igor Breger2017-06-201-6/+18
| | | | | | | | | | | | | | | | | | | Summary: In some cases RegClass depends on target feature. Hight (16-31) vector registers exist only if AVX512f available. Split from https://reviews.llvm.org/D33665 Reviewers: qcolombet, t.p.northover, zvi, guyblank Reviewed By: t.p.northover, guyblank Subscribers: guyblank, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33952 Conflicts: test/CodeGen/X86/GlobalISel/select-memop-scalar.mir llvm-svn: 305784
* [GlobalISel] combine not symmetric merge/unmerge nodes.Igor Breger2017-06-201-6/+6
| | | | | | | | | | | | | | | | Summary: In some cases legalization ends up with not symmetric merge/unmerge nodes. Transform it to merge/unmerge nodes. Reviewers: t.p.northover, qcolombet, zvi Reviewed By: t.p.northover Subscribers: rovka, kristof.beyls, guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D33626 llvm-svn: 305783
* [GlobalISel][X86] add legalizer mir tests. NFCIgor Breger2017-06-202-73/+205
| | | | llvm-svn: 305781
* [ARM] Support constant pools in data when generating execute-only code.Alexandros Lamprineas2017-06-201-0/+50
| | | | | | | | | | | | | | | | | | Resubmission of r305387, which was reverted at r305390. The Address Sanitizer caught a stack-use-after-scope of a Twine variable. This is now fixed by passing the Twine directly as a function parameter. The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correct with the position-independent addressing modes that support execute-only code. Differential Revision: https://reviews.llvm.org/D33773 llvm-svn: 305776
* AMDGPU: Fix scratch wave offset relative FI expansionMatt Arsenault2017-06-191-7/+47
| | | | | | | | The offset may not be an inline immediate, so this needs to be materialized into a register. The post-RA run of SIShrinkInstructions is able to fold it later if it can. llvm-svn: 305761
* [AMDGPU] Add infer address spaces pass before SROAStanislav Mekhanoshin2017-06-191-0/+10
| | | | | | | | | It adds it for the target after inlining but before SROA where we can get most out of it. Differential Revision: https://reviews.llvm.org/D34366 llvm-svn: 305759
* Fix machine instruction in test caseSanjoy Das2017-06-191-3/+3
| | | | | | | | | | The AMD64rm instruction used in the test case was incorrect. Since the first input register to AND64rm is tied to output register, they must be the same. Thanks for Jesper Antonsson for pointing this out! llvm-svn: 305756
* Revert r305382, it caused PR33513.Nico Weber2017-06-191-20/+20
| | | | llvm-svn: 305735
* [CGP, PowerPC] try to constant fold before creating loads for memcmp expansionSanjay Patel2017-06-191-25/+8
| | | | | | | | | | | | | | | This is the last step needed to avoid regressions for x86 before we flip the switch to allow expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, so we need to do that here too. FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 constant tests because its alignment requirements are too strict (shouldn't require alignment for a constant operand). Differential Revision: https://reviews.llvm.org/D34071 llvm-svn: 305734
* Revert r304824 "Fix PR23384 (part 3 of 3)"Hans Wennborg2017-06-197-67/+67
| | | | | | | | | | | | | | | | | This seems to be interacting badly with ASan somehow, causing false reports of heap-buffer overflows: PR33514. > Summary: > The patch makes instruction count the highest priority for > LSR solution for X86 (previously registers had highest priority). > > Reviewers: qcolombet > > Differential Revision: http://reviews.llvm.org/D30562 > > From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 305720
* Add test for store merge with noimplicitfloatNirav Dave2017-06-191-0/+23
| | | | llvm-svn: 305697
* AMDGPU/GlobalISel: Mark G_BITCAST s32 <--> <2 x s16> legalTom Stellard2017-06-191-0/+23
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34129 llvm-svn: 305692
* [GlobalISel][X86] Fold FI/G_GEP into LDR/STR instruction addressing mode.Igor Breger2017-06-197-171/+206
| | | | | | | | | | | | | | Summary: Implement some of the simplest addressing modes.It should help to test ABI. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33888 llvm-svn: 305691
* [ARM] GlobalISel: Support G_ICMP for s8 and s16Diana Picus2017-06-192-3/+74
| | | | | | Widen to s32 (like all other binary ops). llvm-svn: 305683
* [ARM] GlobalISel: Support G_ICMP for i32 and pointersDiana Picus2017-06-194-0/+457
| | | | | | | | | | | | | | Add support throughout the pipeline: - mark as legal for s32 and pointers - map to GPRs - lower to a sequence of instructions, which moves 0 or 1 into the result register based on the flags set by a CMPrr We have copied from FastISel a helper function which maps CmpInst predicates into ARMCC codes. Ideally, we should be able to move it somewhere that both FastISel and GlobalISel can use. llvm-svn: 305672
* [X86] Simplify vector-shuffle-v48 test. NFC.Guy Blank2017-06-191-42/+39
| | | | llvm-svn: 305670
* [x86] specify triples and auto-generate complete checks; NFCSanjay Patel2017-06-183-80/+174
| | | | llvm-svn: 305656
* [x86] specify triples and auto-generate complete checks; NFCSanjay Patel2017-06-184-122/+244
| | | | llvm-svn: 305655
* [x86] specify triple and auto-generate checks; NFCSanjay Patel2017-06-183-86/+182
| | | | llvm-svn: 305654
* x86] adjust test constants to maintain coverage; NFCSanjay Patel2017-06-184-33/+33
| | | | | | Increment (add 1) could be transformed to sub -1, and we'd lose coverage for these patterns. llvm-svn: 305646
* [x86] adjust test constants to maintain coverage; NFCSanjay Patel2017-06-181-7/+7
| | | | | | Increment (add 1) could be transformed to sub -1, and we'd lose coverage for these patterns. llvm-svn: 305645
* [x86] adjust test constants to maintain coverage; NFCSanjay Patel2017-06-181-10/+10
| | | | | | Increment (add 1) could be transformed to sub -1, and we'd lose coverage for these patterns. llvm-svn: 305644
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2017-06-1712-117/+238
| | | | | | | | | | | | | | | | | | | Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse to place spills as the very first instruciton of a basic block and thus artifically increase pressure (test in test/CodeGen/PowerPC/scavenging.mir:spill_at_begin) This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305625
* [SelectionDAG] Update Loop info after splitting critical edges.Davide Italiano2017-06-171-0/+27
| | | | | | The analysis is expected to be preserved by SelectionDAG. llvm-svn: 305621
* [WebAssembly] Use __stack_pointer global when writing wasm binarySam Clegg2017-06-164-46/+46
| | | | | | | | | | | | | | | | | | This ensures that symbolic relocations are generated for stack pointer manipulations. These relocations are of type R_WEBASSEMBLY_GLOBAL_INDEX_LEB. This change also adds support for reading relocations of this type in WasmObjectFile.cpp. Since its a globally imported symbol this does mean that the get_global/set_global instruction won't be valid until the objects are linked that global used in no longer an imported global. Differential Revision: https://reviews.llvm.org/D34172 llvm-svn: 305616
* Revert "RegScavenging: Add scavengeRegisterBackwards()"Matthias Braun2017-06-1612-181/+117
| | | | | | | | | Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. llvm-svn: 305566
* bpf: avoid load from read-only sectionsYonghong Song2017-06-164-0/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | If users tried to have a structure decl/init code like below struct test_t t = { .memeber1 = 45 }; It is very likely that compiler will generate a readonly section to hold up the init values for variable t. Later load of t members, e.g., t.member1 will result in a read from readonly section. BPF program cannot handle relocation. This will force users to write: struct test_t t = {}; t.member1 = 45; This is just inconvenient and unintuitive. This patch addresses this issue by implementing BPF PreprocessISelDAG. For any load from a global constant structure or an global array of constant struct, it attempts to translate it into a constant directly. The traversal of the constant struct and other constant data structures are similar to where the assembler emits read-only sections. Four different unit test cases are also added to cover different scenarios. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305560
* [Atomics] Rename and change prototype for atomic memcpy intrinsicDaniel Neilson2017-06-161-24/+21
| | | | | | | | | | | | | | | | | | Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558
* Revert "[mips][microMIPS] Extending size reduction pass with ADDIUSP and ↵Simon Dardis2017-06-161-17/+0
| | | | | | | | | ADDIUR1SP" This reverts commit r305455. This commit was reported as breaking one of the sanitizer buildbots. Reverting until lab.llvm.org comes back online. llvm-svn: 305557
* [Hexagon] Don't kill live registers when creating mux out of tfrKrzysztof Parzyszek2017-06-161-0/+17
| | | | | | | | | The second part of r305300: when placing the mux at the later location, make sure that it won't use any register that was killed between the two original instructions. Remove any such kills and transfer them to the mux. llvm-svn: 305553
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2017-06-1512-117/+181
| | | | | | | | | | | | | | | | | | Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305516
* DivergencyAnalysis patch for reviewAlexander Timofeev2017-06-152-0/+54
| | | | llvm-svn: 305494
* [MachineLICM] Hoist TOC-based address instructionsLei Huang2017-06-151-0/+110
| | | | | | | | | | | | | | | | | | Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
* ISel: Fix FastISel of swifterror valuesArnold Schwaighofer2017-06-153-0/+163
| | | | | | | | | | | | The code assumed that we process instructions in basic block order. FastISel processes instructions in reverse basic block order. We need to pre-assign virtual registers before selecting otherwise we get def-use relationships wrong. This only affects code with swifterror registers. rdar://32659327 llvm-svn: 305484
* [PowerPC] fix potential verification errors on CFENCE8Hiroshi Inoue2017-06-153-12/+12
| | | | | | | | This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs. Differential Revision: https://reviews.llvm.org/D34208 llvm-svn: 305479
* [X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was ↵Simon Pilgrim2017-06-151-0/+18
| | | | | | | | | | assuming v8i16 vectors We can use this with v16i16/v32i16 as well. Found during fuzz testing. llvm-svn: 305472
OpenPOWER on IntegriCloud