summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
* Migrate function attribute "no-frame-pointer-elim"="false" to ↵Fangrui Song2019-12-242-2/+2
| | | | "frame-pointer"="none" as cleanups after D56351
* [GlobalISel]: Allow targets to override how to widen constants during ↵Aditya Nandakumar2019-12-034-5/+5
| | | | | | | | | | | | | | | legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.
* [globalisel] Rename G_GEP to G_PTR_ADDDaniel Sanders2019-11-0510-48/+48
| | | | | | | | | | | | | | | | Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734
* [GlobalISel][AArch64][AMDGPU][X86] Teach LegalizationArtifactCombiner to ↵Craig Topper2019-10-249-69/+42
| | | | | | | | combine trunc(g_constant). This allows X86 to properly form shift by immediate instructions since we require an 8-bit constant to match the imported SelectionDAG patterns.
* [Test] Fix inconsistency in alignment in test casePhilip Reames2019-10-031-7/+7
| | | | | | The IR was using a fixed 8 byte alignment, but the MIR portion was using native alignment. Since the test doesn't appear to be deliberately testing overalignment, just make the IR match the MIR. llvm-svn: 373658
* Add an operand to memory intrinsics to denote the "tail" marker.Amara Emerson2019-09-281-9/+9
| | | | | | | | | | | | | | We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller. This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it. Differential Revision: https://reviews.llvm.org/D68151 llvm-svn: 373140
* [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes ↵Guillaume Chatelet2019-09-11126-659/+659
| | | | | | | | | | | | | | | | | | | | | | mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference, This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608
* [GlobalISel] Import patterns containing SUBREG_TO_REGJessica Paquette2019-08-283-5/+11
| | | | | | | | | | | | | | | | | | | | | | | Reuse the logic for INSERT_SUBREG to also import SUBREG_TO_REG patterns. - Split `inferSuperRegisterClass` into two functions, one which tries to use an existing TreePatternNode (`inferSuperRegisterClassForNode`), and one that doesn't. SUBREG_TO_REG doesn't have a node to leverage, which is the cause for the split. - Rename GlobalISelEmitterInsertSubreg.td to GlobalISelEmitterSubreg.td and update it. - Update impacted tests in the AArch64 and X86 backends. This is kind of a hit/miss for code size improvements/regressions. E.g. in add-ext.ll, we now get some identity copies. This isn't really anything the importer can handle, since it's caused by a later pass introducing the copy for the sake of correctness. Differential Revision: https://reviews.llvm.org/D66769 llvm-svn: 370254
* Recommit "[GlobalISel] Import patterns containing INSERT_SUBREG"Jessica Paquette2019-08-274-17/+24
| | | | | | | | | | I thought `llvm::sort` was stable for some reason but it's not. Use `llvm::stable_sort` in `CodeGenTarget::getSuperRegForSubReg`. Original patch: https://reviews.llvm.org/D66498 llvm-svn: 370084
* Revert "[GlobalISel] Import patterns containing INSERT_SUBREG"Jessica Paquette2019-08-274-24/+17
| | | | | | | | | | When EXPENSIVE_CHECKS are enabled, GlobalISelEmitterSubreg.td doesn't get stable output. Reverting while I debug it. See: https://reviews.llvm.org/D66498 llvm-svn: 370080
* [GlobalISel] Import patterns containing INSERT_SUBREGJessica Paquette2019-08-264-17/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This teaches the importer to handle INSERT_SUBREG instructions. We were missing patterns involving INSERT_SUBREG in AArch64. It appears in AArch64InstrInfo.td 107 times, and 14 times in AArch64InstrFormats.td. To meaningfully import it, the GlobalISelEmitter needs to know how to infer a super register class for a given register class. This patch introduces the following: - `getSuperRegForSubReg`, a function which finds the largest register class which supports a value type and subregister index - `inferSuperRegisterClass`, a function which finds the appropriate super register class for an INSERT_SUBREG' - `inferRegClassFromPattern`, a function which allows for some trivial lookthrough into instructions - `getRegClassFromLeaf`, a helper function which returns the register class for a leaf `TreePatternNode` - Support for subregister index operands in `importExplicitUseRenderer` It also - Updates tests in each backend which are impacted by the change - Adds GlobalISelEmitterSubreg.td to test that we import and skip the expected patterns As a result of this patch, INSERT_SUBREG patterns in X86 may use the LOW32_ADDR_ACCESS_RBP register class instead of GR32. This is correct, since the register class contains the same registers as GR32 (except with the addition of RBP). So, this also teaches X86 to handle that register class. This is in line with X86ISelLowering, which treats this as a GR class. Differential Revision: https://reviews.llvm.org/D66498 llvm-svn: 369973
* GlobalISel: Don't create G_UADDE with constant false carry inMatt Arsenault2019-08-222-8/+10
| | | | | | | | The x86 tests are now broken (in paticular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673
* [globalisel] Add G_SEXT_INREGDaniel Sanders2019-08-092-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487
* [GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs ↵Amara Emerson2019-07-191-54/+9
| | | | | | | | | | | | | | and legalize later. I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then have each target specify that using the new custom legalizer for intrinsics hook that they want it expanded it a libcall. Differential Revision: https://reviews.llvm.org/D64895 llvm-svn: 366516
* [AArch64][GlobalISel] Make s8 and s16 G_CONSTANTs legal.Amara Emerson2019-06-211-6/+4
| | | | | | | | | | | | | | | | | | | | | We sometimes get poor code size because constants of types < 32b are legalized as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the localizer can no longer sink them (although it's possible to extend it to do so). On AArch64 however s8 and s16 constants can be selected in the same way as s32 constants, with a mov pseudo into a W register. If we make s8 and s16 constants legal then we can avoid unnecessary truncates, they can be CSE'd, and the localizer can sink them as normal. There is a caveat: if the user of a smaller constant has to widen the sources, we end up with an anyext of the smaller typed G_CONSTANT. This can cause regressions because of the additional extend and missed pattern matching. To remedy this, there's a new artifact combiner to generate the wider G_CONSTANT if it's legal for the target. Differential Revision: https://reviews.llvm.org/D63587 llvm-svn: 364075
* [X86] Introduce new MOVSSrm/MOVSDrm opcodes that use VR128 register class.Craig Topper2019-06-181-20/+20
| | | | | | | | | | | | | | | | | | | | | | Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt. Use the new versions in patterns that previously used a COPY_TO_REGCLASS to VR128. These patterns expect the upper bits to be zero. The current set up appears to work, but I'm not sure we should be enforcing upper bits being zero through a COPY_TO_REGCLASS. I wanted to flip the arrangement and use a COPY_TO_REGCLASS to FR32/FR64 for the patterns that need an f32/f64 result, but that complicated fastisel and globalisel. I've been doing some experiments with reducing some isel patterns and ended up in a situation where I had a (SUBREG_TO_REG (COPY_TO_RECLASS (VMOVSSrm), VR128)) and our post-isel peephole was unable to avoid using an instruction for the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128 instruction removes the COPY_TO_REGCLASS that was breaking this. llvm-svn: 363643
* Describe stack-id as an enumSander de Smalen2019-06-1712-43/+43
| | | | | | | | | | | | | | | | | This patch changes MIR stack-id from an integer to an enum, and adds printing/parsing support for this in MIR files. The default stack-id '0' is now renamed to 'default'. This should make MIR tests that have stack objects with different stack-ids more descriptive. It also clarifies code operating on StackID. Reviewers: arsenm, thegameg, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60137 llvm-svn: 363533
* [X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. ↵Craig Topper2019-05-254-15/+15
| | | | | | | | | | | | | | Support LEA64_32r properly. INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags. This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg. One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think its important. Not sure which operand its supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input. Differential Revision: https://reviews.llvm.org/D61472 llvm-svn: 361691
* [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with ↵Amara Emerson2019-04-155-87/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369
* [X86] Merge the different Jcc instructions for each condition code into ↵Craig Topper2019-04-052-7/+7
| | | | | | | | | | | | | | | | | | | | | single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon Reviewed By: RKSimon Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60228 llvm-svn: 357802
* [X86] Merge the different SETcc instructions for each condition code into ↵Craig Topper2019-04-053-98/+98
| | | | | | | | | | | | | | | | | | | | | single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri Reviewed By: andreadb Subscribers: hiraditya, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60138 llvm-svn: 357801
* [X86][GlobalISEL] Support lowering aligned unordered atomicsPhilip Reames2019-03-151-0/+952
| | | | | | | | The existing lowering code is accidentally correct for unordered atomics as far as I can tell. An unordered atomic has no memory ordering, and simply requires the actual load or store to be done as a single well aligned instruction. As such, relax the restriction while adding tests to ensure the lowering remains correct in the future. Differential Revision: https://reviews.llvm.org/D57803 llvm-svn: 356280
* [GlobalISel][Utils] Add a getConstantVRegVal variant that looks through instrsQuentin Colombet2019-03-143-21/+11
| | | | | | | | | | | | | | | | | | | getConstantVRegVal used to only look for G_CONSTANT when looking at unboxing the value of a vreg. However, constants are sometimes not directly used and are hidden behind trunc, s|zext or copy chain of computation. In particular this may be introduced by the legalization process that doesn't want to simplify these patterns because it can lead to infine loop when legalizing a constant. To circumvent that problem, add a new variant of getConstantVRegVal, named getConstantVRegValWithLookThrough, that allow to look through extensions. Differential Revision: https://reviews.llvm.org/D59227 llvm-svn: 356116
* [X86] Regenerate legalize test filesSimon Pilgrim2019-03-012-84/+151
| | | | | | Noticed while getting update_mir_test_checks.py to work on python3 llvm-svn: 355198
* GlobalISel: Consolidate load/store legalizationMatt Arsenault2019-02-051-1/+1
| | | | | | | | | | The fewerElementsVectors implementation for load/stores handles the scalar reduction case just as well, so drop the redundant code in narrowScalar. This also introduces support for narrowing irregular size breakdowns for scalars. llvm-svn: 353125
* GlobalISel: Enforce operand types for constantsMatt Arsenault2019-02-043-18/+18
| | | | | | | | A number of of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113
* GlobalISel: Fix creating MMOs with align 0Matt Arsenault2019-01-311-73/+73
| | | | llvm-svn: 352712
* MIR: Reject non-power-of-4 alignments in MMO parsingMatt Arsenault2019-01-3010-104/+104
| | | | llvm-svn: 352686
* GlobalISel: Verify memory size for load/storeMatt Arsenault2019-01-306-148/+210
| | | | llvm-svn: 352578
* [AArch64][GlobalISel] Add some missing vector support for FP arithmetic ops.Amara Emerson2019-01-281-0/+48
| | | | | | | Moved the fneg lowering legalization test from AArch64 to X86, as we want to specify that it's already legal. llvm-svn: 352338
* [GISel]: Change how CSE is enabled by default for each passAditya Nandakumar2019-01-2412-17/+17
| | | | | | | | | | | | | | | https://reviews.llvm.org/D57178 Now add a hook in TargetPassConfig to query if CSE needs to be enabled. By default this hook returns false only for O0 opt level but this can be overridden by the target. As a consequence of the default of enabled for non O0, a few tests needed to be updated to not use CSE (by passing in -O0) to the run line. reviewed by: arsenm llvm-svn: 352126
* GlobalISel: Allow shift amount to be a different typeMatt Arsenault2019-01-2211-199/+131
| | | | | | | | | For AMDGPU the shift amount is never 64-bit, and this needs to use a 32-bit shift. X86 uses i8, but seemed to be hacking around this before. llvm-svn: 351882
* Remove irrelevant references to legacy git repositories fromJames Y Knight2019-01-152-2/+2
| | | | | | | | | compiler identification lines in test-cases. (Doing so only because it's then easier to search for references which are actually important and need fixing.) llvm-svn: 351200
* [x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEASanjay Patel2018-12-123-6/+8
| | | | | | | | | | This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. That allows us to combine add ops with register moves and save some instructions. This is another step towards allowing add truncation in generic DAGCombiner (see D54640). Differential Revision: https://reviews.llvm.org/D55494 llvm-svn: 348946
* [GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes.Amara Emerson2018-12-105-31/+31
| | | | | | | | | | | | This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788
* [x86] don't try to convert add with undef operands to LEASanjay Patel2018-12-091-3/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing code tries to handle an undef operand while transforming an add to an LEA, but it's incomplete because we will crash on the i16 test with the debug output shown below. It's better to just give up instead. Really, GlobalIsel should have folded these before we could get into trouble. # Machine code for function add_undef_i16: NoPHIs, TracksLiveness, Legalized, RegBankSelected, Selected bb.0 (%ir-block.0): liveins: $edi %1:gr32 = COPY killed $edi %0:gr16 = COPY %1.sub_16bit:gr32 %5:gr64_nosp = IMPLICIT_DEF %5.sub_16bit:gr64_nosp = COPY %0:gr16 %6:gr64_nosp = IMPLICIT_DEF %6.sub_16bit:gr64_nosp = COPY %2:gr16 %4:gr32 = LEA64_32r killed %5:gr64_nosp, 1, killed %6:gr64_nosp, 0, $noreg %3:gr16 = COPY killed %4.sub_16bit:gr32 $ax = COPY killed %3:gr16 RET 0, implicit killed $ax # End machine code for function add_undef_i16. *** Bad machine code: Reading virtual register without a def *** - function: add_undef_i16 - basic block: %bb.0 (0x7fe6cd83d940) - instruction: %6.sub_16bit:gr64_nosp = COPY %2:gr16 - operand 1: %2:gr16 LLVM ERROR: Found 1 machine code errors. Differential Revision: https://reviews.llvm.org/D54710 llvm-svn: 348722
* Revert r345165 "[X86] Bring back the MOV64r0 pseudo instruction"Craig Topper2018-10-311-1/+1
| | | | | | Google is reporting regressions on some benchmarks. llvm-svn: 345785
* [X86] Bring back the MOV64r0 pseudo instructionCraig Topper2018-10-241-1/+1
| | | | | | | | | | | | This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos. My main motivation is to enable the spill optimization in foldMemoryOperandImpl. As we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code were looking at. There's definitely some other machine CSE(and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling. Differential Revision: https://reviews.llvm.org/D52757 llvm-svn: 345165
* [GlobalISel] Fix the artifact combiner to fold G_IMPLICIT_DEF properlyVolkan Keles2018-10-102-30/+30
| | | | | | | | | | | | | | | | | | | | Summary: GlobalISel generates incorrect code because the legalizer artifact combiner assumes `G_[SZ]EXT (G_IMPLICIT_DEF)` is equivalent to `G_IMPLICIT_DEF `. Replace `G_[SZ]EXT (G_IMPLICIT_DEF)` with 0 because the top bits will be 0 for G_ZEXT and 0/1 for the G_SEXT. Reviewers: aditya_nandakumar, dsanders, aemerson, javed.absar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D52996 llvm-svn: 344163
* [GlobalIsel][X86] Support G_UDIV/G_UREM/G_SREMAlexander Ivchenko2018-10-0813-0/+2985
| | | | | | | | | | Support G_UDIV/G_UREM/G_SREM. The instruction selection code is taken from FastISel with only minor tweaks to adapt for GlobalISel. Differential Revision: https://reviews.llvm.org/D49781 llvm-svn: 343966
* [GISel]: Remove an incorrect assert in CallLoweringAditya Nandakumar2018-09-281-0/+13
| | | | | | | | | | | https://reviews.llvm.org/D51147 Asserting if any extend of vectors should be up to the target's legalizer/target specific code not in CallLowering. reviewed by : dsanders. llvm-svn: 343325
* Fix line-endings. NFCI.Simon Pilgrim2018-09-2019-1082/+1082
| | | | llvm-svn: 342639
* [X86] Handle COPYs of physregs better (regalloc hints)Simon Pilgrim2018-09-1919-1034/+1082
| | | | | | | | | | | | | | Enable enableMultipleCopyHints() on X86. Original Patch by @jonpa: While enabling the mischeduler for SystemZ, it was discovered that for some reason a test needed one extra seemingly needless COPY (test/CodeGen/SystemZ/call-03.ll). The handling for that is resulted in this patch, which improves the register coalescing by providing not just one copy hint, but a sorted list of copy hints. On SystemZ, this gives ~12500 less register moves on SPEC, as well as marginally less spilling. Instead of improving just the SystemZ backend, the improvement has been implemented in common-code (calculateSpillWeightAndHint(). This gives a lot of test failures, but since this should be a general improvement I hope that the involved targets will help and review the test updates. Differential Revision: https://reviews.llvm.org/D38128 llvm-svn: 342578
* Copy utilities updated and added for MI flagsMichael Berg2018-09-191-0/+228
| | | | | | | | | | | | | | Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path. Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw. Reviewers: spatel, wristow, arsenm Reviewed By: arsenm Subscribers: wdng Differential Revision: https://reviews.llvm.org/D52006 llvm-svn: 342576
* Revert r342457 "Fixes removal of dead elements from PressureDiff (PR37252)."Hans Wennborg2018-09-181-2/+2
| | | | | | | | | | | This broke the lit tests on a bunch of buildbots, e.g. http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/36679 > Reviewed By: MatzeB > > Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342482
* Fixes removal of dead elements from PressureDiff (PR37252).Yury Gribov2018-09-181-2/+2
| | | | | | | | Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342457
* [GlobalISel][X86] Add the support for G_FPTRUNCAlexander Ivchenko2018-08-314-0/+137
| | | | | | Differential Revision: https://reviews.llvm.org/D49855 llvm-svn: 341202
* [GlobalISel][X86_64] Support for G_FPTOSIAlexander Ivchenko2018-08-313-0/+892
| | | | | | Differential Revision: https://reviews.llvm.org/D49183 llvm-svn: 341200
* [GlobalIsel][X86] Support for llvm.trap intrinsicAlexander Ivchenko2018-08-311-0/+28
| | | | | | Differential Revision: https://reviews.llvm.org/D49180 llvm-svn: 341199
* [GlobalIsel][X86] Support for G_FCMPAlexander Ivchenko2018-08-313-0/+3800
| | | | | | Differential Revision: https://reviews.llvm.org/D49172 llvm-svn: 341193
OpenPOWER on IntegriCloud