summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86InstrArithmetic.td
Commit message (Collapse)AuthorAgeFilesLines
* [X86][BMI] Pull out schedule classes from bmi_andn<> and bmi_bls<>Simon Pilgrim2019-10-211-5/+5
| | | | | | Stop hardwiring classes llvm-svn: 375470
* [X86] Teach convertToThreeAddress to handle SUB with immediateCraig Topper2019-07-151-7/+8
| | | | | | | | | | | | We mostly avoid sub with immediate but there are a couple cases that can create them. One is the add 128, %rax -> sub -128, %rax trick in isel. The other is when a SUB immediate gets created for a compare where both the flags and the subtract value is used. If we are unable to linearize the SelectionDAG to satisfy the flag user and the sub result user from the same instruction, we will clone the sub immediate for the two uses. The one that produces flags will eventually become a compare. The other will have its flag output dead, and could then be considered for LEA creation. I added additional test cases to add.ll to show the the sub -128 trick gets converted to LEA and a case where we don't need to convert it. This showed up in the current codegen for PR42571. Differential Revision: https://reviews.llvm.org/D64574 llvm-svn: 366151
* [X86] Change IMUL with immediate instruction order to ri8 instructions come ↵Craig Topper2019-04-141-31/+34
| | | | | | | | | | before ri/ri32 instructions. This will ensure IMUL64ri8 is tried before IMUL64ri32. For IMUL32 and IMUL16 the order doesn't really matter because only the ri8 versions use a predicate. That automatically gives them priority. llvm-svn: 358360
* Revert r356688 "[X86] Don't avoid folding multiple use sign extended 8-bit ↵Craig Topper2019-03-251-3/+3
| | | | | | | | immediate into instructions under optsize." Looking back over how the one use optimization works, I don't think this is the right way to fix this. llvm-svn: 356866
* [X86] Don't avoid folding multiple use sign extended 8-bit immediate into ↵Craig Topper2019-03-211-3/+3
| | | | | | | | | | | | instructions under optsize. Under optsize we try to avoid folding immediates into instructions under optsize. But if the immediate is 16-bits or 32 bits, but can be encoded as an 8-bit immediate we don't save enough from disabling the folding unless the immediate has enough uses to make up for the size of the move which is either 3 bytes or 5 bytes since there are no sign extended 8-bit moves. We would also save something if the immediate was a live out of the basic block and thus a move was unavoidable, but that would require a more advanced heuristic than just counting uses. Note we only avoid folding multiple use immediates into the patterns that use X86ISD::ADD/SUB/XOR/OR/AND/CMP/ADC/SBB nodes and not the more common ISD::ADD/SUB/XOR/OR/AND nodes. Differential Revision: https://reviews.llvm.org/D59522 llvm-svn: 356688
* [X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.Craig Topper2019-03-181-1/+1
| | | | | | For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well. llvm-svn: 356406
* [X86] Rename imm8_su/imm16_su/imm32_su to ↵Craig Topper2019-03-181-3/+3
| | | | | | relocImm8_su/relocImm16_su/relocImm32_su/ to accurately reflect what they are. llvm-svn: 356393
* [X86] Allow 8-bit INC/DEC to be converted to LEA.Craig Topper2019-03-051-4/+2
| | | | | | | | We already do this for 16/32/64 as well as 8-bit add with register/immediate. Might as well do it for 8-bit INC/DEC too. Differential Revision: https://reviews.llvm.org/D58869 llvm-svn: 355424
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [X86] Remove X86ISD::INC/DEC. Just select them from X86ISD::ADD/SUB at isel timeCraig Topper2019-01-021-8/+23
| | | | | | | | | | | | | | INC/DEC are pretty much the same as ADD/SUB except that they don't update the C flag. This patch removes the special nodes and just pattern matches from ADD/SUB during isel if the C flag isn't being used. I had to avoid selecting DEC is the result isn't used. This will become a SUB immediate which will turned into a CMP later by optimizeCompareInstr. This lead to the one test change where we use a CMP instead of a DEC for an overflow intrinsic since we only checked the flag. This also exposed a hole in our RMW flag matching use of hasNoCarryFlagUses. Our root node for the match is a store and there's no guarantee that all the flag users have been selected yet. So hasNoCarryFlagUses needs to check copyToReg and machine opcodes, but it also needs to check for the pre-match SETCC, SETCC_CARRY, BRCOND, and CMOV opcodes. Differential Revision: https://reviews.llvm.org/D55975 llvm-svn: 350245
* [X86] Remove the ANDN check from EmitTest.Craig Topper2018-12-241-4/+6
| | | | | | | | Remove the TESTmr isel patterns and add another postprocessing combine for TESTrr+ANDrm->TESTmr. We already have a postprocessing combine for TESTrr+ANDrr->TESTrr. With this we can give ANDN a chance to match first. And clean it up during post processing if we ended up with just a regular AND. This is another step towards my plan to gut EmitTest and do more flag handling during isel matching or by using optimizeCompare. llvm-svn: 350038
* [X86] Don't match TESTrr from (cmp (and X, Y), 0) during isel. Defer to post ↵Craig Topper2018-12-191-4/+7
| | | | | | | | | | | | processing The (cmp (and X, Y) 0) pattern is greedy and ends up forming a TESTrr and consuming the and when it might be better to use one of the BMI/TBM like BLSR or BLSI. This patch moves removes the pattern from isel and adds a post processing check to combine TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but its probably not perfect. Differential Revision: https://reviews.llvm.org/D55870 llvm-svn: 349661
* [x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEASanjay Patel2018-12-121-3/+3
| | | | | | | | | | This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. That allows us to combine add ops with register moves and save some instructions. This is another step towards allowing add truncation in generic DAGCombiner (see D54640). Differential Revision: https://reviews.llvm.org/D55494 llvm-svn: 348946
* [X86] Move ReadAfterLd functionality into X86FoldableSchedWrite (PR36957)Simon Pilgrim2018-10-051-29/+29
| | | | | | | | | | | | Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957). This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level. I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use, we can tweak these values in future patches once this infrastructure is in place. Differential Revision: https://reviews.llvm.org/D52886 llvm-svn: 343868
* [X86] Split WriteIMul into 8/16/32/64 implementations (PR36931)Simon Pilgrim2018-09-241-31/+31
| | | | | | | | Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases. This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions. llvm-svn: 342892
* [X86] Remove isel patterns for ADCX instructionCraig Topper2018-09-121-30/+7
| | | | | | | | | | | | There's no advantage to this instruction unless you need to avoid touching other flag bits. It's encoding is longer, it can't fold an immediate, it doesn't write all the flags. I don't think gcc will generate this instruction either. Fixes PR38852. Differential Revision: https://reviews.llvm.org/D51754 llvm-svn: 342059
* [X86] Mark the ADCX and ADOX instruction as commutable.Craig Topper2018-09-081-1/+1
| | | | llvm-svn: 341752
* [X86] Add commuted isel pattern for the load form of ADCX instructions.Craig Topper2018-09-081-0/+9
| | | | | | This prevents the legacy ADC instruction from being favored over ADCX when the load is in the operand 0. llvm-svn: 341745
* [X86] Add RMW ADC patterns with load in operand 1.Craig Topper2018-09-061-8/+22
| | | | | | | | ADC is commutable and the load could be in either operand, but we were only checking operand 0. Ideally we'd mark X86adc_flag as commutable and tablegen would automatically do this, but the EFLAGS register mention is preventing it. llvm-svn: 341606
* [X86] Add isel patterns for commuting X86adc_flag with a load in the LHS.Craig Topper2018-09-061-0/+10
| | | | | | | | The peephole pass likely gets this normally, but we should be doing it during isel. Ideally we'd just make the X86adc_flag pattern SDNPCommutable, but the tablegen doesn't handle that when one of the operands is a register reference. llvm-svn: 341596
* [X86] Remove AddedComplexity from register form of NOT. NFCICraig Topper2018-07-101-3/+0
| | | | | | | | I believe isProfitableToFold will stop the load folding that this was intended to overcome. Given an (xor load, -1), isProfitableToFold will see that the immediate can be folded with the xor using a one byte immediate since it can be sign extended. It doesn't know about NOT, but the one byte immediate check is enough to stop the fold. llvm-svn: 336712
* [X86] Split WriteADC/WriteADCRMW scheduler classesSimon Pilgrim2018-05-171-32/+33
| | | | | | For integer ALU instructions taking eflags as an input (ADC/SBB/ADCX/ADOX) llvm-svn: 332605
* [X86] Split off WriteIMul64 from WriteIMul schedule class (PR36931)Simon Pilgrim2018-05-081-44/+43
| | | | | | | This fixes a couple of BtVer2 missing instructions that weren't been handled in the override. NOTE: There are still a lot of overrides that still need cleaning up! llvm-svn: 331770
* [X86] Split WriteIDiv into div/idiv 8/16/32/64 implementations (PR36930)Simon Pilgrim2018-05-081-21/+16
| | | | | | | I've created the necessary classes but there are still a lot of overrides that need cleaning up. NOTE: The Znver1 model was missing some div/idiv variants in the instregex patterns and wasn't setting the resource cycles at all in the overrides. llvm-svn: 331767
* [X86] Remove remaining gpr schedule itineraries (PR37093)Simon Pilgrim2018-04-121-208/+156
| | | | llvm-svn: 329938
* [X86] Attempt to model basic arithmetic instructions in the ↵Craig Topper2018-04-061-13/+14
| | | | | | | | | | | | | | | | | | | | | Haswell/Broadwell/Skylake scheduler models without InstRWs Summary: This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency. Apparently we were inconsistent about whether the store has latency or not thus the test changes. I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5. Reviewers: RKSimon, andreadb Reviewed By: andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45351 llvm-svn: 329416
* [X86] Use loadi16/loadi32 predicates in multiply patternsCraig Topper2018-04-041-9/+9
| | | | llvm-svn: 329153
* [X86] Cleanup ADCX/ADOX instruction definitions.Craig Topper2018-04-011-29/+45
| | | | | | Give them both the same itineraries. Add hasSideEffects = 0 to ADOX since they don't have patterns. Rename source operands to $src1 and $src2 instead of $src0 and $src. Add ReadAfterLd to the memory form SchedRW. llvm-svn: 328952
* [x86] Correct the operand structure of the ADOX instruction.Chandler Carruth2018-04-011-19/+11
| | | | | | | | | | | | | | | This also moves to define it in the same way as ADCX which seems to use constraints a bit better. This is pulled out of the review for reducing the use of popf for restoring EFLAGS, but is independent. There are still more problems with our definitions for these instructions that Craig is going to look at but this is at least less broken and he can start from this to improve them more fully. Thanks to Craig for the review here. llvm-svn: 328945
* [X86] Fix the SchedRW for memory forms of CMP and TEST.Craig Topper2018-03-201-18/+21
| | | | | | | | They were incorrectly marked as RMW operations. Some of the CMP instrucions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW. TEST used the same tablegen class as some of the CMPs. llvm-svn: 327947
* [X86] Make the multiply and divide itineraries more consistent.Craig Topper2018-03-191-29/+32
| | | | | | | | | | Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage. We also used the MUL8 itinerary for MULX32/64 which was also weird. The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result. llvm-svn: 327866
* [X86] Change some compare patterns to use loadi8/loadi16/loadi32/loadi64 ↵Craig Topper2018-02-121-4/+5
| | | | | | | | helper fragments. This enables CMP8mi to fold zextloadi8i1 which in all tests allows us to avoid creating a TEST8rr that peephole can't fold. llvm-svn: 324863
* [X86] Artificially lower the complexity of the scalar ANDN patterns so that ↵Craig Topper2018-02-051-2/+3
| | | | | | | | | | | | AND with immediate will match first. This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate. This also allows us to match an and to a movzx after a not. This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT. llvm-svn: 324260
* [X86] Avoid using high register trick for test instructionAmaury Sechet2018-01-311-8/+0
| | | | | | | | | | | | | Summary: It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger. Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323888
* Revert "[X86] Avoid using high register trick for test instruction"Eric Liu2018-01-301-0/+8
| | | | | | This reverts commit r323690. This causes crash in llc. See the original commit thread for details. llvm-svn: 323761
* [X86] Avoid using high register trick for test instructionAmaury Sechet2018-01-291-8/+0
| | | | | | | | | | | | | | | Summary: It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger. The main noteworthy regression I was able to observe was pattern of the type (setcc (trunc (and X, C)), 0) where C is such as it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such pattern using a 32 bits test. This has the added benefit of working with any constant that is materializable as a 32bits immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 . Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323690
* [X86] Add 'Requires<[In64BitMode]>' to a bunch of instructions that only ↵Craig Topper2017-12-151-8/+28
| | | | | | | | have memory and immediate operands. The asm parser wasn't preventing these from being accepted in 32-bit mode. Instructions that use a GR64 register are protected by the parser rejecting the register in 32-bit mode. llvm-svn: 320846
* [X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMemCraig Topper2017-10-011-11/+10
| | | | | | | | | | | | | | | | | | | Summary: Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable. For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem. I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995. Reviewers: aymanmus, RKSimon, zvi Reviewed By: aymanmus Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38120 llvm-svn: 314639
* [X86] Remove unused tablegen class.Craig Topper2017-09-211-7/+0
| | | | llvm-svn: 313861
* [X86] Don't disable slow INC/DEC if optimizing for sizeCraig Topper2017-09-091-2/+2
| | | | | | | | | | | | | | | | | Summary: Just because INC/DEC is a little slow on some processors doesn't mean we shouldn't prefer it when optimizing for size. This appears to match gcc behavior. Reviewers: chandlerc, zvi, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37177 llvm-svn: 312866
* [X86] Qualify the RMW INC/DEC patterns with NotSlowIncDec.Craig Topper2017-08-261-2/+2
| | | | | | We were suppressing most uses of INC/DEC, but this one seems to have been missed. llvm-svn: 311828
* [X86] Adding FoldGenRegForm helper field (for memory folding tables tableGen ↵Ayman Musa2017-05-281-12/+12
| | | | | | | | | | | | | | | | | | | | | | backend) to X86Inst class and set its value for the relevant instructions. Some register-register instructions can be encoded in 2 different ways, this happens when 2 register operands can be folded (separately). For example if we look at the MOV8rr and MOV8rr_REV, both instructions perform exactly the same operation, but are encoded differently. Here is the relevant information about these instructions from Intel's 64-ia-32-architectures-software-developer-manual: Opcode Instruction Op/En 64-Bit Mode Compat/Leg Mode Description 8A /r MOV r8,r/m8 RM Valid Valid Move r/m8 to r8. 88 /r MOV r/m8,r8 MR Valid Valid Move r8 to r/m8. Here we can see that in order to enable the folding of the output and input registers, we had to define 2 "encodings", and as a result we got 2 move 8-bit register-register instructions. In the X86 backend, we define both of these instructions, usually one has a regular name (MOV8rr) while the other has "_REV" suffix (MOV8rr_REV), must be marked with isCodeGenOnly flag and is not emitted from CodeGen. Automatically generating the memory folding tables relies on matching encodings of instructions, but in these cases where we want to map both memory forms of the mov 8-bit (MOV8rm & MOV8mr) to MOV8rr (not to MOV8rr_REV) we have to somehow point from the MOV8rr_REV to the "regular" appropriate instruction which in this case is MOV8rr. This field enable this "pointing" mechanism - which is used in the TableGen backend for generating memory folding tables. Differential Revision: https://reviews.llvm.org/D32683 llvm-svn: 304087
* [X86] Add missing mayLoad/mayStore attributes to some X86 instructions ↵Ayman Musa2017-04-261-4/+6
| | | | | | | | | | (Continue) Complete the patch committed in rL300190. Differential Revision: https://reviews.llvm.org/D32287 llvm-svn: 301393
* [x86] Allow merging multiple instances of an immediate within a basic block ↵Sanjay Patel2016-08-161-1/+1
| | | | | | | | | | | | | | for code size savings, for 64-bit constants. This patch handles 64-bit constants which can be encoded as 32-bit immediates. It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants. Patch by Sunita Marathe! Differential Revision: https://reviews.llvm.org/D23391 llvm-svn: 278857
* [X86] Fix CMP and TEST with al/ax/eax/rax to not mark EFLAGS as a use or ↵Craig Topper2015-10-111-27/+34
| | | | | | al/ax/eax/rax as a def. Probably doesn't have a functional affect since these aren't used in isel. llvm-svn: 249994
* [X86] Allow merging of immediates within a basic block for code size savingsMichael Kuperstein2015-08-111-4/+4
| | | | | | | | | | | First step in preventing immediates that occur more than once within a single basic block from being pulled into their users, in order to prevent unnecessary large instruction encoding .Currently enabled only when optimizing for size. Patch by: zia.ansari@intel.com Differential Revision: http://reviews.llvm.org/D11363 llvm-svn: 244601
* Fix the operand encoding in the test instruction.Rafael Espindola2015-03-311-4/+4
| | | | | | Fixes pr22995. llvm-svn: 233686
* [X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the ↵Craig Topper2015-01-081-2/+2
| | | | | | LEA variants in Intel syntax. The memory operand is inherently unsized. llvm-svn: 225432
* [X86] Make isel select the 2-byte register form of INC/DEC even in ↵Craig Topper2015-01-061-72/+42
| | | | | | | | non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering. Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness. llvm-svn: 225252
* [X86] Remove the predicates from the register forms of the 2-byte inc and ↵Craig Topper2015-01-051-43/+22
| | | | | | dec instructions. Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates. llvm-svn: 225157
OpenPOWER on IntegriCloud