path: root/llvm/lib/Target/X86/X86InstrInfo.cpp
Commit message | Author | Age | Files | Lines
...
* MachineFunction: Return reference for getFrameInfo(); NFC | Matthias Braun | 2016-07-28 | 1 | -4/+4
  getFrameInfo() never returns nullptr so we should use a reference instead of a pointer.
  llvm-svn: 277017
* [AVX512] Add load folding support for the unmasked forms of the FMA instructions. | Craig Topper | 2016-07-25 | 1 | -0/+144
  llvm-svn: 276615
* [X86] Make the FMA3 instruction names consistent between VEX and EVEX encoded versions. | Craig Topper | 2016-07-24 | 1 | -243/+215
  This places the 132/213/231 form number in front of the SS/SD/PS/PD. Move the Y for 256-bit versions to be after the PS/PD. Change the AVX512 scalar forms to include a Z in their name. This new format should be consistent with the general naming of instructions.
  llvm-svn: 276559
* [X86] Replace CodeGenOnly VPSRAVW/D/Q_Int instructions with patterns since the operand types exactly match the normal VPSRAVW/D/Q instructions. | Craig Topper | 2016-07-24 | 1 | -2/+0
  llvm-svn: 276555
* [X86] Fix typo in comment. | Craig Topper | 2016-07-23 | 1 | -1/+1
  llvm-svn: 276528
* [AVX512] Implement commuting support for EVEX encoded FMA3 instructions. | Craig Topper | 2016-07-23 | 1 | -218/+186
  llvm-svn: 276521
* [X86] Make one of the FMA3 commuting methods static. | Craig Topper | 2016-07-23 | 1 | -198/+210
  Remove a call to isFMA3 just to get the IsIntrisic flag, instead get it during the first call and pass it along. NFC
  llvm-svn: 276520
* [X86] Fix switch statement indentation per coding standards. | Craig Topper | 2016-07-23 | 1 | -136/+136
  llvm-svn: 276519
* [AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions. | Craig Topper | 2016-07-22 | 1 | -1/+56
  llvm-svn: 276393
* [AVX512] Add load folding for some AVX512VL logic and arithmetic instructions. | Craig Topper | 2016-07-22 | 1 | -0/+36
  llvm-svn: 276391
* [AVX512] Update X86InstrInfo::foldMemoryOperandCustom to handle the EVEX encoded instructions too. | Craig Topper | 2016-07-22 | 1 | -4/+8
  llvm-svn: 276390
* X86InstrInfo: No need for liveness analysis in classifyLEAReg() | Matthias Braun | 2016-07-21 | 1 | -18/+2
  classifyLEAReg() deals with switching operands from 32bit to 64bit in order to use a LEA64_32 instruction (for three address code goodness). It currently performs a liveness analysis to determine the kill/undef flag for the newly added operand. This should not be necessary:
  - If the previous operand had a kill flag, then the 32bit part of the register gets killed, this will kill the super register as well.
  - If the previous operand had an undef flag then we didn't care what value we read, just use the same flag on the new operand. (No matter what an operand with an undef flag won't affect liveness)
  This makes the code independent of the presence of kill flags because it avoids a call to MachineBasicBlock::computeRegisterLiveness().
  Differential Revision: http://reviews.llvm.org/D22283
  llvm-svn: 276222
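The flag argument in the classifyLEAReg() entry above can be illustrated with a small model. This is only a sketch under stated assumptions: `OperandFlags` and `flagsForWidenedOperand` are hypothetical names invented here, not LLVM's `MachineOperand` API, and the real change lives inside X86InstrInfo.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in for the kill/undef flags on a register operand.
// This is NOT LLVM's MachineOperand; it only models the two flags discussed.
struct OperandFlags {
  bool IsKill = false;
  bool IsUndef = false;
};

// Flags for the 64-bit super-register operand that replaces a 32-bit source
// operand. Both flags are copied unchanged, so no liveness query is needed:
//  - kill: killing the 32-bit part also ends the super register's live range;
//  - undef: the value read did not matter before and still does not.
inline OperandFlags flagsForWidenedOperand(const OperandFlags &Src32) {
  OperandFlags Wide;
  Wide.IsKill = Src32.IsKill;
  Wide.IsUndef = Src32.IsUndef;
  return Wide;
}
```

The function takes nothing but the old operand's flags, which is the property the commit message relies on: the result does not depend on any liveness information from the surrounding block.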
* [AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables. | Craig Topper | 2016-07-18 | 1 | -10/+28
  llvm-svn: 275775
* [AVX512] Add KADD/KAND/KOR/KXOR to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+16
  llvm-svn: 275771
* [X86] Add VPMULLW/D/Q instructions to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+13
  llvm-svn: 275770
* [X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+24
  llvm-svn: 275769
* [X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+36
  llvm-svn: 275768
* [X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+50
  llvm-svn: 275767
* [X86] Add more AVX512 instructions to X86InstrInfo::isHighLatencyDef. | Craig Topper | 2016-07-18 | 1 | -14/+247
  Also add all packed fp division instructions.
  llvm-svn: 275766
* [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr. | Craig Topper | 2016-07-18 | 1 | -0/+80
  llvm-svn: 275765
* [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. | Craig Topper | 2016-07-18 | 1 | -0/+80
  Mainly AVX-512 related.
  llvm-svn: 275764
* [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. | Craig Topper | 2016-07-18 | 1 | -6/+15
  Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this is a good first step.
  llvm-svn: 275763
* [X86] Fix 80-column violations. NFC | Craig Topper | 2016-07-18 | 1 | -8/+16
  llvm-svn: 275762
* [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand. | Justin Lebar | 2016-07-15 | 1 | -2/+2
  Summary: Previously we took an unsigned. Hooray for type-safety.
  Reviewers: chandlerc
  Subscribers: dsanders, llvm-commits
  Differential Revision: http://reviews.llvm.org/D22282
  llvm-svn: 275591
* Rename AnalyzeBranch* to analyzeBranch*. | Jacques Pienaar | 2016-07-15 | 1 | -2/+2
  Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.
  Reviewers: tstellarAMD, mcrosier
  Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai
  Differential Revision: https://reviews.llvm.org/D22409
  llvm-svn: 275564
* XRay: Add entry and exit sleds | Dean Michael Berris | 2016-07-14 | 1 | -1/+2
  Summary: In this patch we implement the following parts of XRay:
  - Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
  - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
  - X86-specific nop sleds as described in the white paper.
  - A machine function pass that adds the different instrumentation marker instructions at a very late stage.
  - A way of identifying which return opcode is considered "normal" for each architecture.
  There are some caveats here:
  1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time, so that by default it is unpacked for platforms where XRay is not available yet.
  2) The generated section for X86 is different from what is described in the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.
  Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk
  Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19904
  llvm-svn: 275367
* X86: Avoid implicit iterator conversions, NFC | Duncan P. N. Exon Smith | 2016-07-12 | 1 | -21/+22
  Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr*, mainly by preferring MachineInstr& over MachineInstr* and using range-based for loops.
  llvm-svn: 275149
* [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors. | Craig Topper | 2016-07-11 | 1 | -2/+14
  llvm-svn: 275045
* [X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are marked for CanFoldAsLoad. | Craig Topper | 2016-07-11 | 1 | -1/+12
  I don't really know how to test this.
  llvm-svn: 275044
* CodeGen: Use MachineInstr& in LiveVariables API, NFC | Duncan P. N. Exon Smith | 2016-07-01 | 1 | -7/+7
  Change all the methods in LiveVariables that expect non-null MachineInstr* to take MachineInstr& and update the call sites. This clarifies the API, and designs away a class of iterator to pointer implicit conversions.
  llvm-svn: 274319
* CodeGen: Use MachineInstr& in TargetInstrInfo, NFC | Duncan P. N. Exon Smith | 2016-06-30 | 1 | -503/+488
  This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement.
  Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary.
  This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader.
  As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753.
  Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on.
  llvm-svn: 274189
* Drop support for creating $stubs. | Rafael Espindola | 2016-06-29 | 1 | -1/+0
  They are created by ld64 since OS X 10.5.
  llvm-svn: 274130
* Relax the clearance calculating for breaking partial register dependency. | Dehao Chen | 2016-06-28 | 1 | -6/+16
  Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, a clearance of 16 is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert the XORs that are necessary to break partial register dependency.
  Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel
  Subscribers: davidxl, llvm-commits
  Differential Revision: http://reviews.llvm.org/D21560
  llvm-svn: 274068
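The threshold decision described in the entry above can be sketched as a single comparison. This is a simplified model, not the code from the commit: the function name is hypothetical, and "clearance" is assumed here to mean the number of instructions since the register was last written. The commit message only states that 16 proved too small; any larger value used with this sketch is purely illustrative.

```cpp
#include <cassert>

// Hypothetical helper: decide whether to insert a dependency-breaking XOR
// before an instruction that only partially updates a register.
// Clearance: instructions executed since the register was last written
//            (an assumption of this sketch).
// Threshold: the clearance below which the stale dependency is considered
//            too recent to be hidden, so the XOR is inserted.
inline bool shouldInsertDependencyBreakingXor(unsigned Clearance,
                                              unsigned Threshold) {
  return Clearance < Threshold;
}
```

With a clearance of 20, a threshold of 16 leaves the dependency in place, while an illustrative threshold of 64 inserts the XOR. Raising the threshold is what makes the insertion more aggressive.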
* Convert a few more comparisons to isPositionIndependent(). NFC. | Rafael Espindola | 2016-06-27 | 1 | -2/+2
  llvm-svn: 273945
* [AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering. | Igor Breger | 2016-06-20 | 1 | -0/+2
  Differential Revision: http://reviews.llvm.org/D20897
  llvm-svn: 273138
* Run clang-tidy's performance-unnecessary-copy-initialization over LLVM. | Benjamin Kramer | 2016-06-12 | 1 | -1/+1
  No functionality change intended.
  llvm-svn: 272516
* Pass DebugLoc and SDLoc by const ref. | Benjamin Kramer | 2016-06-12 | 1 | -16/+18
  This used to be free; copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended.
  llvm-svn: 272512
* [X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. | Craig Topper | 2016-06-09 | 1 | -3/+3
  Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions.
  llvm-svn: 272249
* Simplify handling of hidden stub. | Rafael Espindola | 2016-05-17 | 1 | -1/+0
  Since r207518 they are printed exactly like non-hidden stubs on x86 and since r207517 on ARM. This means we can use a single set for all stubs in those platforms.
  llvm-svn: 269776
* Fix for PR27750. Correctly handle the case where the fallthrough block and | David L Kreitzer | 2016-05-17 | 1 | -5/+9
  target block are the same in getFallThroughMBB.
  Differential Revision: http://reviews.llvm.org/D20288
  llvm-svn: 269760
* [X86] Properly check that EAX is dead when copying EFLAGS. | Quentin Colombet | 2016-05-10 | 1 | -4/+9
  This fixes a bug introduced in r267623, where we got smarter and avoided saving EAX before using it. However, we failed to check if any of the subregisters of EAX were alive and thus missed cases where we have to save EAX before using it. The problem may happen on every X86/i386/... platform.
  This fixes llvm.org/PR27624
  llvm-svn: 269115
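The kind of bug described in the EAX entry above can be modeled with a set of live register names. This is a minimal sketch under stated assumptions: the string-keyed `LiveSet` and both function names are hypothetical and are not how LLVM tracks liveness. Only the fact that AX, AH and AL are sub-registers of EAX comes from the x86 architecture.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <string>

// Hypothetical liveness model: the set of register names live at a point.
using LiveSet = std::set<std::string>;

// Incomplete check: looks only at the full register, so a live AL or AH
// goes unnoticed.
inline bool isEaxRegisterDeadNaive(const LiveSet &Live) {
  return Live.count("EAX") == 0;
}

// Stricter check: EAX may be clobbered without saving only if the register
// and every one of its sub-registers is dead.
inline bool isEaxFullyDead(const LiveSet &Live) {
  static const std::array<const char *, 4> EaxAndSubRegs = {
      {"EAX", "AX", "AH", "AL"}};
  for (const char *Reg : EaxAndSubRegs)
    if (Live.count(Reg) != 0)
      return false;
  return true;
}
```

When only AL is live, the naive check reports EAX as dead while the stricter check does not. That matches the case the commit message says was missed, where EAX still has to be saved before it is used.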
* [foldMemoryOperand()] Pass LiveIntervals to enable liveness check. | Jonas Paulsson | 2016-05-10 | 1 | -3/+5
  SystemZ (and probably other targets as well) can fold a memory operand by changing the opcode into a new instruction that as a side-effect also clobbers the CC-reg.
  In order to do this, liveness of that reg must first be checked. When LIS is passed, getRegUnit() can be called on it and the right LiveRange is computed on demand.
  Reviewed by Matthias Braun.
  http://reviews.llvm.org/D19861
  llvm-svn: 269026
* [X86][AVX512] Strengthen the assertions from r269001. | Craig Topper | 2016-05-10 | 1 | -2/+3
  We need VLX to use the 128/256-bit move opcodes for extended registers.
  llvm-svn: 269019
* [X86][AVX512] Use the proper load/store for AVX512 registers. | Quentin Colombet | 2016-05-10 | 1 | -8/+20
  When loading or storing AVX512 registers we were not using the AVX512 variant of the load and store for VR128 and VR256 like registers. Thus, we ended up with the wrong encoding and actually were dropping the high bits of the instruction. The result was that we load or store the wrong register.
  The effect is visible only when we emit the object file directly and disassemble it. Then, the output of the disassembler does not match the assembly input.
  This is related to llvm.org/PR27481.
  llvm-svn: 269001
* [AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used. | Craig Topper | 2016-05-08 | 1 | -0/+4
  llvm-svn: 268884
* livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC | Matthias Braun | 2016-05-03 | 1 | -1/+1
  The block must not be nullptr for the addLiveIns()/addLiveOuts() function.
  llvm-svn: 268340
* LivePhysRegs: Automatically determine presence of pristine regs. | Matthias Braun | 2016-05-03 | 1 | -1/+1
  Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts().
  We need to respect pristine registers after prologue epilogue insertion. Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it.
  There are three cases that did not set AddPristineAndCSRs to true even after register allocation:
  - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers so use the new addLiveOutsNoPristines() to maintain this behaviour.
  - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug, should do the right thing automatically now.
  - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes.
  llvm-svn: 268336
* Enable the X86 call frame optimization for the 64-bit targets that allow it. | David L Kreitzer | 2016-05-02 | 1 | -0/+6
  Fixes PR27241.
  Differential Revision: http://reviews.llvm.org/D19688
  llvm-svn: 268227
* Change AVX512 broadcastsd/ss patterns interaction with spilling. | Igor Breger | 2016-05-01 | 1 | -41/+45
  The new implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (turning it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
  Differential Revision: http://reviews.llvm.org/D19579
  llvm-svn: 268190
* [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps. | Craig Topper | 2016-04-30 | 1 | -10/+6
  llvm-svn: 268164