summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFCMatthias Braun2016-05-033-3/+3
| | | | | | | The block must no be nullptr for the addLiveIns()/addLiveOuts() function. llvm-svn: 268340
* LivePhysRegs: Automatically determine presence of pristine regs.Matthias Braun2016-05-032-2/+2
| | | | | | | | | | | | | | | | | | | | | | Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts(). We need to respect pristine registers after prologue epilogue insertion, Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it. There are three cases that did not set AddPristineAndCSRs to true even after register allocation: - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers so use the new addLiveOutsNoPristines() to maintain this behaviour. - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug, should do the right thing automatically now. - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes. llvm-svn: 268336
* [X86] Model FAULTING_LOAD_OP as a terminator and branch.Quentin Colombet2016-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have done wrong transformation! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2 bytes offset in implicit-null-check.ll comes from the fact the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequence commit. llvm-svn: 268327
* [X86][SSE] Added placeholder for 128/256-bit wide shuffle combinesSimon Pilgrim2016-05-021-6/+14
| | | | | | Begun adding placeholder for future support for vperm2f128/vshuff64x2 style 128/256-bit wide shuffles llvm-svn: 268306
* [X86][SSE] Dropped X86ISD::FGETSIGNx86 and use MOVMSK instead for FGETSIGN ↵Simon Pilgrim2016-05-024-37/+12
| | | | | | | | lowering movmsk.ll tests are unchanged. llvm-svn: 268237
* Enable the X86 call frame optimization for the 64-bit targets that allow it.David L Kreitzer2016-05-022-16/+36
| | | | | | | | Fixes PR27241. Differential Revision: http://reviews.llvm.org/D19688 llvm-svn: 268227
* [X86] Fix a bug in LOCK arithmetic operation pattern matching where the ↵Craig Topper2016-05-021-1/+1
| | | | | | | | wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates. This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first. llvm-svn: 268212
* [AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While ↵Craig Topper2016-05-011-4/+4
| | | | | | there fix the execution domain for VPACKSSDW/VPACKUSDW. llvm-svn: 268200
* Change AVX512 braodcastsd/ss patterns interaction with spilling . New ↵Igor Breger2016-05-013-110/+98
| | | | | | | | implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190
* [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when ↵Craig Topper2016-05-011-3/+3
| | | | | | VLX and BWI are supported. llvm-svn: 268189
* [AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB ↵Craig Topper2016-05-011-13/+14
| | | | | | and VPMADDUBSW/VPMADDWD. llvm-svn: 268188
* [AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also ↵Craig Topper2016-05-011-16/+16
| | | | | | marked as requiring VLX. llvm-svn: 268186
* [X86] Add an AddedComplexity to another pattern to put it near similar in ↵Craig Topper2016-05-011-2/+1
| | | | | | the output file. llvm-svn: 268184
* [X86] Remove a seemlingly unused pattern. The same pattern appears elsewhere ↵Craig Topper2016-05-011-2/+0
| | | | | | with an AddedComplexity that made this unreachable. llvm-svn: 268183
* [X86] Add AddedComplexity to keep some similar patterns near each other in ↵Craig Topper2016-05-011-0/+1
| | | | | | the output file. llvm-svn: 268181
* [X86] Remove some redundant selection patterns.Craig Topper2016-05-012-11/+0
| | | | llvm-svn: 268180
* [AVX512] Replace vector_extract with extractelt in some patterns. They mean ↵Craig Topper2016-05-011-5/+5
| | | | | | the same thing but vector_extract is deprecated. NFC llvm-svn: 268179
* [AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.Craig Topper2016-05-011-4/+7
| | | | llvm-svn: 268174
* [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.Craig Topper2016-04-302-13/+9
| | | | llvm-svn: 268164
* Differential Revision: http://reviews.llvm.org/D19733Sriraman Tallam2016-04-292-3/+2
| | | | llvm-svn: 268106
* Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to ↵Filipe Cabecinhas2016-04-291-2/+2
| | | | | | | | | | | | | | | | | | | the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050
* [X86] Remove unnecessary header file containing a small class. It was only ↵Craig Topper2016-04-292-115/+84
| | | | | | included in one place. Just define the class directly in the cpp file. NFC llvm-svn: 267985
* [X86] Include X86MCTargetDesc.h directly in X86Disassembler.cpp instead of ↵Craig Topper2016-04-291-9/+1
| | | | | | duplicating parts of it. NFC llvm-svn: 267984
* [X86] Use nested switches to vary the operand to helper functions that were ↵Craig Topper2016-04-291-43/+74
| | | | | | previously called in multiple cases. This seems to help the inliner reduce code. NFC llvm-svn: 267964
* [X86] Remove unused operand from a function and all its callers. NFCCraig Topper2016-04-285-10/+8
| | | | llvm-svn: 267854
* [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in ↵Craig Topper2016-04-281-13/+6
| | | | | | TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853
* [X86] Enable the post-RA-scheduler for clang's default 32-bit cpu.Mitch Bodart2016-04-272-12/+36
| | | | | | | | | For compilations with no explicit cpu specified, this exhibits nice gains on Silvermont, with neutral performance on big cores. Differential Revision: http://reviews.llvm.org/D19138 llvm-svn: 267809
* [X86][FastISel] Make sure we use the right register class when we select stores.Quentin Colombet2016-04-271-1/+9
| | | | llvm-svn: 267806
* [X86] Fix the lowering of TLS calls.Quentin Colombet2016-04-272-6/+9
| | | | | | | | | | | The callseq_end node must be glued with the TLS calls, otherwise, the generic code will miss the uses of the returned value and will mark it dead. Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI, the pseudo uses the symbol address at this point not RDI and the lowering will do the right thing. llvm-svn: 267797
* [X86]: Quit promoting 16 bit loads to 32 bit.Kevin B. Smith2016-04-271-17/+0
| | | | | | Differential Revision: http://reviews.llvm.org/D19592 llvm-svn: 267773
* Revert r267649, it caused PR27539.Nico Weber2016-04-271-136/+0
| | | | llvm-svn: 267723
* [X86] Set AddPristinesAndCSRs to FixupBW LivePhysRegs. NFC.Ahmed Bougacha2016-04-271-1/+2
| | | | | | | | | | | | | We run after PEI, so we need to AddPristinesAndCSRs. In practice, that makes no difference here, because we only ask about liveness of super-registers of defined GR8/GR16 registers, so they can't be pristine. Still, it's the correct thing to do. Thanks to Quentin for noticing! Follow-up to r267495. llvm-svn: 267658
* [X86] Don't assume that MMX extractelts are from index 0.Ahmed Bougacha2016-04-271-1/+3
| | | | | | | It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652
* [X86] Re-enable MMX i32 extractelt combine.Ahmed Bougacha2016-04-271-10/+3
| | | | | | | | | This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651
* Detects the SAD pattern on X86 so that much better code will be emitted once ↵Cong Hou2016-04-271-0/+136
| | | | | | | | the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649
* [X86] Make sure it is safe to clobber EFLAGS, if need be, when choosingQuentin Colombet2016-04-262-0/+16
| | | | | | | | | | | | | | | the prologue. Do not use basic blocks that have EFLAGS live-in as prologue if we need to realign the stack. Realigning the stack uses AND instruction and this clobbers EFLAGS. An other alternative would have been to save and restore EFLAGS around the stack realignment code, but this is likely inefficient. Fixes PR27531. llvm-svn: 267634
* [X86] Teach the expansion of copy instructions how to do proper liveness.Quentin Colombet2016-04-261-15/+22
| | | | | | | When the simple analysis provided by MachineBasicBlock::computeRegisterLiveness fails, fall back on the LivePhysReg utility. llvm-svn: 267623
* Optimization bisect support in X86-specific passesAndrew Kaylor2016-04-265-3/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D19439 llvm-svn: 267608
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.Ahmed Bougacha2016-04-261-56/+47
| | | | | | Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606
* Swift Calling Convention: use %RAX for sret.Manman Ren2016-04-263-1/+16
| | | | | | | We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579
* [X86] PR27502: Fix the LEA optimization pass.Andrey Turetskiy2016-04-261-2/+6
| | | | | | | | Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass. Differential Revision: http://reviews.llvm.org/D19409 llvm-svn: 267551
* [X86] Use LivePhysRegs in X86FixupBWInsts.Ahmed Bougacha2016-04-261-13/+19
| | | | | | | | | Kill-flags, which computeRegisterLiveness uses, are not reliable. LivePhysRegs is. Differential Revision: http://reviews.llvm.org/D19472 llvm-svn: 267495
* [X86] Replace a SmallVector used to pass 2 values to an ArrayRef parameter ↵Craig Topper2016-04-251-3/+1
| | | | | | with a fixed size array. NFC llvm-svn: 267377
* [X86][SSE] getTargetShuffleMaskIndices - dropped (unused) UNDEF handlingSimon Pilgrim2016-04-241-5/+0
| | | | | | We aren't currently making use of this in any successful mask decode and its actually incorrect as it inserts the wrong number of SM_SentinelUndef mask elements. llvm-svn: 267350
* [X86][SSE] Use range loop. NFCI.Simon Pilgrim2016-04-241-3/+2
| | | | llvm-svn: 267349
* [X86][XOP] Fixed VPPERM permute op decoding (PR27472).Simon Pilgrim2016-04-241-1/+1
| | | | | | Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask. llvm-svn: 267346
* [X86][SSE] Improved support for decoding target shuffle masks through bitcastsSimon Pilgrim2016-04-241-20/+26
| | | | | | | | Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transfered to xmm. This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472. llvm-svn: 267343
* [X86] Merge LowerCTLZ and LowerCTLZ_ZERO_UNDEF into a single function that ↵Craig Topper2016-04-241-38/+16
| | | | | | branches internally for the one difference, allowing the rest of the code to be common. NFC llvm-svn: 267331
* [X86] Node need to check if AVX512 is supported when lowering vector CTLZ. ↵Craig Topper2016-04-241-7/+5
| | | | | | The CTLZ operation is only Custom for vectors if AVX512 is enabled so if a vector gets here AVX512 is implied. NFC llvm-svn: 267330
* [X86] Remove isel patterns for selecting tzcnt/lzcnt from ↵Craig Topper2016-04-241-80/+0
| | | | | | cmove/ne+cttz/ctlz. These are folded by DAG combine now. llvm-svn: 267326
OpenPOWER on IntegriCloud