path: root/llvm/lib/Target/X86
Commit message (Author, Age, Files, Lines)
...
* Set the floating point status register as reserved (Pengfei Wang, 2019-11-03, 1 file, -0/+3)
| Summary: This patch sets the FPSW (X87 floating-point status register) as a reserved physical register and fixes the test failure caused by D68854 (https://reviews.llvm.org/D68854).
| Before this patch, some tests fail because they implicitly use FPSW without defining it. Setting the FPSW as a reserved physical register skips liveness analysis because it is always live.
| Reviewers: pengfei, craig.topper
| Reviewed By: craig.topper
| Subscribers: craig.topper, hiraditya, llvm-commits
| Patch by LiuChen.
| Differential Revision: https://reviews.llvm.org/D69784
* [X86][SSE] combineX86ShufflesRecursively - at Depth==0, only resolve KnownZero if it removes an input. (Simon Pilgrim, 2019-11-03, 1 file, -6/+31)
| This stops infinite loops where KnownUndef elements are converted to Zeroable, resulting in KnownZero elements which are then simplified (via SimplifyDemandedElts etc.) back to KnownUndef elements.
| Prep fix for PR43024 which will allow rL368307 to be re-applied.
* [X86][SSE] combineX86ShufflesRecursively - don't bother merging shuffles with empty roots. NFCI. (Simon Pilgrim, 2019-11-03, 1 file, -92/+105)
| This doesn't affect actual codegen, but is a minor refactor toward fixing PR43024 where we need to avoid excess changes (folding zeroables etc.) to the shuffle mask at Depth == 0.
* [X86] Convert PICStyles::Style to scoped enum class. NFCI. (Simon Pilgrim, 2019-11-03, 2 files, -11/+11)
| Fixes MSVC static analyzer warnings about enum safety; this enum performs no integer math, so it is better to fix its scope.
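As a rough illustration of why a scoped enum addresses that kind of warning, the C++ sketch below contrasts an unscoped and a scoped enumeration. The namespaces, enumerator names and the helper `isPICStyleGOT` are hypothetical stand-ins chosen for this example, not a copy of the LLVM declarations.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration only; not the LLVM declarations.
namespace Unscoped {
enum Style { StubPIC, GOT, RIPRel, None };
} // namespace Unscoped

namespace Scoped {
enum class Style { StubPIC, GOT, RIPRel, None };
} // namespace Scoped

// An unscoped enumerator converts to int implicitly, so accidental
// integer math or comparisons against plain integers still compile.
static_assert(std::is_convertible<Unscoped::Style, int>::value,
              "unscoped enums convert to int implicitly");

// A scoped enumerator has no implicit conversion to int.
static_assert(!std::is_convertible<Scoped::Style, int>::value,
              "scoped enums do not convert to int implicitly");

// Callers have to name the scope and compare against the same enum type.
inline bool isPICStyleGOT(Scoped::Style S) { return S == Scoped::Style::GOT; }
```

With the scoped form, an expression such as `Scoped::Style::GOT + 1` fails to compile unless the caller writes an explicit cast, which is the safety property the analyzer warning is about.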
* Fix uninitialized variable warning. NFCI. (Simon Pilgrim, 2019-11-03, 1 file, -1/+1)
|
* isImmPCRel/isImmSigned - both functions should return bool not unsigned. NFCI. (Simon Pilgrim, 2019-11-02, 1 file, -2/+2)
|
* X86_MC::createX86MCSubtargetInfo - X86_MC::ParseX86Triple never returns an empty string. NFCI. (Simon Pilgrim, 2019-11-02, 1 file, -6/+3)
| PVS Studio was complaining that the expression '!ArchFS.empty()' is always true.
* X86Operand::print - fix SymName shadow variable warning. NFCI. (Simon Pilgrim, 2019-11-02, 1 file, -2/+2)
|
* X86AsmPrinter - fix uninitialized variable warnings. NFCI. (Simon Pilgrim, 2019-11-02, 1 file, -2/+2)
|
* [X86] Move computeZeroableShuffleElements before getTargetShuffleAndZeroables. NFCI. (Simon Pilgrim, 2019-11-02, 1 file, -87/+87)
| Prep work toward merging some of the functionality.
* [X86] Remove FeatureSSE3 from the implies list of HasFastHorizontalOps. (Craig Topper, 2019-11-01, 1 file, -1/+1)
| HasFastHorizontalOps is a tuning flag. It shouldn't imply an ISA flag.
* [X86] Model MXCSR for MMX FP instructions (Pengfei Wang, 2019-11-01, 1 file, -5/+5)
| Summary: This patch models MXCSR and adds the "mayRaiseFPException" flag for MMX FP instructions.
| Reviewers: craig.topper, andrew.w.kaylor, RKSimon, cameron.mcinally
| Reviewed By: craig.topper
| Subscribers: hiraditya, llvm-commits, LiuChen3
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D69702
* [X86] add mayRaiseFPException flag and FPCW registers for X87 instructions (Pengfei Wang, 2019-11-01, 2 files, -25/+46)
| Summary: This patch adds the "mayRaiseFPException" flag, FPCW and FPSW to X87 instructions that can raise floating-point exceptions.
| Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper
| Reviewed By: craig.topper
| Subscribers: thakis, hiraditya, llvm-commits
| Patch by LiuChen.
| Differential Revision: https://reviews.llvm.org/D68854
* [X86] Change the behavior of canWidenShuffleElements used by lowerV2X128Shuffle to match the behavior in lowerVectorShuffle with regards to zeroable elements. (Craig Topper, 2019-11-01, 1 file, -19/+14)
| Previously we marked zeroable elements in a way that prevented the widening check from recognizing that it could widen. Now we only mark them zeroable if V2 is an all zeros vector. This matches what we do for widening elements in lowerVectorShuffle.
| Fixes PR43866.
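For readers unfamiliar with the widening check mentioned above, here is a simplified C++ sketch of the general idea: a shuffle mask can be rewritten over elements twice as wide when every adjacent pair of entries selects an aligned pair from the source. The sentinel values `Undef` and `Zero` and the function `widenShuffleMask` are assumptions made for this example; the real canWidenShuffleElements has more cases, and the sketch does not reproduce the change made in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinels for this sketch.
constexpr int Undef = -1; // element value does not matter
constexpr int Zero = -2;  // element is known to be zero

// Tries to rewrite Mask over elements twice as wide. Returns true and fills
// Widened on success; returns false if any pair cannot be merged.
inline bool widenShuffleMask(const std::vector<int> &Mask,
                             std::vector<int> &Widened) {
  Widened.clear();
  if (Mask.size() % 2 != 0)
    return false;
  for (std::size_t I = 0; I < Mask.size(); I += 2) {
    int Lo = Mask[I], Hi = Mask[I + 1];
    bool LoSentinel = Lo == Undef || Lo == Zero;
    bool HiSentinel = Hi == Undef || Hi == Zero;
    if (Lo == Undef && Hi == Undef) {
      Widened.push_back(Undef); // nothing is demanded from this pair
    } else if (LoSentinel && HiSentinel) {
      Widened.push_back(Zero); // at least one half is zero, the other is free
    } else if (Lo >= 0 && Lo % 2 == 0 && (Hi == Undef || Hi == Lo + 1)) {
      Widened.push_back(Lo / 2); // aligned pair, low half decides
    } else if (Lo == Undef && Hi >= 0 && Hi % 2 == 1) {
      Widened.push_back(Hi / 2); // aligned pair, only high half is known
    } else {
      return false; // misaligned pair, or zero mixed with a real element
    }
  }
  return true;
}
```

For example, the mask {0, 1, 6, 7} widens to {0, 3}, while {0, 2, 4, 6} cannot be widened. A pair that mixes a zero entry with a real element also fails in this sketch, which gives a feel for how marking too many elements as zeroable can block widening.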
* [X86][AVX] Add support for and/or scalar bool reduction with AVX512 mask registers (Simon Pilgrim, 2019-11-01, 1 file, -0/+6)
| combineBitcastvxi1 only handles bitcast->MOVMSK combines; with mask registers we use BITCAST directly.
* [X86] isFNEG - use switch() instead of if-else tree. NFCI. (Simon Pilgrim, 2019-11-01, 1 file, -33/+36)
| In a future patch this will avoid some checks which don't need to be done for some opcodes.
* [X86] Reland: Enable YMM memcmp with AVX1 (David Zarzycki, 2019-11-01, 1 file, -3/+2)
| Update TargetTransformInfo to allow AVX1 to use YMM registers for memcmp. This is a follow up to D68632 which enabled XOR compares which made this possible.
| This also updates the memcmp-optsize.ll test unlike the first patch.
| https://reviews.llvm.org/D69658
* Revert "[X86] add mayRaiseFPException flag and FPCW registers for X87 instructions" (Nico Weber, 2019-10-31, 2 files, -46/+25)
| This reverts commit a678677da498a45f59c16ee74fea438e34a801ce.
| It broke CodeGen/ms-inline-asm.c on most bots.
* [X86] add mayRaiseFPException flag and FPCW registers for X87 instructions (Craig Topper, 2019-10-31, 2 files, -25/+46)
| This patch adds the "mayRaiseFPException" flag, FPCW and FPSW to X87 instructions that can raise floating-point exceptions.
| Patch by LiuChen. With a couple small fixes from me.
| Differential Revision: https://reviews.llvm.org/D68854
* [X86] Remove FSIN/FCOS isel patterns and the pseudo instructions that they selected for the FP stackifier. (Craig Topper, 2019-10-31, 3 files, -13/+3)
| We always expand these to libcalls so get rid of the last vestiges of using the instructions.
* Revert rG0e252ae19ff8d99a59d64442c38eeafa5825d441 : [X86] Enable YMM memcmp with AVX1 (Simon Pilgrim, 2019-10-31, 1 file, -2/+3)
| Breaks build bots
| Differential Revision: https://reviews.llvm.org/D69658
* [X86] Enable YMM memcmp with AVX1 (David Zarzycki, 2019-10-31, 1 file, -3/+2)
| Update TargetTransformInfo to allow AVX1 to use YMM registers for memcmp. This is a follow up to D68632 which enabled XOR compares which made this possible.
| https://reviews.llvm.org/D69658
* Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC (Simon Pilgrim, 2019-10-31, 2 files, -9/+13)
| This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375
* [TII] Use optional destination and source pair as a return value; NFC (Djordje Todorovic, 2019-10-31, 2 files, -13/+9)
| Refactor usage of the isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return an optional machine operand pair of destination and source registers.
| Patch by Nikola Prica
| Differential Revision: https://reviews.llvm.org/D69622
* [X86][SSE] Convert computeZeroableShuffleElements to emit KnownUndef and KnownZero (Simon Pilgrim, 2019-10-31, 1 file, -23/+35)
|
* [cfi] Add flag to always generate .debug_frame (David Candler, 2019-10-31, 2 files, -12/+6)
| This adds a flag to LLVM and clang to always generate a .debug_frame section, even if other debug information is not being generated. In situations where .eh_frame would normally be emitted, both .debug_frame and .eh_frame will be used.
| Differential Revision: https://reviews.llvm.org/D67216
* [X86] Model MXCSR for all SSE instructions (Craig Topper, 2019-10-30, 4 files, -51/+97)
| This patch adds MXCSR as a reserved physical register and models its use by X86 SSE instructions. It also adds the "mayRaiseFPException" flag for the instructions that can raise an FP exception according to the architecture definition.
| Following what SystemZ and other targets do, only the current rounding modes and the IEEE exception masks are modeled. *Changes* of the MXCSR due to exceptions are not modeled.
| Patch by Pengfei Wang
| Differential Revision: https://reviews.llvm.org/D68121
* [X86] Rewrite hasReassociableOperands and setSpecialOperandAttr to not hardcode number of operands or position of the EFLAGS operand. (Craig Topper, 2019-10-30, 1 file, -28/+19)
| This makes the code immune to the MXCSR addition in D68121.
* [X86] Add FIXME comment to merge more of computeZeroableShuffleElements and getTargetShuffleAndZeroables (Simon Pilgrim, 2019-10-30, 1 file, -0/+1)
|
* [X86][SSE] combineX86ShuffleChain - use resolveZeroablesFromTargetShuffle helper. NFCI. (Simon Pilgrim, 2019-10-30, 1 file, -4/+3)
|
* [X86] combineOrShiftToFunnelShift - use isOperationLegalOrCustom to check FSHL/FSHR support (Simon Pilgrim, 2019-10-30, 1 file, -1/+2)
| Remove hard-wired legality check.
* [X86] combineOrShiftToFunnelShift - use getShiftAmountTy instead of hardwiring to MVT::i8 (Simon Pilgrim, 2019-10-30, 1 file, -5/+8)
|
* [Alignment] Use Align for TFI.getStackAlignment() in X86ISelLowering (Guillaume Chatelet, 2019-10-30, 1 file, -26/+18)
| Summary: This patch is part of a series to introduce an Alignment type.
| See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
| See this patch for the introduction of the type: https://reviews.llvm.org/D64790
| Reviewers: courbet, craig.topper, rnk
| Reviewed By: rnk
| Subscribers: rnk, hiraditya, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D69034
* [X86] Make memcmp vector lowering handle arbitrary expansions (David Zarzycki, 2019-10-30, 1 file, -23/+43)
| Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp expansions but do not change any default policy for now.
| This also fixes a bug in the memcmp expansion itself when large displacements are needed.
| https://reviews.llvm.org/D69507
* [SelectionDAG] Enable lowering unordered atomics loads w/LoadSDNode (and stores w/StoreSDNode) by default (Philip Reames, 2019-10-29, 1 file, -1/+1)
| Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode with an atomic marker on the MMO. The latter parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other.
| This has been fairly heavily fuzzed, and I examined diffs in a reasonably large corpus of assembly by hand, so I'm reasonably sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization.
| Differential Revision: https://reviews.llvm.org/D69219
* [X86] Narrow i64 compares with constant to i32 when the upper 32-bits are known zero. (Craig Topper, 2019-10-29, 1 file, -5/+17)
| This catches some cases. There are probably ways to improve this. I tried doing it as a combine on the setcc, but that broke some cases involving flag reuse in place of test.
| I renamed the isX86CCUnsigned to isX86CCSigned and flipped its polarity to make it consistent with the similar functions for ISD::SETCC. This avoids calling EQ/NE as being signed or unsigned.
| Fixes PR43823.
| Differential Revision: https://reviews.llvm.org/D69499
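The narrowing described above rests on a small arithmetic fact: when the upper 32 bits of both operands are zero, an equality or unsigned comparison on the full 64-bit values gives the same answer as the same comparison on the low 32 bits. The C++ sketch below illustrates only that fact. The function names are hypothetical, and this is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// True when the upper 32 bits of V are all zero.
inline bool upperHalfIsZero(uint64_t V) { return (V >> 32) == 0; }

// Reference results computed on the full 64-bit values.
inline bool equal64(uint64_t X, uint64_t C) { return X == C; }
inline bool below64(uint64_t X, uint64_t C) { return X < C; }

// Narrowed forms: only meaningful when both upper halves are zero, in which
// case truncating to 32 bits discards nothing.
inline bool equalNarrowed(uint64_t X, uint64_t C) {
  assert(upperHalfIsZero(X) && upperHalfIsZero(C));
  return static_cast<uint32_t>(X) == static_cast<uint32_t>(C);
}
inline bool belowNarrowed(uint64_t X, uint64_t C) {
  assert(upperHalfIsZero(X) && upperHalfIsZero(C));
  return static_cast<uint32_t>(X) < static_cast<uint32_t>(C);
}
```

The precondition matters: without it, two values that differ only in their upper halves would compare equal after truncation.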
* [X86] Pull out combineOrShiftToFunnelShift helper. NFCI. (Simon Pilgrim, 2019-10-29, 1 file, -51/+64)
|
* Fix a spelling mistake in a comment. NFC (Greg Bedwell, 2019-10-29, 1 file, -1/+1)
| (I'm currently trying to debug a strange error message I get when pushing to github, despite the pushes being successful).
* [X86] Add a DAG combine to turn (and (bitcast (vXi1 (concat_vectors (vYi1 setcc), undef,))), C) into (bitcast (vXi1 (concat_vectors (vYi1 setcc), zero,))) (Craig Topper, 2019-10-28, 1 file, -0/+68)
| The legalization of v2i1->i2 or v4i1->i4 bitcasts followed by a setcc can create an and after the bitcast. If we're lucky enough that the input to the bitcast is a concat_vectors where the first operand is a setcc that can natively zero all the upper bits of a k-register, then we should replace the other operands of the concat_vectors with zero in order to remove the AND. With the AND removed we might be able to use a kortest on the result.
| Differential Revision: https://reviews.llvm.org/D69205
* Add Windows Control Flow Guard checks (/guard:cf). (Andrew Paverd, 2019-10-28, 6 files, -3/+39)
| Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG.
| Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
| Reviewers: thakis, rnk, theraven, pcc
| Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
| Tags: #clang, #llvm
| Differential Revision: https://reviews.llvm.org/D65761
* [X86] Fix 48/96 byte memcmp code gen (David Zarzycki, 2019-10-28, 1 file, -2/+21)
| Detect scalar ISD::ZERO_EXTEND generated by memcmp lowering and convert it to ISD::INSERT_SUBVECTOR.
| https://reviews.llvm.org/D69464
* [X86] Use 64-bit version of source register in LowerPATCHABLE_EVENT_CALL and LowerPATCHABLE_TYPED_EVENT_CALL (Craig Topper, 2019-10-27, 1 file, -6/+12)
| Summary: The PATCHABLE_EVENT_CALL uses i32 in the intrinsic. This results in the register allocator picking a 32-bit register. We need to use the 64-bit register when forming the MOV64rr instructions. Otherwise we print illegal assembly in the text output. I think prior to this it was impossible for SrcReg to be equal to DstReg so the NOP code was not reachable.
| While there, use Register instead of unsigned.
| Also add a FIXME for what looks like a bug.
| Reviewers: dberris
| Reviewed By: dberris
| Subscribers: hiraditya, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D69365
* [X86] Only look up boolean reduction cost tables if the reduction is not pairwise. (Craig Topper, 2019-10-26, 1 file, -1/+1)
| Summary: We don't pattern match pairwise shuffles in SelectionDAG. So we should only return the optimized costs if it's not a pairwise shuffle.
| I think SLP vectorizer gives priority to non-pairwise shuffles if the cost is the same. And the lookup for reduction intrinsics passes false for the pairwise flag. So this probably has no real effect today.
| Reviewers: RKSimon
| Reviewed By: RKSimon
| Subscribers: hiraditya, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D69083
* [X86] Prefer KORTEST on Knights Landing or later for memcmp() (David Zarzycki, 2019-10-26, 4 files, -17/+60)
| PTEST and especially the MOVMSK instructions are slow on Knights Landing or later. As a bonus, this patch increases instruction parallelism by emitting:
|     KORTEST(PCMPNEQ(a, b), PCMPNEQ(c, d)) == 0
| Instead of:
|     KORTEST(AND(PCMPEQ(a, b), PCMPEQ(c, d))) == ~0
| https://reviews.llvm.org/D69157
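The two forms in the entry above are equivalent by De Morgan's law: every lane is equal exactly when the OR of the "not equal" masks is zero, which is the same as the AND of the "equal" masks being all ones. The C++ sketch below models this on plain 16-bit masks. The lane count, the `Mask16` alias and all function names are assumptions for illustration, and only the zero and all-ones outcomes of KORTEST are modeled.

```cpp
#include <cassert>
#include <cstdint>

using Mask16 = uint16_t; // one bit per vector lane; 16 lanes assumed

// Models the "OR of both masks is zero" outcome of KORTEST.
inline bool kortestIsZero(Mask16 A, Mask16 B) {
  return static_cast<Mask16>(A | B) == 0;
}

// Models the "OR of both masks is all ones" outcome of KORTEST.
inline bool kortestIsAllOnes(Mask16 A, Mask16 B) {
  return static_cast<Mask16>(A | B) == 0xFFFF;
}

// Not-equal form: the two compare masks feed the test directly, so the
// compares have no extra instruction between them and the test.
inline bool allEqualViaNotEqual(Mask16 NeAB, Mask16 NeCD) {
  return kortestIsZero(NeAB, NeCD);
}

// Equal form: the masks must be ANDed first, adding a dependent step.
inline bool allEqualViaEqual(Mask16 EqAB, Mask16 EqCD) {
  Mask16 Both = static_cast<Mask16>(EqAB & EqCD);
  return kortestIsAllOnes(Both, Both);
}
```

Given "not equal" masks `NeAB` and `NeCD`, the corresponding "equal" masks are their bitwise complements, and both helper functions return the same result for any such pair.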
* [X86][GISel] Fix typo in comment. NFC (Craig Topper, 2019-10-26, 1 file, -1/+1)
|
* [Alignment][NFC] getMemoryOpCost uses MaybeAlign (Guillaume Chatelet, 2019-10-25, 2 files, -11/+12)
| Summary: This patch is part of a series to introduce an Alignment type.
| See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
| See this patch for the introduction of the type: https://reviews.llvm.org/D64790
| Reviewers: courbet
| Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D69307
* [X86] Add a check for SSE2 to the top of combineReductionToHorizontal. (Craig Topper, 2019-10-25, 1 file, -0/+4)
| Without this, we can create a PSADBW node that isn't legal.
* [X86][GISel] Remove unneeded custom selection code for handling shifts. (Craig Topper, 2019-10-24, 1 file, -78/+0)
|
* [X86] combineX86ShufflesRecursively - assert the root mask is legal. NFCI. (Simon Pilgrim, 2019-10-23, 1 file, -0/+3)
|
* [Mips] Use appropriate private label prefix based on Mips ABI (Mirko Brkusanin, 2019-10-23, 1 file, -1/+2)
| MipsMCAsmInfo was using the '$' prefix for Mips32 and '.L' for Mips64 regardless of the -target-abi option. By passing MCTargetOptions to MCAsmInfo we can find out the Mips ABI and pick the appropriate prefix.
| Tags: #llvm, #clang, #lldb
| Differential Revision: https://reviews.llvm.org/D66795