path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
...
* Use MCSymbols for FastISel. | Rafael Espindola | 2015-06-23 | 1 | -3/+4
  The summary is that it moves the mangling earlier and replaces a few calls to .addExternalSymbol with addSym.

  I originally wanted to replace all the uses of addExternalSymbol with addSym, but noticed it was a lot of work and doesn't need to be done all at once.

  llvm-svn: 240395
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) | Alexander Kornienko | 2015-06-23 | 36 | -46/+46
  Apparently, the style needs to be agreed upon first.

  llvm-svn: 240390
* AVX-512: Added all forms of VPABS instruction | Elena Demikhovsky | 2015-06-23 | 5 | -71/+105
  Added all intrinsics, tests for encoding, tests for intrinsics.

  llvm-svn: 240386
* [x86] generalize reassociation optimization in machine combiner to 2 instructions | Sanjay Patel | 2015-06-23 | 1 | -77/+87
  Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass to reassociate the following sequence to reduce the critical path:

      A = ? op ?
      B = A op X
      C = B op Y
      -->
      A = ? op ?
      B = X op Y
      C = A op B

  'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could be any associative math/logic op (see TODO in code comment).

  This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth reassociating them.

  This generalization has a compile-time cost because we can now match more instruction sequences and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't improve the critical path.

  For example, in the new test case:

      A = M div N
      B = A add X
      C = B add Y

  We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:

      A = M div N
      B = A add Y
      C = B add X

  We need the combiner to reject that pattern but select this:

      A = M div N
      B = X add Y
      C = B add A

  Differential Revision: http://reviews.llvm.org/D10460
  llvm-svn: 240361
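  The critical-path argument in the reassociation commit above can be checked with a small model. This is only a sketch: the latencies (20 cycles for div, 3 for add) and the helper name `ready_times` are made up for illustration and are not taken from LLVM's scheduling models or the machine combiner code.

```python
# Hypothetical latencies, chosen only to make the effect visible.
LATENCY = {"div": 20, "add": 3}

def ready_times(instrs, latency=LATENCY):
    """Return the cycle at which each value becomes available.

    instrs is a list of (dest, op, src1, src2) in program order.
    Sources that are never defined (M, N, X, Y) are ready at cycle 0.
    """
    ready = {}
    for dest, op, src1, src2 in instrs:
        start = max(ready.get(src1, 0), ready.get(src2, 0))
        ready[dest] = start + latency[op]
    return ready

original = [("A", "div", "M", "N"), ("B", "add", "A", "X"), ("C", "add", "B", "Y")]
rejected = [("A", "div", "M", "N"), ("B", "add", "A", "Y"), ("C", "add", "B", "X")]
selected = [("A", "div", "M", "N"), ("B", "add", "X", "Y"), ("C", "add", "B", "A")]

for name, seq in [("original", original), ("rejected", rejected), ("selected", selected)]:
    print(name, ready_times(seq)["C"])
```

  Under these assumed latencies, the original and the rejected sequences both finish C at cycle 26, while the selected sequence finishes at cycle 23, because the add of X and Y no longer waits for the divide.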
* [X86][FMA4] FMA4 ops can perform unaligned folded loads. | Simon Pilgrim | 2015-06-22 | 1 | -64/+64
  llvm-svn: 240342
* [X86] Teach load folding to accept scalar _Int users of MOVSS/MOVSD. | Ahmed Bougacha | 2015-06-22 | 1 | -10/+46
  The _Int instructions are special, in that they operate on the full VR128 instead of FR32. The load folding then looks at MOVSS, at the user, and bails out when it sees a size mismatch.

  What we really know is that the rm_Int instructions don't load the higher lanes, so folding is fine.

  This happens for the straightforward intrinsic code, e.g.:

      _mm_add_ss(a, _mm_load_ss(p));

  Fixes PR23349.

  Differential Revision: http://reviews.llvm.org/D10554
  llvm-svn: 240326
* [x86] set default reciprocal (division and square root) codegen to match GCC | Sanjay Patel | 2015-06-22 | 1 | -6/+9
  D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line options to allow reciprocal estimate instructions to be used in place of divisions and square roots.

  This patch changes the default settings for x86 targets to allow that recip codegen (except for scalar division because that breaks too much code) when using -ffast-math or its equivalent.

  This matches GCC behavior for this kind of codegen.

  Differential Revision: http://reviews.llvm.org/D10396
  llvm-svn: 240310
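  As background for the reciprocal-estimate commit above, the sketch below shows why replacing a division with an estimate is a fast-math-only trade: the estimate is coarse, and a refinement step recovers most but not all of the precision. Everything here is an assumption for illustration. The 12-bit estimate, the Newton-Raphson refinement step, and the function names are not taken from the commit or from LLVM's lowering code.

```python
import math

def coarse_reciprocal(y, bits=12):
    """Model a hardware-style estimate of 1/y that keeps only `bits` mantissa bits (assumed precision)."""
    mantissa, exponent = math.frexp(1.0 / y)
    scale = 1 << bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)

def refine(y, r):
    """One Newton-Raphson step for f(r) = 1/r - y, which roughly squares the relative error."""
    return r * (2.0 - y * r)

def fast_divide(x, y, steps=1):
    """Approximate x / y as x * (refined estimate of 1/y)."""
    r = coarse_reciprocal(y)
    for _ in range(steps):
        r = refine(y, r)
    return x * r

print(abs(coarse_reciprocal(3.0) - 1.0 / 3.0))  # error of the raw estimate
print(abs(fast_divide(1.0, 3.0) - 1.0 / 3.0))   # smaller error after one refinement step
```

  The result is close to the exact quotient but generally not bit-identical to it, which is consistent with the commit enabling this codegen only under -ffast-math or its equivalent.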
* Avoid a Symbol -> Name -> Symbol conversion. | Rafael Espindola | 2015-06-22 | 5 | -33/+47
  Before this we were producing a TargetExternalSymbol from a MCSymbol. That meant extracting the symbol name and fetching the symbol again down the pipeline.

  This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the DAG.

  Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900, allowing r240130 to be committed again.

  llvm-svn: 240300
* AVX-512: added VPSHUFB instruction - all SKX forms | Elena Demikhovsky | 2015-06-22 | 2 | -0/+19
  Added intrinsics and encoding tests.

  llvm-svn: 240277
* AVX-512: All forms of VCOMPRESS and VEXPAND instructions, encoding tests. | Elena Demikhovsky | 2015-06-22 | 3 | -79/+32
  llvm-svn: 240272
* Reverted AVX-512 vector shuffle | Elena Demikhovsky | 2015-06-22 | 1 | -180/+64
  llvm-svn: 240258
* [X86] Allow more call sequences to use push instructions for argument passing | Michael Kuperstein | 2015-06-22 | 1 | -26/+91
  This allows more call sequences to use pushes instead of movs when optimizing for size. In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported.

  Differential Revision: http://reviews.llvm.org/D10500
  llvm-svn: 240257
* AVX-512: Added intrinsics for VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD instructions. | Elena Demikhovsky | 2015-06-22 | 2 | -3/+123
  Added tests.

  llvm-svn: 240256
* [X86] Code tidyup - Use SDValue bool operator. NFC. | Simon Pilgrim | 2015-06-21 | 1 | -47/+25
  llvm-svn: 240249
* [X86][SSE] Fix PerformSExtCombine bug that accessed the wrong return value of an aggregate type. | Simon Pilgrim | 2015-06-20 | 1 | -3/+3
  Fix to rL237885 to ensure that it accesses the correct return value of an aggregate type.

  llvm-svn: 240223
* name change: hasPattern() -> getMachineCombinerPatterns() ; NFC | Sanjay Patel | 2015-06-19 | 2 | -6/+6
  This was suggested as part of D10460, but it's independent of any functional change.

  llvm-svn: 240192
* Improve error handling of getRelocationAddend. | Rafael Espindola | 2015-06-19 | 1 | -1/+2
  This patch changes getRelocationAddend to use ErrorOr and considers it an error to try to get the addend of a REL section.

  If, for example, an x86_64 file has a REL section, that file is corrupted and we should reject it.

  Using ErrorOr is not ideal since we check the section type once per relocation instead of once per section. Checking once per section would involve getRelocationAddend just asserting and callers checking the section before iterating over the relocations.

  In any case, this is an improvement and includes a test.

  llvm-svn: 240176
* Fixed/added namespace ending comments using clang-tidy. NFC | Alexander Kornienko | 2015-06-19 | 36 | -46/+46
  The patch is generated using this command:

      tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
        -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
        llvm/lib/

  Thanks to Eugene Kosov for the original patch!

  llvm-svn: 240137
* Fix "the the" in comments. | Eric Christopher | 2015-06-19 | 3 | -3/+3
  llvm-svn: 240112
* use SDValue bool operator; NFCI | Sanjay Patel | 2015-06-18 | 1 | -3/+2
  llvm-svn: 240064
* [X86] Rename RegInfo to TRI as suggested by Eric | Reid Kleckner | 2015-06-18 | 2 | -39/+39
  llvm-svn: 240047
* [X86] Refactor stack adjustments into X86FrameLowering::BuildStackAdjustment | Reid Kleckner | 2015-06-18 | 3 | -107/+93
  Deduplicates some code and lets us use LEA on atom when adjusting the stack around callee-cleanup calls. This is the only intended functionality change.

  llvm-svn: 240044
* [X86] Remove unneeded parameters and deduplicate stack alignment code | Reid Kleckner | 2015-06-18 | 3 | -76/+67
  NFC

  llvm-svn: 240033
* quick fix for failure from r240012 | Asaf Badouh | 2015-06-18 | 1 | -0/+1
  failure: http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/11847/steps/build_Lld/logs/stdio

  llvm-svn: 240015
* [AVX512] | Asaf Badouh | 2015-06-18 | 4 | -2/+11
  add instructions: VPAVGB and VPAVGW

  review http://reviews.llvm.org/D10504

  llvm-svn: 240012
* AVX-512: (fixed) Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD. | Elena Demikhovsky | 2015-06-18 | 1 | -107/+76
  Intrinsics and tests for them are coming in the next patch.

  llvm-svn: 240003
* reverted 239999 due to test failures | Elena Demikhovsky | 2015-06-18 | 1 | -71/+107
  llvm-svn: 240001
* AVX-512: Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD. | Elena Demikhovsky | 2015-06-18 | 1 | -107/+71
  Intrinsics and tests for them are coming in the next patch.

  llvm-svn: 239999
* [X86][SSE] Improved support for vector i16 to float conversions. | Simon Pilgrim | 2015-06-17 | 1 | -8/+9
  Added explicit sign extension for v4i16/v8i16 to v4i32/v8i32 before conversion to floats. Matches existing support for v4i8/v8i8.

  Follow up to D10433

  llvm-svn: 239966
* Re-land "[X86] Cache variables that only depend on the subtarget" | Reid Kleckner | 2015-06-17 | 5 | -89/+64
  Re-instates r239949 without accidentally flipping the sense of UseLEA.

  llvm-svn: 239950
* Revert "[X86] Cache variables that only depend on the subtarget" | Reid Kleckner | 2015-06-17 | 5 | -64/+89
  This reverts commit r239948, tests seem to be failing.

  llvm-svn: 239949
* [X86] Cache variables that only depend on the subtarget | Reid Kleckner | 2015-06-17 | 5 | -89/+64
  There is a one-to-one relationship between X86Subtarget and X86FrameLowering, but every frame lowering method would previously pull the subtarget off the MachineFunction and query some subtarget properties.

  Over time, these locals began to grow in complexity and it became important to keep their names and meaning in sync across all of the frame lowering methods, leading to duplication. We can eliminate that duplication by computing them once in the constructor.

  llvm-svn: 239948
* Move the personality function from LandingPadInst to Function | David Majnemer | 2015-06-17 | 1 | -8/+2
  The personality routine currently lives in the LandingPadInst.

  This isn't desirable because:
  - All LandingPadInsts in the same function must have the same personality routine. This means that each LandingPadInst beyond the first has an operand which produces no additional information.
  - There is ongoing work to introduce EH IR constructs other than LandingPadInst. Moving the personality routine off of any one particular Instruction and onto the parent function seems a lot better than having N different places a personality function can sneak onto an exceptional function.

  Differential Revision: http://reviews.llvm.org/D10429
  llvm-svn: 239940
* Move IsUsedInReloc from MCSymbolELF to MCSymbol. | Rafael Espindola | 2015-06-17 | 1 | -1/+1
  There is a free bit in MCSymbol and MachO needs the same information.

  llvm-svn: 239933
* AVX-512: cvtusi2ss/d intrinsics. | Igor Breger | 2015-06-17 | 2 | -35/+59
  Change builtin function name and signature ( add third parameter - rounding mode ). Added tests for intrinsics.

  Differential Revision: http://reviews.llvm.org/D10473
  llvm-svn: 239888
* [X86][SSE] Vectorize v2i32 to v2f64 conversions | Simon Pilgrim | 2015-06-16 | 4 | -4/+32
  This patch enables support for the conversion of v2i32 to v2f64 to use the CVTDQ2PD xmm instruction and stay on the SSE unit instead of scalarizing, sign extending to i64 and using CVTSI2SDQ scalar conversions.

  Differential Revision: http://reviews.llvm.org/D10433
  llvm-svn: 239855
* [X86] Rename some frame lowering variables | Reid Kleckner | 2015-06-16 | 1 | -26/+28
  Old names, new names, and what they really mean:
  - IsWin64 -> IsWin64CC: This is true on non-Windows x86_64 platforms when the ms_abi calling convention is used.
  - IsWinEH -> IsWin64Prologue: True when the target is Win64, regardless of calling convention. Changes the prologue to obey the constraints of the Win64 unwinder.
  - NeedsWinEH -> NeedsWinCFI: We're using the win64 prologue *and* we want .xdata unwind tables. Analogous to NeedsDwarfCFI.

  NFC

  llvm-svn: 239836
* Clean up redundant copies of Triple objects. NFC | Daniel Sanders | 2015-06-16 | 4 | -9/+9
  Summary:

  Reviewers: rengolin
  Reviewed By: rengolin
  Subscribers: llvm-commits, rengolin, jholewinski

  Differential Revision: http://reviews.llvm.org/D10382
  llvm-svn: 239823
* [AVX512] add integer min/max intrinsics support. | Asaf Badouh | 2015-06-16 | 2 | -24/+48
  review: http://reviews.llvm.org/D10439

  llvm-svn: 239806
* X86: optimized i64 vector multiply with constant | Elena Demikhovsky | 2015-06-16 | 1 | -5/+11
  When we multiply two 64-bit vectors, we extract the lower and upper parts and use the PMULUDQ instruction. When one of the operands is a constant, the upper part may be zero, and we know this at compile time.

  Example: %a = mul <4 x i64> %b, <4 x i64> < i64 5, i64 5, i64 5, i64 5>.

  I'm checking the value of the upper part and preventing redundant "multiply", "shift" and "add" operations.

  llvm-svn: 239802
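  The saving described in the i64 multiply commit above follows from the usual decomposition of a 64-bit product into 32-bit pieces. The sketch below models one vector lane with plain integers. The function names and the multiply counter are hypothetical, and the code illustrates the arithmetic only, not the actual DAG lowering.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def pmuludq(x, y):
    """Model one PMULUDQ lane: multiply the low 32 bits of each input, giving a 64-bit product."""
    return (x & MASK32) * (y & MASK32)

def mul_i64_by_const(a, c):
    """Return (a * c mod 2**64, number of 32x32 multiplies used) for unsigned 64-bit a and constant c."""
    a_hi, c_hi = a >> 32, c >> 32
    result = pmuludq(a, c)        # lo(a) * lo(c)
    cross = pmuludq(a_hi, c)      # hi(a) * lo(c): a is not a constant, so this term is always computed
    multiplies = 2
    if c_hi != 0:                 # known at compile time when c is a constant
        cross += pmuludq(a, c_hi) # lo(a) * hi(c)
        multiplies += 1
    result += cross << 32         # hi(a) * hi(c) only affects bits above 63 and is dropped
    return result & MASK64, multiplies

print(mul_i64_by_const(0x123456789ABCDEF0, 5))
```

  With the constant 5 from the commit's example, the upper half of the constant is zero, so one multiply and its accompanying shift and add disappear from the sequence.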
* [X86] Try to shorten dwarf CFI emission | Reid Kleckner | 2015-06-15 | 1 | -28/+23
  llvm-svn: 239786
* [TargetInstrInfo] Add new hook: AnalyzeBranchPredicate. | Sanjoy Das | 2015-06-15 | 2 | -5/+95
  Summary:
  NFC: no one uses AnalyzeBranchPredicate yet.

  Add TargetInstrInfo::AnalyzeBranchPredicate and implement for x86. A later change adding support for page-fault based implicit null checks depends on this.

  Reviewers: reames, ab, atrick
  Reviewed By: atrick
  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D10200
  llvm-svn: 239742
* [TargetInstrInfo] Rename getLdStBaseRegImmOfs and implement for x86. | Sanjoy Das | 2015-06-15 | 2 | -0/+34
  Summary:
  Rename TargetInstrInfo::getLdStBaseRegImmOfs to TargetInstrInfo::getMemOpBaseRegImmOfs and implement for x86. The implementation only handles a few easy cases now and will be made more sophisticated in the future.

  This is NFCI: the only user of `getLdStBaseRegImmOfs` (now `getMemOpBaseRegImmOfs`) is `LoadClusterMotion` and `LoadClusterMotion` is disabled for x86.

  Reviewers: reames, ab, MatzeB, atrick
  Reviewed By: MatzeB, atrick
  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D10199
  llvm-svn: 239741
* [CodeGen] Introduce a FAULTING_LOAD_OP pseudo-op. | Sanjoy Das | 2015-06-15 | 3 | -2/+34
  Summary:
  This instruction encodes a loading operation that may fault, and a label to branch to if the load page-faults. The locations of potentially faulting loads and their "handler" destinations are recorded in a FaultMap section, meant to be consumed by LLVM's clients.

  Nothing generates FAULTING_LOAD_OP instructions yet, but they will be used in a future change.

  The documentation (FaultMaps.rst) needs improvement and I will update this diff with a more expanded version shortly.

  Depends on D10196

  Reviewers: rnk, reames, AndyAyers, ab, atrick, pgavlin
  Reviewed By: atrick, pgavlin
  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D10197
  llvm-svn: 239740
* [NFC] Extract X86MCInstLower::LowerMachineOperand. | Sanjoy Das | 2015-06-15 | 1 | -38/+37
  Summary: Refactoring-only change that will be used later.

  Reviewers: reames, atrick
  Reviewed By: atrick
  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D10196
  llvm-svn: 239739
* AVX-512: Implemented DAG lowering for shuff64x2/shufi64x2 instructions ( Shuffle Packed Values at 128-bit Granularity ) | Igor Breger | 2015-06-14 | 1 | -0/+28
  Tests added, vector-shuffle-512-v8.ll test re-generated.

  Differential Revision: http://reviews.llvm.org/D10300
  llvm-svn: 239697
* Add support for parsing the XOR operator in Intel syntax inline assembly. | Michael Kuperstein | 2015-06-14 | 1 | -12/+39
  Differential Revision: http://reviews.llvm.org/D10385
  Patch by marina.yatsina@intel.com

  llvm-svn: 239695
* AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL. | Igor Breger | 2015-06-14 | 5 | -15/+48
  Added intrinsics for cvtsi2ss/d instructions. Added tests for intrinsics and encoding.

  Differential Revision: http://reviews.llvm.org/D10430
  llvm-svn: 239694
* Stripped trailing whitespace. NFC. | Simon Pilgrim | 2015-06-13 | 1 | -6/+6
  llvm-svn: 239672
* MachineLICM: Use TargetSchedModel instead of just itineraries | Matthias Braun | 2015-06-13 | 2 | -2/+2
  This will use itineraries if available, but will also work if only an MCSchedModel is available.

  Differential Revision: http://reviews.llvm.org/D10428
  llvm-svn: 239658