summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* De-constify pointers to Type since they can't be modified. NFCCraig Topper2015-08-012-4/+4
| | | | | | This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
* x86: check hasOpaqueSPAdjustment in canRealignStackJF Bastien2015-07-311-4/+6
| | | | | | | | | | | | | | | Summary: @rnk pointed out in [1] that x86's canRealignStack logic should match that in CantUseSP from hasBasePointer. [1]: http://reviews.llvm.org/D11160?id=29713#inline-89350 Reviewers: rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D11377 llvm-svn: 243772
* [x86] reassociate integer multiplies using machine combiner passSanjay Patel2015-07-311-10/+30
| | | | | | | | | | | | | Add i16, i32, i64 imul machine instructions to the list of reassociation candidates. A new bit of logic is needed to handle integer instructions: they have an implicit EFLAGS operand, so we have to make sure it's dead in order to do any reassociation with integer ops. Differential Revision: http://reviews.llvm.org/D11660 llvm-svn: 243756
* fix memcpy/memset/memmove lowering when optimizing for sizeSanjay Patel2015-07-301-5/+3
| | | | | | | | | | | | | | | | | | | | Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693
* [X86] Recognize "flags" as an identifier, not a register in Intel-syntax ↵Michael Kuperstein2015-07-301-0/+5
| | | | | | | | | inline asm Patch by: marina.yatsina@intel.com Differential Revision: http://reviews.llvm.org/D11512 llvm-svn: 243630
* push fast-math check for machine-combiner reassociations into ↵Sanjay Patel2015-07-301-7/+4
| | | | | | | | instruction-type check; NFC This makes it simpler to add instruction types that don't depend on fast-math. llvm-svn: 243596
* Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all ↵Nick Lewycky2015-07-292-2/+2
| | | | | | | | the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585
* Rename hasCompatibleFunctionAttributes->areInlineCompatible basedEric Christopher2015-07-292-4/+4
| | | | | | | on suggestions. Currently the function is only used for inline purposes and this is more descriptive for the use. llvm-svn: 243578
* [X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit.Simon Pilgrim2015-07-291-15/+31
| | | | | | | | This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416). Differential Revision: http://reviews.llvm.org/D11327 llvm-svn: 243577
* [X86][SSE] Vectorize i64 ASHR operationsSimon Pilgrim2015-07-292-4/+18
| | | | | | | | This patch vectorizes the v2i64/v4i64 ASHR shift operations - the last remaining integer vector shifts that are still being transferred to/from the scalar unit to be completed. Differential Revision: http://reviews.llvm.org/D11439 llvm-svn: 243569
* Revert "[PeepholeOptimizer] Look through PHIs to find additional register ↵Bruno Cardoso Lopes2015-07-291-2/+1
| | | | | | | | | | sources" Reported to Broke some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540
* fix TLI's combineRepeatedFPDivisors interface to return the minimum user ↵Sanjay Patel2015-07-282-3/+3
| | | | | | | | | | | | | | | threshold This fix was suggested as part of D11345 and is part of fixing PR24141. With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place. There is no NFC-intended other than that. Differential Revision: http://reviews.llvm.org/D11531 llvm-svn: 243498
* [PeepholeOptimizer] Look through PHIs to find additional register sourcesBruno Cardoso Lopes2015-07-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243486
* Implement target independent TLS compatible with glibc's emutls.c.Chih-Hung Hsieh2015-07-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438
* [X86] Remove mergeSPUpdatesUp()Michael Kuperstein2015-07-281-25/+1
| | | | | | | | | | | X86FrameLowering has both a mergeSPUpdates() that accepts a direction, and an mergeSPUpdatesUp(), which seem to do the same thing, except for a slightly different interface. Removed the less general function. NFC. Differential Revision: http://reviews.llvm.org/D11510 llvm-svn: 243396
* [X86][SSE] Use bitmasks instead of shuffles where possible.Simon Pilgrim2015-07-281-0/+8
| | | | | | | | | | VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions. Split off from D11518. Differential Revision: http://reviews.llvm.org/D11541 llvm-svn: 243395
* AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructionsIgor Breger2015-07-283-1/+8
| | | | | | | | Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11528 llvm-svn: 243390
* fix invalid load folding with SSE/AVX FP logical instructions (PR22371)Sanjay Patel2015-07-283-46/+59
| | | | | | | | | | | | | | | | | | This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ). I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine: https://bugs.winehq.org/show_bug.cgi?id=38826 https://llvm.org/bugs/show_bug.cgi?id=22371#c25 The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use *vector* FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally. Differential Revision: http://reviews.llvm.org/D11477 llvm-svn: 243361
* [llvm-mc] Pushing plumbing through for --fatal-warnings flag.Colin LeMahieu2015-07-271-1/+1
| | | | llvm-svn: 243334
* remove unnecessary forward declaration; NFCSanjay Patel2015-07-271-16/+12
| | | | llvm-svn: 243328
* don't repeat function names in comments; NFCSanjay Patel2015-07-271-61/+49
| | | | llvm-svn: 243327
* Revert "[PeepholeOptimizer] Look through PHIs to find additional register ↵Bruno Cardoso Lopes2015-07-271-2/+1
| | | | | | | | sources" Still breaks some ARM buildbots. This reverts r243271. llvm-svn: 243318
* fix typo and spacing; NFCSanjay Patel2015-07-271-2/+2
| | | | llvm-svn: 243287
* Revert "Add const to some Type* parameters which didn't need to be mutable. ↵Pete Cooper2015-07-271-5/+5
| | | | | | | | | | NFC." This reverts commit r243146. Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state. llvm-svn: 243282
* [PeepholeOptimizer] Look through PHIs to find additional register sourcesBruno Cardoso Lopes2015-07-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reapply r242295 with fixes in the implementation. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243271
* [X86] Reordered lowerVectorShuffleAsBitMask before ↵Simon Pilgrim2015-07-271-86/+86
| | | | | | | | lowerVectorShuffleAsBlend. NFCI. Allows us to show diffs for D11518 more clearly llvm-svn: 243264
* Avoid using uncommon acronym "MSROM".Sean Silva2015-07-271-2/+2
| | | | llvm-svn: 243256
* Implemented encoding and intrinsics of the following instructionsIgor Breger2015-07-263-96/+134
| | | | | | | | | | vunpckhps/pd, vunpcklps/pd, vpunpcklbw, vpunpckhbw, vpunpcklwd, vpunpckhwd, vpunpckldq, vpunpckhdq, vpunpcklqdq, vpunpckhqdq Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11509 llvm-svn: 243246
* Add const to some Type* parameters which didn't need to be mutable. NFC.Pete Cooper2015-07-241-5/+5
| | | | | | | We were only getting the size of the type which doesn't need to modify the type. llvm-svn: 243146
* Use foreach loops for StructType::elements(). NFC.Pete Cooper2015-07-241-2/+2
| | | | | | | | | | | | We had a few places where we did for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) { but those could instead do for (auto *EltTy : STy->elements()) { llvm-svn: 243136
* AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer ↵Igor Breger2015-07-246-108/+522
| | | | | | | | | | Truncate with/without saturation Added tests for DAG lowering ,encoding and intrinsic Differential Revision: http://reviews.llvm.org/D11218 llvm-svn: 243122
* Remove access to the DataLayout in the TargetMachineMehdi Amini2015-07-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: Replace getDataLayout() with a createDataLayout() method to make explicit that it is intended to create a DataLayout only and not accessing it for other purpose. This change is the last of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11103 (cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243114
* fix wrong comment; NFCSanjay Patel2015-07-241-1/+1
| | | | llvm-svn: 243113
* Revert "Remove access to the DataLayout in the TargetMachine"Mehdi Amini2015-07-241-6/+4
| | | | | | | | | | This reverts commit 0f720d984f419c747709462f7476dff962c0bc41. It breaks clang too badly, I need to prepare a proper patch for clang first. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243089
* Remove access to the DataLayout in the TargetMachineMehdi Amini2015-07-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: Replace getDataLayout() with a createDataLayout() method to make explicit that it is intended to create a DataLayout only and not accessing it for other purpose. This change is the last of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11103 (cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243083
* X86: Use dyn_cast instead of isa+cast, NFCDuncan P. N. Exon Smith2015-07-231-5/+6
| | | | llvm-svn: 243034
* [X86] Allow load folding into PUSH instructionsMichael Kuperstein2015-07-233-9/+27
| | | | | | | | | | Adds pushes to the folding tables. This also required a fix to the TD definition, since the memory forms of the push instructions did not have the right mayLoad/mayStore flags. Differential Revision: http://reviews.llvm.org/D11340 llvm-svn: 243010
* [X86] Fix order of operands for ins and outs instructions when parsing intel ↵Michael Kuperstein2015-07-231-28/+28
| | | | | | | | | syntax Patch by: marina.yatsina@intel.com Differential Revision: http://reviews.llvm.org/D11337 llvm-svn: 243001
* X86: Fixed assertion failure in 32-bit modeElena Demikhovsky2015-07-231-2/+3
| | | | | | | | | | The DAG Node "SCALAR_TO_VECTOR" may be created if the type of the scalar element is legal. Added a check for the scalar type before creating this node. Added a test that fails with assertion on the current version. Differential Revision: http://reviews.llvm.org/D11413 llvm-svn: 242994
* Revert r242990: "AVX-512: Implemented encoding , DAG lowering and ..."Chandler Carruth2015-07-235-478/+94
| | | | | | | | | | This commit broke the build. Numerous build bots broken, and it was blocking my progress so reverting. It should be trivial to reproduce -- enable the BPF backend and it should fail when running llvm-tblgen. llvm-svn: 242992
* AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer ↵Igor Breger2015-07-235-94/+478
| | | | | | | | | | Truncate with/without saturation Added tests for DAG lowering ,encoding and intrinsic Differential Revision: http://reviews.llvm.org/D11218 llvm-svn: 242990
* AVX : Fix ISA disabling in case AVX512VL , some instructions should be ↵Igor Breger2015-07-232-17/+18
| | | | | | | | | | disabled only if AVX512BW and AVX512VL present. Tests added. Differential Revision: http://reviews.llvm.org/D11414 llvm-svn: 242987
* fix typo; NFCSanjay Patel2015-07-221-4/+4
| | | | llvm-svn: 242947
* fix indent; NFCSanjay Patel2015-07-221-20/+20
| | | | llvm-svn: 242946
* Fix -Wextra-semi warnings.Hans Wennborg2015-07-222-2/+2
| | | | | | | | Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D11400 llvm-svn: 242930
* [X86][AVX512] add reduce/range/scalef/rndScaleAsaf Badouh2015-07-225-90/+203
| | | | | | | | include encoding and intrinsics Differential Revision: http://reviews.llvm.org/D11222 llvm-svn: 242896
* [X86] Add .intel_syntax noprefix directive to intel-syntax x86 asm outputMichael Kuperstein2015-07-221-0/+1
| | | | | | | Patch by: michael.zuckerman@intel.com Differential Revision: http://reviews.llvm.org/D11223 llvm-svn: 242886
* AVX-512: Added intrinsics for VCVT* instructions.Elena Demikhovsky2015-07-222-7/+181
| | | | | | | | All SKX forms. All VCVT instructions for float/double/int/long types. Differential Revision: http://reviews.llvm.org/D11343 llvm-svn: 242877
* AVX512 : Implemented VPMADDUBSW and VPMADDWD instruction , Igor Breger2015-07-215-1/+38
| | | | | | | | Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11351 llvm-svn: 242761
* Targets: commonize some stack realignment codeJF Bastien2015-07-203-29/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727
OpenPOWER on IntegriCloud