path: root/llvm/lib/Target/X86
Commit message, Author, Age, Files, Lines
* Follow-up for r217020: actually commit the fix for PR20800,Alexander Potapenko2014-09-031-3/+22
| | | | | | revert the accidentally committed changes to LLVMSymbolize.cpp llvm-svn: 217021
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-0211-2235/+3
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* [X86] Allow atomic operations using immediates to avoid using a registerRobin Morisset2014-09-023-38/+167
| | | | | | | | | | | | | | | | The only valid lowering of atomic stores in the X86 backend was mov from register to memory. As a result, storing an immediate required a useless copy of the immediate in a register. Now these can be compiled as a simple mov. Similarly, adding/and-ing/or-ing/xor-ing an immediate to an atomic location (but through an atomic_store/atomic_load, not a fetch_whatever intrinsic) can now make use of an 'add $imm, x(%rip)' instead of using a register. And the same applies to inc/dec. This second point matches the first issue identified in http://llvm.org/bugs/show_bug.cgi?id=17281 llvm-svn: 216980
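The source pattern this change targets can be sketched in portable C++ as follows. This is only an illustration: the function names and constants are hypothetical, and whether a given compiler version actually emits `mov $imm, mem` or `add $imm, mem` depends on the memory ordering and optimization level.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomic store of a constant. With the change described above, the x86
// backend may emit a single `mov $imm, mem` instead of first copying the
// constant into a register.
void storeImmediate(std::atomic<std::int32_t> &loc) {
  loc.store(42, std::memory_order_relaxed);
}

// Atomic load, add a constant, atomic store. This is NOT an atomic
// read-modify-write (it is not fetch_add): a concurrent writer could be
// overwritten. It is the load/op/store shape the commit message describes.
void addImmediate(std::atomic<std::int32_t> &loc) {
  assert(loc.is_lock_free() && "sketch assumes a lock-free 32-bit atomic");
  loc.store(loc.load(std::memory_order_relaxed) + 5,
            std::memory_order_relaxed);
}

// Same shape with a constant of 1, the candidate for `inc`.
void increment(std::atomic<std::int32_t> &loc) {
  loc.store(loc.load(std::memory_order_relaxed) + 1,
            std::memory_order_relaxed);
}

// Small single-threaded driver: 42, then +5, then +1.
std::int32_t runExample() {
  std::atomic<std::int32_t> loc{0};
  storeImmediate(loc);
  addImmediate(loc);
  increment(loc);
  return loc.load();
}
```

Relaxed ordering is used here because it is the simplest case to reason about; stronger orderings may lower differently.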
* Refactor LowerFABS and LowerFNEG into one function (x86) (NFC)Sanjay Patel2014-09-021-42/+31
| | | | | | | | | We duplicate ~30 lines of code to lower FABS and FNEG for x86, so this patch combines them into one function. No functional change intended, so no additional test cases. Test-suite behavior is unchanged. Differential Revision: http://reviews.llvm.org/D5064 llvm-svn: 216942
* CodeGen: Handle va_start in the entry blockReid Kleckner2014-09-021-2/+2
| | | | | | | | | Also fix a small copy-paste bug in X86ISelLowering where Chain should have been used in place of DAG.getEntryToken(). Fixes PR20828. llvm-svn: 216929
* CodeGen: indicate Windows unwind data formatSaleem Abdulrasool2014-09-011-0/+2
| | | | | | | | The structures for Windows unwinding are shared across multiple platforms. Indicate the encoding to be used for the particular target. Use this to switch the unwind emitter instantiated by the AsmPrinter. llvm-svn: 216895
* Use an integer constant for FABS / FNEG (x86).Sanjay Patel2014-09-011-14/+6
| | | | | | | | | | | | | | | | This change will ease refactoring LowerFABS() and LowerFNEG() since they have a lot of overlap. Remove the creation of a floating point constant from an integer because it's going to be used for a bitwise integer op anyway. No change to codegen expected, but the verbose comment string for asm output may change from float values to hex (integer), depending on whether the constant already exists or not. Differential Revision: http://reviews.llvm.org/D5052 llvm-svn: 216889
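The idea that FABS and FNEG reduce to a bitwise integer operation with a constant mask can be sketched in standalone C++. This is not the LLVM lowering code; `SignOp` and `applySignOp` are hypothetical names, and the sketch assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

enum class SignOp { Fabs, Fneg };

// Apply fabs or fneg by operating on the sign bit of the raw encoding.
// The mask is an integer constant, since it only feeds a bitwise op.
float applySignOp(float x, SignOp op) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  switch (op) {
  case SignOp::Fabs:
    bits &= 0x7FFFFFFFu; // clear the sign bit
    break;
  case SignOp::Fneg:
    bits ^= 0x80000000u; // flip the sign bit
    break;
  default:
    assert(false && "unknown SignOp");
    break;
  }
  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}
```

Both operations share everything except the mask and the opcode (AND versus XOR), which is the overlap the message refers to.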
* [asan-assembly-instrumentation] Prologue and epilogue are moved out from InstrumentMemOperand().Yuri Gorshenin2014-09-014-173/+306
| | | | | | | | | | | | Reviewers: eugenis Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D4923 llvm-svn: 216879
* Revert "[asan-assembly-instrumentation] Prologue and epilogue are moved out from InstrumentMemOperand()."Yuri Gorshenin2014-09-013-305/+172
| | | | | | | | This reverts commit 895aa397038b8de86d83ac0997a70949a486e112. llvm-svn: 216872
* [asan-assembly-instrumentation] Prologue and epilogue are moved out from InstrumentMemOperand().Yuri Gorshenin2014-09-013-172/+305
| | | | | | llvm-svn: 216869
* Remove 'virtual' keyword from methods marked with 'override' keyword.Craig Topper2014-08-301-1/+1
| | | | llvm-svn: 216823
* Speculative build fix for const, gcc, and ArrayRef overloadsReid Kleckner2014-08-291-3/+3
| | | | llvm-svn: 216793
* Add a const and munge some commentsReid Kleckner2014-08-291-3/+5
| | | | llvm-svn: 216781
* musttail: Forward regparms of variadic functions on x86_64Reid Kleckner2014-08-292-71/+154
| | | | | | | | | | | | | | | | | | | | | | Summary: If a variadic function body contains a musttail call, then we copy all of the remaining register parameters into virtual registers in the function prologue. We track the virtual registers through the function body, and add them as additional registers to pass to the call. Because this is all done in virtual registers, the register allocator usually gives us good code. If the function does a call, however, it will have to spill and reload all argument registers (ew). Forwarding regparms on x86_32 is not implemented because most compilers don't support varargs in 32-bit with regparms. Reviewers: majnemer Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5060 llvm-svn: 216780
* Verifier: Don't reject varargs callee cleanup functionsReid Kleckner2014-08-291-7/+2
| | | | | | | | | | | | | | | | | | We've rejected these kinds of functions since r28405 in 2006 because it's impossible to lower the return of a callee cleanup varargs function. However there are lots of legal ways to leave such a function without returning, such as aborting. Today we can leave a function with a musttail call to another function with the correct prototype, and everything works out. I'm removing the verifier check declaring that a normal return from such a function is UB. Reviewed By: nlewycky Differential Revision: http://reviews.llvm.org/D5059 llvm-svn: 216779
* X86: Fix conflict over ESI between base register and rep;movslReid Kleckner2014-08-292-6/+36
| | | | | | | | | | | | | | The new solution is to not use this lowering if there are any dynamic allocas in the current function. We know up front if there are dynamic allocas, but we don't know if we'll need to create stack temporaries with large alignment during lowering. Conservatively assume that we will need such temporaries. Reviewed By: hans Differential Revision: http://reviews.llvm.org/D5128 llvm-svn: 216775
* [X86] Refactor X86ISelDAGToDAG::SelectAtomicLoadArith - NFCRobin Morisset2014-08-291-10/+17
| | | | | | | | | | | | | | | | | | | Summary: Mostly renaming the (not very explicit) variables Tmp0, .. Tmp4, and grouping related statements together, along with a few lines of comments for the surprising parts. No functional change intended. Test Plan: make check-all Reviewers: jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5088 llvm-svn: 216768
* typoSanjay Patel2014-08-291-1/+1
| | | | llvm-svn: 216732
* [SKX] Enable lowering of integer CMP operations.Robert Khasanov2014-08-291-9/+75
| | | | | | | | | | Added new types to Legalizer. Fixed getSetCCResultType function Added lowering tests. Reviewed by Elena Demikhovsky. llvm-svn: 216717
* Fix a logic bug in x86 vector codegen: sext (zext (x) ) != sext (x) (PR20472).Sanjay Patel2014-08-281-25/+11
| | | | | | | | | | | | | | | | | Remove a block of code from LowerSIGN_EXTEND_INREG() that was added with: http://llvm.org/viewvc/llvm-project?view=revision&revision=177421 And caused: http://llvm.org/bugs/show_bug.cgi?id=20472 (more analysis here) http://llvm.org/bugs/show_bug.cgi?id=18054 The testcases confirm that we (1) don't remove a zext op that is necessary and (2) generate a pmovz instead of punpck if SSE4.1 is available. Although pmovz is 1 byte longer, it allows folding of the load, and so saves 3 bytes overall. Differential Revision: http://reviews.llvm.org/D4909 llvm-svn: 216679
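The inequality in the title, sext(zext(x)) != sext(x), can be checked with scalar integers. The sketch below is only an illustration and uses i8, i16 and i32 widths chosen for convenience; the function names are hypothetical and the actual bug involved vector types.

```cpp
#include <cassert>
#include <cstdint>

// zext i8 -> i16, then sext i16 -> i32.
std::int32_t sextOfZext(std::int8_t x) {
  std::uint16_t widened = static_cast<std::uint8_t>(x); // zero-extend
  return static_cast<std::int16_t>(widened);            // sign-extend
}

// sext i8 -> i32 directly.
std::int32_t sextOnly(std::int8_t x) { return x; }

// The two agree for non-negative inputs and differ for negative ones,
// so dropping the zext is not a valid simplification.
void checkSextZextDiffer() {
  assert(sextOfZext(5) == sextOnly(5));
  assert(sextOfZext(-1) == 255);
  assert(sextOnly(-1) == -1);
}
```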
* [x86] Fix whitespace and formatting around this function withChandler Carruth2014-08-281-4/+5
| | | | | | clang-format, no functionality changed. llvm-svn: 216646
* [x86] Hoist conditions from *every single if* in this routine toChandler Carruth2014-08-281-12/+12
| | | | | | | | | | | a single early exit. And factor the subsequent cast<> from all but one block into a single variable. No functionality changed. llvm-svn: 216645
* [x86] Inline an SSE4 helper function for INSERT_VECTOR_ELT lowering, noChandler Carruth2014-08-281-58/+45
| | | | | | | | | | functionality changed. Separating this into two functions wasn't helping. There was a decent amount of boilerplate duplicated, and some subsequent refactorings here will pull even more common code out. llvm-svn: 216644
* Fix unaligned reads/writes in X86JIT and RuntimeDyldELF.Alexey Samsonov2014-08-271-23/+39
| | | | | | | | | | | | | | | | Summary: Introduce support::ulittleX_t::ref type to Support/Endian.h and use it in x86 JIT to enforce correct endianness and fix unaligned accesses. Test Plan: regression test suite Reviewers: lhames Subscribers: ributzka, llvm-commits Differential Revision: http://reviews.llvm.org/D5011 llvm-svn: 216631
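A minimal sketch of the underlying technique, reading and writing a little-endian 32-bit value at an arbitrary (possibly unaligned) address, is shown below. It does not use the `support::ulittleX_t::ref` type mentioned above; `readLE32` and `writeLE32` are hypothetical helpers written byte by byte so that no aligned access or host endianness is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Assemble a 32-bit value from four bytes, least significant first.
std::uint32_t readLE32(const unsigned char *p) {
  assert(p != nullptr);
  return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
         (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Store a 32-bit value as four bytes, least significant first.
void writeLE32(unsigned char *p, std::uint32_t v) {
  assert(p != nullptr);
  for (int i = 0; i < 4; ++i)
    p[i] = static_cast<unsigned char>(v >> (8 * i));
}

// Write then read back at an odd offset, where a direct
// *reinterpret_cast<uint32_t *>(p) access would be misaligned.
std::uint32_t roundTripAtOddOffset(std::uint32_t v) {
  unsigned char buf[8] = {};
  writeLE32(buf + 1, v);
  return readLE32(buf + 1);
}

// Return the first byte stored, to show the little-endian layout.
unsigned char firstByteWritten(std::uint32_t v) {
  unsigned char buf[8] = {};
  writeLE32(buf + 3, v);
  return buf[3];
}
```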
* typo in commentSanjay Patel2014-08-271-1/+1
| | | | llvm-svn: 216609
* X86 MC: Handle instructions like fxsave that match multiple operand sizesReid Kleckner2014-08-271-8/+18
| | | | | | | | | | | | | | | | Instructions like 'fxsave' and control flow instructions like 'jne' match any operand size. The loop I added to the Intel syntax matcher assumed that using a different size would give a different instruction. Now it handles the case where we get the same instruction for different memory operand sizes. This also allows us to remove the hack we had for unsized absolute memory operands, because we can successfully match things like 'jnz' without reporting ambiguity. Removing this hack uncovered a test case involving 'fadd' that was ambiguous. The memory operand could have been single or double precision. llvm-svn: 216604
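The matching rule described here can be sketched with a toy table. Everything in this example is hypothetical (the table, the opcode numbers, the list of sizes tried, and the function names); it only models the rule that several sizes matching the same instruction is a success, while sizes matching different instructions is an ambiguity.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <set>
#include <string>
#include <utility>

enum class MatchResult { Success, Ambiguous, NoMatch };

// Toy table: (mnemonic, memory operand size in bits) -> made-up opcode id.
using Table = std::map<std::pair<std::string, unsigned>, unsigned>;

// Try every candidate size for an unsized memory operand and collect the
// distinct instructions that match.
MatchResult matchUnsized(const Table &table, const std::string &mnemonic,
                         unsigned &opcode) {
  assert(!mnemonic.empty());
  std::set<unsigned> distinct;
  for (unsigned size : {8u, 16u, 32u, 64u, 80u}) {
    auto it = table.find(std::make_pair(mnemonic, size));
    if (it != table.end())
      distinct.insert(it->second);
  }
  if (distinct.empty())
    return MatchResult::NoMatch;
  if (distinct.size() > 1)
    return MatchResult::Ambiguous; // different instructions per size
  opcode = *distinct.begin();      // same instruction for every size
  return MatchResult::Success;
}

// Classify a mnemonic against a small example table.
MatchResult classify(const std::string &mnemonic) {
  Table table;
  for (unsigned size : {8u, 16u, 32u, 64u, 80u})
    table[std::make_pair(std::string("fxsave"), size)] = 100; // size-agnostic
  table[std::make_pair(std::string("fadd"), 32u)] = 200; // single precision
  table[std::make_pair(std::string("fadd"), 64u)] = 201; // double precision
  unsigned opcode = 0;
  return matchUnsized(table, mnemonic, opcode);
}
```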
* Clang-format over X86AsmInstrumentation.* with LLVM style.Evgeniy Stepanov2014-08-272-129/+132
| | | | | | r216536 mistakenly used -style=Google instead of LLVM. llvm-svn: 216543
* [x86] Fix a regression introduced with r213897 for 32-bit targets whereChandler Carruth2014-08-271-4/+2
| | | | | | | | | | | | we stopped efficiently lowering sextload using the SSE41 instructions for that operation. This is a consequence of a bad predicate I used thinking of the memory access needs. The code actually handles the cases where the predicate doesn't apply, and handles them much better. =] Simple fix and a test case added. Fixes PR20767. llvm-svn: 216538
* [SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine.Chandler Carruth2014-08-271-12/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This combine is essentially combining target-specific nodes back into target independent nodes that it "knows" will be combined yet again by a target independent DAG combine into a different set of target-independent nodes that are legal (not custom though!) and thus "ok". This seems... deeply flawed. The crux of the problem is that we don't combine un-legalized shuffles that are introduced by legalizing other operations, and thus we don't see a very profitable combine opportunity. So the backend just forces the input to that combine to re-appear. However, for this to work, the conditions detected to re-form the unlegalized nodes must be *exactly* right. Previously, failing this would have caused poor code (if you're lucky) or a crasher when we failed to select instructions. After r215611 we would fall back into the legalizer. In some cases, this just "fixed" the crasher by producing bad code. But in the test case added it caused the legalizer and the dag combiner to iterate forever. The fix is to make the alignment checking in the x86 side of things match the alignment checking in the generic DAG combine exactly. This isn't really a satisfying or principled fix, but it at least makes the code work as intended. It also highlights that it would be nice to detect the availability of under aligned loads for a given type rather than bailing on this optimization. I've left a FIXME to document this. Original commit message for r215611 which covers the rest of the change: [SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it. In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. 
By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it *stops* being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for. It took many million runs of the shuffle fuzz tester to find this. llvm-svn: 216537
* Clang-format over X86AsmInstrumentation.*.Evgeniy Stepanov2014-08-272-183/+216
| | | | llvm-svn: 216536
* [SKX] Added new versions of cmp instructions in avx512_icmp_cc multiclass, added VL multiclass.Robert Khasanov2014-08-271-34/+148
| | | | | | | | Added encoding tests llvm-svn: 216532
* AVX-512: Added intrinsic for VMOVSS store form with mask.Elena Demikhovsky2014-08-271-0/+10
| | | | llvm-svn: 216530
* MC: Split the x86 asm matcher implementations by dialectReid Kleckner2014-08-262-33/+198
| | | | | | | | | | | | | | | | | | | The existing matcher has lots of AT&T assembly dialect assumptions baked into it. In particular, the hack for resolving the size of a memory operand by appending the four most common suffixes doesn't work at all. The Intel assembly dialect mnemonic table has ambiguous entries, so we need to try matching multiple times with different operand sizes, since that's the only way to choose different instruction variants. This makes us more compatible with gas's implementation of Intel assembly syntax. MSVC assumes you want byte-sized operations for the instructions that we reject as ambiguous. Reviewed By: grosbach Differential Revision: http://reviews.llvm.org/D4747 llvm-svn: 216481
* [x86] Fix a bug in r216319 where I was missing a 'break'.Chandler Carruth2014-08-251-0/+2
| | | | | | | | | | | This actually was caught by existing tests but those tests were disabled with an XFAIL because of PR20736. While working on fixing that, I noticed the test failure, and tracked it down to this. We even have a really nice Clang warning that would have caught this but it isn't enabled in LLVM! =[ I may look at enabling it. llvm-svn: 216391
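The class of bug described, a `switch` case that silently falls through because a `break` is missing, looks roughly like the sketch below. The enum and functions are hypothetical and unrelated to the actual code in r216319; the warning alluded to is presumably along the lines of Clang's `-Wimplicit-fallthrough`, which is an assumption here.

```cpp
#include <cassert>

enum class Kind { A, B };

// Buggy version: case A falls through into case B, so A's result is lost.
int costBuggy(Kind k) {
  int cost = 0;
  switch (k) {
  case Kind::A:
    cost = 1;
    // missing 'break' here: falls through
  case Kind::B:
    cost = 2;
    break;
  default:
    assert(false && "unknown Kind");
    break;
  }
  return cost;
}

// Fixed version: each case ends with its own 'break'.
int costFixed(Kind k) {
  int cost = 0;
  switch (k) {
  case Kind::A:
    cost = 1;
    break;
  case Kind::B:
    cost = 2;
    break;
  default:
    assert(false && "unknown Kind");
    break;
  }
  return cost;
}
```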
* [SKX] avx512_icmp_packed multiclass extensionRobert Khasanov2014-08-251-27/+173
| | | | | | | | | | | Extended avx512_icmp_packed multiclass by masking versions. Added avx512_icmp_packed_rmb multiclass for embedded broadcast versions. Added corresponding _vl multiclasses. Added encoding tests for CPCMP{EQ|GT}* instructions. Add more fields for X86VectorVTInfo. Added AVX512VLVectorVTInfo that includes X86VectorVTInfo for 512/256/128-bit versions Differential Revision: http://reviews.llvm.org/D5024 llvm-svn: 216383
* Allow vectorization of division by uniform power of 2.Karthik Bhat2014-08-251-4/+27
| | | | | | | | This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates Cost model for Loop and SLP Vectorizer. The cost table is currently only updated for X86 backend. Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971) llvm-svn: 216371
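Division by a uniform power of 2 is cheap because it can be rewritten as an add and a shift applied identically to every element. The sketch below shows the usual scalar rewrite for signed 32-bit values; the function names are hypothetical, and it assumes `>>` on a negative value is an arithmetic shift (guaranteed from C++20, and the common behaviour before that).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Signed division by 2^k, rounding toward zero like the C++ '/' operator.
std::int32_t sdivPow2(std::int32_t x, unsigned k) {
  assert(k >= 1 && k <= 30);
  // Negative values need a bias of 2^k - 1 so the shift rounds toward zero.
  std::int32_t bias = (x < 0) ? ((std::int32_t(1) << k) - 1) : 0;
  return (x + bias) >> k;
}

// Every element is divided by the same power of 2 (a "uniform" divisor),
// so the same shift amount applies to the whole loop.
void divideAll(std::int32_t *data, std::size_t n, unsigned k) {
  for (std::size_t i = 0; i < n; ++i)
    data[i] = sdivPow2(data[i], k);
}

// Compare the rewrite against plain division over a small range.
bool matchesPlainDivision(unsigned k) {
  for (std::int32_t x = -1000; x <= 1000; ++x)
    if (sdivPow2(x, k) != x / (std::int32_t(1) << k))
      return false;
  return true;
}
```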
* Use range based for loops to avoid needing to re-mention SmallPtrSet size.Craig Topper2014-08-241-4/+2
| | | | llvm-svn: 216351
* X86 intrinsics table - simplifies intrinsics lowering.Elena Demikhovsky2014-08-242-456/+288
| | | | | | The tables are initialized when X86TargetLowering object is created. llvm-svn: 216345
* [x86] Start fixing a really subtle and terrible form of miscompile inChandler Carruth2014-08-231-28/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | these DAG combines. The DAG auto-CSE thing is truly terrible. Due to it, when RAUW-ing a node with its operand, you can cause its uses to CSE to itself, which then causes their uses to become your uses which causes them to be picked up by the RAUW. For nodes that are determined to be "no-ops", this is "fine". But if the RAUW is one of several steps to enact a transformation, this causes the DAG to really silently eat and discard nodes that you would never expect. It took days for me to actually pinpoint a test case triggering this and a really frustrating amount of time to even comprehend the bug because I never even thought about the ability of RAUW to iteratively consume nodes due to CSE-ing them into itself. To fix this, we have to build up a brand-new chain of operations any time we are combining across (potentially) intervening nodes. But once the logic is added to do this, another issue surfaces: CombineTo eagerly deletes the one node combined, *but no others*. This is... really frustrating. If deleting it makes its operands become dead, those operand nodes often won't go onto the worklist in the order you would want -- they're already on it and not near the top. That means things higher on the worklist will get combined prior to these dead nodes being GCed out of the worklist, and if the chain is long, the immediate users won't be enough to re-detect where the root of the chain is that became single-use again after deleting the dead nodes. The better way to do this is to never immediately delete nodes, and instead to just enqueue them so we can recursively delete them. The combined-from node is typically not on the worklist anyways by virtue of having been popped off.... But that in turn breaks other tests that *require* CombineTo to delete unused nodes. :: sigh :: Fortunately, there is a better way. 
This whole routine should have been returning the replacement rather than using CombineTo which is quite hacky. Switch to that, and all the pieces fall together. I suspect the same kind of miscompile is possible in the half-shuffle folding code, and potentially the recursive folding code. I'll be switching those over to a pattern more like this one for safety's sake even though I don't immediately have any test cases for them. Note that the only way I got a test case for this instance was with *heavily* DAG combined 256-bit shuffle sequences generated by my fuzzer. ;] llvm-svn: 216319
* ARM / x86_64 varargs: Don't save regparms in prologue without va_startReid Kleckner2014-08-221-2/+3
| | | | | | | | | | | | There's no need to do this if the user doesn't call va_start. In the future, we're going to have thunks that forward these register parameters with musttail calls, and they won't need these spills for handling va_start. Most of the test suite changes are adding va_start calls to existing tests to keep things working. llvm-svn: 216294
* Revert "X86: Align the stack on word boundaries in LowerFormalArguments()"Duncan P. N. Exon Smith2014-08-211-1/+0
| | | | | | | | | | | | | This (mostly) reverts commit r216119. Somewhere during the review Reid committed r214980 which fixed this another way, and I neglected to check that the testcase still failed before committing. I've left test/CodeGen/X86/aligned-variadic.ll around in case it adds extra coverage. llvm-svn: 216246
* Minor refactor to make applying patches from 'Add a "probe-stack" attribute' review thread out of order easier.Philip Reames2014-08-211-1/+5
| | | | | | llvm-svn: 216241
* Whitespace change to reduce diff in future patch.Philip Reames2014-08-211-6/+6
| | | | | | | | Patch 2 of 11 in 'Add a "probe-stack" attribute' review thread Patch by: john.kare.alsaker@gmail.com llvm-svn: 216235
* [X86] Split out the logic to select the stack probe function (NFC)Philip Reames2014-08-212-11/+25
| | | | | | | | Patch 1 of 11 in 'Add a "probe-stack" attribute' review thread. Patch by: <john.kare.alsaker@gmail.com> llvm-svn: 216233
* [AVX512] Add class to group common template arguments related to vector typeAdam Nemet2014-08-211-18/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | We discussed the issue of generality vs. readability of the AVX512 classes recently. I proposed this approach to try to hide and centralize the mappings we commonly perform based on the vector type. A new class X86VectorVTInfo captures these. The idea is to pass an instance of this class to classes/multiclasses instead of the corresponding ValueType. Then the class/multiclass can use its field for things that derive from the type rather than passing all those as separate arguments. I modified avx512_valign to demonstrate this new approach. As you can see instead of 7 related template parameters we now have one. The downside is that we have to refer to fields for the derived values. I named the argument '_' in order to make this as invisible as possible. Please let me know if you absolutely hate this. (Also once we allow local initializations in multiclasses we can recover the original version by assigning the fields to local variables.) Another possible use-case for this class is to directly map things, e.g.: RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC llvm-svn: 216209
* X86AsmPrinter MCJIT MSVC bug fix.Josh Klontz2014-08-211-6/+7
| | | | | | | | | | | | | | | | | Summary: This bug was introduced in r213006 which makes an assumption that MCSection is COFF for Windows MSVC. This assumption is broken for MCJIT users where ELF is used instead [1]. The fix is to change the MCSection cast to a dyn_cast. [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068407.html. Reviewers: majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4872 llvm-svn: 216173
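The difference between an unchecked cast that assumes the section type and a checked cast that tolerates other types can be sketched with standard C++. LLVM's `cast<>` and `dyn_cast<>` use their own RTTI scheme; the example below uses `dynamic_cast` as a stand-in, and the `Section` types and the `characteristics` field are hypothetical.

```cpp
#include <cassert>

struct Section {
  virtual ~Section() = default;
};
struct SectionCOFF : Section {
  unsigned characteristics = 0x20;
};
struct SectionELF : Section {};

// Analogue of an unchecked cast: asserts the type in debug builds and is
// simply wrong when handed a non-COFF section.
const SectionCOFF &asCoffUnchecked(const Section &s) {
  assert(dynamic_cast<const SectionCOFF *>(&s) && "not a COFF section");
  return static_cast<const SectionCOFF &>(s);
}

// Analogue of a checked cast: returns a fallback for non-COFF sections,
// for example ELF sections used by a JIT on a Windows host.
unsigned coffCharacteristicsOr(const Section &s, unsigned fallback) {
  if (const auto *coff = dynamic_cast<const SectionCOFF *>(&s))
    return coff->characteristics;
  return fallback;
}
```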
* X86: Turn redundant if into an assertion.Benjamin Kramer2014-08-211-7/+5
| | | | | | While there remove noop casts. llvm-svn: 216168
* [x86] Added _addcarry_ and _subborrow_ intrinsicsRobert Khasanov2014-08-211-1/+9
| | | | llvm-svn: 216164
* [x86] SMAP: added HasSMAP attribute for CLAC/STAC, corrected attributesRobert Khasanov2014-08-211-1/+1
| | | | llvm-svn: 216163
* [x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM.Robert Khasanov2014-08-212-21/+52
| | | | llvm-svn: 216162