path: root/llvm/lib/Target/X86/X86InstrCompiler.td
Commit message | Author | Age | Files | Lines
...
* x86: NFC remove needless InstrCompiler cast | JF Bastien | 2015-08-05 | 1 | -15/+15
  Summary: The casts from String to PatFrag weren't needed if we instead provided an SDNode. This fix was suggested by @pete in D11382.
  Subscribers: pete, llvm-commits
  Differential Revision: http://reviews.llvm.org/D11788
  llvm-svn: 244167
* x86 atomic: optimize a.store(reg op a.load(acquire), release) | JF Bastien | 2015-08-05 | 1 | -19/+56
  Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations.
  Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D11382
  llvm-svn: 244128
* Use small encodings for constants when possible. | Rafael Espindola | 2015-07-17 | 1 | -3/+3
  llvm-svn: 242493
* Avoid a Symbol -> Name -> Symbol conversion. | Rafael Espindola | 2015-06-22 | 1 | -2/+14
  Before this we were producing a TargetExternalSymbol from a MCSymbol. That meant extracting the symbol name and fetching the symbol again down the pipeline.
  This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the DAG.
  Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900, allowing r240130 to be committed again.
  llvm-svn: 240300
* AVX-512: fixed algorithm of building vectors of i1 elements | Elena Demikhovsky | 2015-05-20 | 1 | -4/+5
  Fixed extract/insert of i1 elements. Loads and zero-extending loads of i1 should use "and $1, %reg" to prevent loading garbage. Added a bunch of new tests.
  llvm-svn: 237793
* AVX-512: select operation for i1 vectors | Elena Demikhovsky | 2015-05-12 | 1 | -0/+4
  For example: select i1 %cond, <16 x i1> %a, <16 x i1> %b. I added pseudo-CMOV patterns to resolve the "select". Added tests for KNL and SKX.
  llvm-svn: 237106
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" | Sergey Dmitrouk | 2015-04-28 | 1 | -2/+2
  [DebugInfo] Add debug locations to constant SD nodes
  This adds debug locations to constant nodes of the Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
  Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything; instead they just check SDNodes, ARM and AArch64, where it's easy to get incorrect locations on constants.
  This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). This is a slightly different issue, though, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes" | Daniel Jasper | 2015-04-28 | 1 | -2/+2
  This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870
  llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodes | Sergey Dmitrouk | 2015-04-28 | 1 | -2/+2
  This adds debug locations to constant nodes of the Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
  Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything; instead they just check SDNodes, ARM and AArch64, where it's easy to get incorrect locations on constants.
  This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). This is a slightly different issue, though, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235977
* [X86] Apply AddedComplexity consistently for similar patterns. This keeps them together in the DAGISel tables and reduces table size slightly. | Craig Topper | 2015-04-04 | 1 | -4/+8
  llvm-svn: 234086
* [X86] Add a comment about the change in r234075. | Craig Topper | 2015-04-04 | 1 | -0/+2
  llvm-svn: 234079
* [X86] Don't use GR64 register 'and with immediate' instructions if the immediate is zero in the upper 33-bits or upper 57-bits. Use GR32 instructions instead. | Craig Topper | 2015-04-04 | 1 | -0/+5
  Previously the patterns didn't have high enough priority and we would only use the GR32 form if only the upper 32 or 56 bits were zero.
  Fixes PR23100.
  llvm-svn: 234075
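  The equivalence behind this change can be sketched in C++. This is only an illustration, not code from the patch: the helper names `and64`, `and32Zext` and `upper33Zero` are hypothetical. It assumes the x86-64 rule that writing a 32-bit register zero-extends into the full 64-bit register, so when the immediate has no bits set above bit 30 the 32-bit AND yields the same 64-bit value.

```cpp
#include <cassert>
#include <cstdint>

// 64-bit AND, modelling what the GR64 instruction form computes.
inline std::uint64_t and64(std::uint64_t x, std::uint64_t imm) {
    return x & imm;
}

// 32-bit AND on the low halves. On x86-64 a 32-bit register write
// zero-extends into the upper 32 bits, modelled here by the widening cast.
inline std::uint64_t and32Zext(std::uint64_t x, std::uint64_t imm) {
    std::uint32_t lo = static_cast<std::uint32_t>(x) &
                       static_cast<std::uint32_t>(imm);
    return static_cast<std::uint64_t>(lo);
}

// True when the upper 33 bits of the immediate are all zero.
inline bool upper33Zero(std::uint64_t imm) {
    return (imm >> 31) == 0;
}
```

  For any immediate where `upper33Zero` holds, `and64` and `and32Zext` return the same value, which is why the shorter GR32 encoding is safe to prefer.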
* [X86] Factor out the CMOV pseudo definitions. NFCI. | Ahmed Bougacha | 2015-02-14 | 1 | -125/+43
  llvm-svn: 229206
* MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros | Benjamin Kramer | 2015-02-12 | 1 | -2/+6
  Update all callers.
  llvm-svn: 228930
* [X86] Convert esp-relative movs of function arguments to pushes, step 2 | Michael Kuperstein | 2015-02-01 | 1 | -5/+9
  This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win.
  (Re-commit of r227728)
  Differential Revision: http://reviews.llvm.org/D6789
  llvm-svn: 227752
* Revert r227728 due to bad line endings. | Michael Kuperstein | 2015-02-01 | 1 | -1852/+1848
  llvm-svn: 227746
* [X86] Convert esp-relative movs of function arguments to pushes, step 2 | Michael Kuperstein | 2015-02-01 | 1 | -1848/+1852
  This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win.
  Differential Revision: http://reviews.llvm.org/D6789
  llvm-svn: 227728
* [x32] Change the condition from bitness to LP64 for TCRETURNdi64. | Michael Kuperstein | 2015-01-28 | 1 | -2/+2
  TCRETURNmi64, which was mistakenly changed in r227307, will wait for another day.
  llvm-svn: 227317
* [x32] Enable sibcall optimization on x32. | Michael Kuperstein | 2015-01-28 | 1 | -5/+5
  This includes two things:
  1) Fix TCRETURNdi and TCRETURN64di patterns to check the right thing (LP64 as opposed to target bitness).
  2) Allow LEA64_32 in MatchingStackOffset.
  llvm-svn: 227307
* Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics | Reid Kleckner | 2015-01-13 | 1 | -0/+3
  These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block.
  Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function.
  These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation.
  Reviewers: echristo, andrew.w.kaylor
  Differential Revision: http://reviews.llvm.org/D6493
  llvm-svn: 225746
* [X86] Make isel select the 2-byte register form of INC/DEC even in non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering. | Craig Topper | 2015-01-06 | 1 | -29/+12
  Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness.
  llvm-svn: 225252
* [X86] Use 32-bit sign extended immediate for 64-bit LOCK_ArithBinOp with sign extended immediate. | Craig Topper | 2015-01-03 | 1 | -6/+6
  llvm-svn: 225098
* Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. | Craig Topper | 2014-11-26 | 1 | -2/+2
  llvm-svn: 222801
* [X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext | Michael Kuperstein | 2014-11-11 | 1 | -0/+1
  This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details.
  Recommitting - This time, with a hopefully working test.
  Differential Revision: http://reviews.llvm.org/D6128
  llvm-svn: 221672
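  A minimal C++ sketch of the semantics at stake, using a hypothetical helper name `truncThenZext` (it is not part of the patch): truncating a 64-bit value to 32 bits and zero-extending it back must clear the high 32 bits, even when the original value is known to be sign-extended from 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// trunc to i32 followed by zext to i64: the high 32 bits of the
// result must be zero regardless of what the input held there.
inline std::uint64_t truncThenZext(std::int64_t v) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}
```

  For a negative input such as -1, whose 64-bit form is a sign extension of the 32-bit value, the result is 0x00000000FFFFFFFF and not the original bit pattern, so the sequence cannot be treated as a no-op.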
* Reverting r221626 due to a too-strict test. | Michael Kuperstein | 2014-11-10 | 1 | -1/+0
  llvm-svn: 221629
* [X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext | Michael Kuperstein | 2014-11-10 | 1 | -0/+1
  This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details.
  Differential Revision: http://reviews.llvm.org/D6128
  llvm-svn: 221626
* [X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load()) | Robin Morisset | 2014-10-08 | 1 | -2/+2
  Summary: I had forgotten to check for NotSlowIncDec in the patterns that can generate inc/dec for the above pattern (added in D4796). This currently applies to Atom Silvermont, KNL and SKX.
  Test Plan: New checks on atomic_mi.ll
  Reviewers: jfb, nadav
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5677
  llvm-svn: 219336
* [x32] Fix segmented stacks support | Pavel Chupin | 2014-09-22 | 1 | -5/+5
  Summary: Update segmented-stacks*.ll tests with x32 target case and make corresponding changes to make them pass.
  Test Plan: tests updated with x32 target
  Reviewers: nadav, rafael, dschuff
  Subscribers: llvm-commits, zinovy.nis
  Differential Revision: http://reviews.llvm.org/D5245
  llvm-svn: 218247
* [X86] Allow atomic operations using immediates to avoid using a register | Robin Morisset | 2014-09-02 | 1 | -13/+94
  The only valid lowering of atomic stores in the X86 backend was mov from register to memory. As a result, storing an immediate required a useless copy of the immediate in a register. Now these can be compiled as a simple mov.
  Similarly, adding/and-ing/or-ing/xor-ing an immediate to an atomic location (but through an atomic_store/atomic_load, not a fetch_whatever intrinsic) can now make use of an 'add $imm, x(%rip)' instead of using a register. And the same applies to inc/dec.
  This second point matches the first issue identified in http://llvm.org/bugs/show_bug.cgi?id=17281
  llvm-svn: 216980
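  The source-level shapes this message describes might look like the following C++ sketch. The function names are hypothetical, and whether a given compiler version actually emits a memory-immediate instruction for them depends on the target and flags, so treat this as an illustration of the input pattern only.

```cpp
#include <atomic>
#include <cassert>

// Storing a constant: the immediate does not need to pass through a register.
inline void storeImm(std::atomic<int>& x) {
    x.store(42, std::memory_order_release);
}

// Adding an immediate through a separate atomic load and atomic store.
// This is NOT an atomic read-modify-write (it is not fetch_add); another
// thread may write between the load and the store.
inline void addImm(std::atomic<int>& x) {
    x.store(x.load(std::memory_order_acquire) + 3, std::memory_order_release);
}

// The same shape with +1 is the candidate for inc.
inline void increment(std::atomic<int>& x) {
    x.store(x.load(std::memory_order_acquire) + 1, std::memory_order_release);
}
```

  Calling `storeImm`, `addImm` and `increment` in turn on one `std::atomic<int>` leaves it holding 46 in single-threaded use.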
* Fix failure to invoke exception handler on Win64 | Reid Kleckner | 2014-08-04 | 1 | -0/+2
  When the last instruction prior to a function epilogue is a call, we need to emit a nop so that the return address is not in the epilogue IP range. This is consistent with MSVC's behavior, and may be a workaround for a bug in the Win64 unwinder.
  Differential Revision: http://reviews.llvm.org/D4751
  Patch by Vadim Chugunov!
  llvm-svn: 214775
* [X86] Simplify X87 stackifier pass. | Akira Hatanaka | 2014-08-01 | 1 | -2/+4
  Stop using ST registers for function returns and inline-asm instructions and use FP registers instead. This allows removing a large amount of code in the stackifier pass that was needed to track register liveness and handle copies between ST and FP registers and function calls returning floating point values.
  It also fixes a bug which manifests when an ST register defined by an inline-asm instruction was live across another inline-asm instruction, as shown in the following sequence of machine instructions:
    1. INLINEASM <es:frndint> $0:[regdef], %ST0<imp-def,tied5>
    2. INLINEASM <es:fldcw $0>
    3. %FP0<def> = COPY %ST0
  <rdar://problem/16952634>
  llvm-svn: 214580
* Revert r213070. It's breaking the build in MCELFStreamer::EmitInstToData(...). | Cameron McInally | 2014-07-15 | 1 | -6/+0
  llvm-svn: 213073
* Add x86 patterns to match a specific add-with-carry. | Cameron McInally | 2014-07-15 | 1 | -0/+6
  llvm-svn: 213070
* X86: expand atomics in IR instead of as MachineInstrs. | Tim Northover | 2014-07-01 | 1 | -77/+0
  The logic for expanding atomics that aren't natively supported in terms of cmpxchg loops is much simpler to express at the IR level. It also allows the normal optimisations and CodeGen improvements to help out with atomics, instead of using a limited set of possible instructions.
  rdar://problem/13496295
  llvm-svn: 212119
* Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter. | NAKAMURA Takumi | 2014-06-25 | 1 | -0/+20
  --
  This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment.
  Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer.
  Patch by Vadim Chugunov!
  Reviewed By: rnk
  Differential Revision: http://reviews.llvm.org/D4081
  llvm-svn: 211691
* Reformat. | NAKAMURA Takumi | 2014-06-25 | 1 | -2/+2
  llvm-svn: 211689
* Revert r211399, "Generate native unwind info on Win64" | NAKAMURA Takumi | 2014-06-22 | 1 | -22/+2
  It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64.
  llvm-svn: 211480
* Generate native unwind info on Win64 | Reid Kleckner | 2014-06-20 | 1 | -2/+22
  This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment.
  Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer.
  Patch by Vadim Chugunov!
  Reviewed By: rnk
  Differential Revision: http://reviews.llvm.org/D4081
  llvm-svn: 211399
* [X86] Use ADD/SUB instead of INC/DEC for Silvermont | Alexey Volkov | 2014-06-09 | 1 | -12/+26
  According to the Intel Software Optimization Manual, on Silvermont the INC and DEC instructions require an additional uop to merge the flags. As a result, a branch instruction depending on an INC or a DEC instruction incurs a 1 cycle penalty.
  Differential Revision: http://reviews.llvm.org/D3990
  llvm-svn: 210466
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has been inappropriate since it lost its Mask parameter in r154011. | Jay Foad | 2014-05-14 | 1 | -2/+2
  llvm-svn: 208811
* [X86] Add peephole for masked rotate amount | Adam Nemet | 2014-03-12 | 1 | -0/+2
  Extend what's currently done for shift because the HW performs this masking implicitly:
    (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y)
  I use the newly factored out multiclass that was only supporting shifts so far.
  For testing I extended my testcase for the new rotation idiom.
  <rdar://problem/15295856>
  llvm-svn: 203718
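  The reasoning behind the peephole can be modelled in C++ as below. This is a sketch under stated assumptions: `rotl32` is a hypothetical software model of a 32-bit rotate whose count is reduced modulo 32, as the message says the hardware does implicitly, and `rotl32Masked` is the source-level form with the explicit `and y, 31`.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit rotate-left whose count is implicitly reduced
// modulo 32. The n == 0 case avoids an undefined shift by 32 bits.
inline std::uint32_t rotl32(std::uint32_t x, std::uint32_t n) {
    n &= 31;
    return n == 0 ? x : (x << n) | (x >> (32 - n));
}

// Source-level form with an explicit mask on the rotate amount.
inline std::uint32_t rotl32Masked(std::uint32_t x, std::uint32_t y) {
    return rotl32(x, y & 31);
}
```

  Because the model already reduces the count modulo 32, `rotl32Masked(x, y)` and `rotl32(x, y)` agree for every `y`, which is the property that makes the explicit `and` redundant.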
* [X86] Refactor peepholes for masked shift amount into a multiclass | Adam Nemet | 2014-03-12 | 1 | -55/+25
  The peephole (shift x, (and y, 31)) -> (shift x, y) is repeated for each integer type and each shift variant. To improve this a new multiclass is added that covers all integer types. The shift patterns are now instantiated from this.
  I am planning to add new instances for rotates as well.
  No functional change intended:
  * test/CodeGen/X86/shift-and.ll provides coverage
  * Compared the expanded tablegen output and matched up the defs for these Pat<>s before and after
  llvm-svn: 203685
* X86: Enable ISel of 16-bit MOVBE instructions. | Jim Grosbach | 2014-03-11 | 1 | -0/+6
  When the MOVBE instructions are available, use them for 16-bit endian swapping as well as for 32 and 64 bit. The patterns were already present on the instructions, but weren't being matched because the operation was unconditionally marked to 'Expand.' Change that to be conditional on whether the MOVBE instructions are available. Use 'rolw' to implement the in-register version (32 and 64 bit have the dedicated 'bswap' instruction for that).
  Patch by Louis Gerbarg <lgg@apple.com>.
  rdar://15479984
  llvm-svn: 203524
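  Why a 16-bit rotate can stand in for a byte swap is easy to check in C++. The helpers `bswap16` and `rotl16` below are hypothetical names for illustration; the rotate count of 8 is an assumption about how 'rolw' would be used here, since the message does not state the count.

```cpp
#include <cassert>
#include <cstdint>

// Byte swap of a 16-bit value, written out byte by byte.
inline std::uint16_t bswap16(std::uint16_t v) {
    std::uint16_t lo = static_cast<std::uint16_t>(v & 0xFF);
    std::uint16_t hi = static_cast<std::uint16_t>(v >> 8);
    return static_cast<std::uint16_t>((lo << 8) | hi);
}

// 16-bit rotate-left; the cast truncates the promoted int result.
inline std::uint16_t rotl16(std::uint16_t v, unsigned n) {
    n &= 15;
    return n == 0 ? v
                  : static_cast<std::uint16_t>((v << n) | (v >> (16 - n)));
}
```

  Rotating a 16-bit value by 8 exchanges its two bytes, so `rotl16(v, 8)` equals `bswap16(v)` for every `v`.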
* Merge x86 HasOpSizePrefix/HasOpSize16Prefix into a 2-bit OpSize field with 0 meaning no 0x66 prefix in any mode. | Craig Topper | 2014-02-02 | 1 | -20/+20
  Rename Opsize16->OpSize32 and OpSize->OpSize16. The classes now refer to their operand size rather than the mode in which they need a 0x66 prefix. Hopefully can merge REX_W into this as OpSize64.
  llvm-svn: 200626
* [x86] Remove OpSize16 flag from MOV32r0 | David Woodhouse | 2014-01-08 | 1 | -2/+1
  It's not a real instruction any more and doesn't need encoding information.
  llvm-svn: 198778
* [x86] Add OpSize16 to instructions that need it | David Woodhouse | 2014-01-08 | 1 | -9/+10
  This fixes the bulk of 16-bit output, and the corresponding test case x86-16.s now looks mostly like the x86-32.s test case that it was originally based on. A few irrelevant instructions have been dropped, and there are still some corner cases to be fixed in subsequent patches.
  llvm-svn: 198752
* Remove opcode from MOV32r0 that I accidentally left when I converted it to Pseudo. Remove FIXME as well. | Craig Topper | 2014-01-05 | 1 | -2/+1
  llvm-svn: 198564
* Handle MOV32r0 in expandPostRAPseudo instead of MCInst lowering. No functional change intended. | Craig Topper | 2013-12-31 | 1 | -2/+2
  llvm-svn: 198254
* [x86] Rename In32BitMode predicate to Not64BitMode | Eric Christopher | 2013-12-20 | 1 | -38/+38
  That's what it actually means, and with 16-bit support it's going to be a little more relevant since in a few corner cases we may actually want to distinguish between 16-bit and 32-bit mode (for example the bare 'push' aliases to pushw/pushl etc.)
  Patch by David Woodhouse
  llvm-svn: 197768
* Revert "Revert "Mark vastart_save_xmm_regs as changing EFLAGS"" | Duncan P. N. Exon Smith | 2013-12-17 | 1 | -2/+3
  This reverts commit r197481, recommitting r197469 with an extra fix.
  The vastart_save_xmm_regs pseudo-instruction expands to a test and a branch, so it modifies EFLAGS. Mark it so, or else the scheduler might place it in the middle of another test+branch.
  This fixes a bug exposed by r192750, which changed the initial scheduler to source-order as part of enabling the MI Scheduler for X86.
  This re-commit changes the VASTART_SAVE_XMM_REGS custom inserter not to try to save %flags, and adds a test that catches the bad behavior of r197469.
  <rdar://problem/15627766>
  llvm-svn: 197503