path: root/llvm/test/CodeGen/X86
Commit message | Author | Age | Files | Lines
* X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. | Nadav Rotem | 2012-12-07 | 3 | -5/+5
| | | | llvm-svn: 169624
* Fix a bug in the code that merges consecutive stores. | Nadav Rotem | 2012-12-06 | 1 | -0/+23
| | | | | | | Previously we did not check if loads that happen in between stores alias with the first store in the chain, only with the second store onwards. llvm-svn: 169516
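The check this commit describes can be sketched in a few lines. This is a simplified model and not LLVM's DAG combiner code: memory accesses are represented as hypothetical (offset, size) byte ranges, and `ranges_alias` and `can_merge_stores` are illustrative names. The point it shows is that a load sitting between the stores is compared against every store in the chain, the first one included.

```python
def ranges_alias(a, b):
    """Return True if two (offset, size) byte ranges overlap."""
    a_off, a_size = a
    b_off, b_size = b
    return a_off < b_off + b_size and b_off < a_off + a_size


def can_merge_stores(stores, loads_between):
    """Decide whether a chain of consecutive stores may be merged.

    stores: list of (offset, size) ranges, in program order.
    loads_between: (offset, size) ranges of loads issued between the stores.
    Every load is checked against every store, including the first one.
    """
    return not any(
        ranges_alias(load, store)
        for load in loads_between
        for store in stores
    )
```

Under this model, a load that overlaps only the first store is enough to block the merge, which is the case the original code missed.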
* Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions. | Craig Topper | 2012-12-06 | 1 | -1/+4
| | | | | | llvm-svn: 169482
* RegisterPressureTracker: fix findUseBetween to handle DebugValue | Andrew Trick | 2012-12-05 | 1 | -0/+49
| | | | llvm-svn: 169427
* RegisterPressureTracker: Track live physical register by unit. | Andrew Trick | 2012-12-05 | 1 | -0/+30
| | | | | | | | This is much simpler to reason about, more efficient, and fixes some corner cases involving implicit super-register defs. Fixed rdar://12797931. llvm-svn: 169425
* Simplified BLEND pattern matching for shuffles. | Elena Demikhovsky | 2012-12-05 | 2 | -6/+53
| | | | | | Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. llvm-svn: 169366
* Add x86 isel lowering logic to form bit test with inverted condition, e.g. x ^ -1. | Evan Cheng | 2012-12-05 | 1 | -3/+97
| | | | | | | | | Patch by David Majnemer. rdar://12755626 llvm-svn: 169339
* Use the 'count' attribute to calculate the upper bound of an array. | Bill Wendling | 2012-12-04 | 6 | -6/+6
| | | | | | | | | The count attribute is more accurate with regards to the size of an array. It also obviates the upper bound attribute in the subrange. We can also better handle an unbound array by setting the count to -1 instead of the lower bound to 1 and upper bound to 0. llvm-svn: 169312
* Add a 'count' field to the DWARF subrange. | Bill Wendling | 2012-12-04 | 5 | -5/+5
| | | | | | | | | The count field is necessary because there isn't a difference between the 'lo' and 'hi' attributes for a one-element array and a zero-element array. When the count is '0', we know that this is a zero-element array. When it's >=1, then it's a normal constant sized array. When it's -1, then the array is unbounded. llvm-svn: 169218
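The three cases listed in this commit message can be written as a small lookup. This is only an illustrative sketch of the stated convention; `classify_subrange` is a hypothetical helper, not part of LLVM or any DWARF library.

```python
def classify_subrange(count):
    """Map a subrange 'count' value to the kind of array it describes."""
    if count == -1:
        return "unbounded"
    if count == 0:
        return "zero-element"
    if count >= 1:
        return "constant-sized"
    # The commit message does not describe other negative values.
    raise ValueError(f"unexpected count: {count}")
```

With 'lo' and 'hi' alone, `int a[1]` and `int b[0]` would look the same; with the count they map to "constant-sized" and "zero-element" respectively.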
* Allow merging multiple store sequences on the same chain. | Nadav Rotem | 2012-12-02 | 1 | -0/+31
| | | | llvm-svn: 169111
* Fix an invalid regex in the test | Eli Bendersky | 2012-12-02 | 1 | -1/+1
| | | | llvm-svn: 169108
* misched: Fix RegisterPressureTracker handling of DebugVals. | Andrew Trick | 2012-12-01 | 1 | -0/+43
| | | | | | | Assertion failed: (TopRPTracker.getPos() == RegionBegin && "bad initial Top tracker"). rdar://12790302. llvm-svn: 169072
* misched: Fix the DAG builder to handle an undef operand at ExitSU. | Andrew Trick | 2012-12-01 | 1 | -0/+26
| | | | | | | Assertion failed: (VNI && "No value to read by operand") rdar://12790267. llvm-svn: 169071
* misched: Fix LiveInterval update to better handle DebugVal. | Andrew Trick | 2012-12-01 | 1 | -0/+50
| | | | | | | Assertion failed: (itr != mi2iMap.end() && "Instruction not found in maps.") rdar://12777252. llvm-svn: 169070
* misched: fix RegionBegin when DebugValues get shuffled to the top. | Andrew Trick | 2012-12-01 | 1 | -0/+85
| | | | | | | | assert (RemainingInstrs == 0 && "Instruction count mismatch!") rdar://12776937. llvm-svn: 169069
* When combining consecutive stores allow loads in between the stores, if the loads do not alias. | Nadav Rotem | 2012-11-29 | 1 | -0/+52
| | | | | | llvm-svn: 168832
* misched: Analysis that partitions the DAG into subtrees. | Andrew Trick | 2012-11-28 | 1 | -0/+68
| | | | | | | | | | | This is a simple, cheap infrastructure for analyzing the shape of a DAG. It recognizes uniform DAGs that take the shape of bottom-up subtrees, such as the included matrix multiplication example. This is useful for heuristics that balance register pressure with ILP. Two canonical expressions of the heuristic are implemented in scheduling modes: -misched-ilpmin and -misched-ilpmax. llvm-svn: 168773
* misched: better alias analysis. | Andrew Trick | 2012-11-28 | 1 | -0/+127
| | | | | | | | | | | This fixes a hole in the "cheap" alias analysis logic implemented within the DAG builder itself, regardless of whether proper alias analysis is enabled. It now handles this pattern produced by LSR+CodeGenPrepare:
    %sunkaddr1 = ptrtoint * %obj to i64
    %sunkaddr2 = add i64 %sunkaddr1, %lsr.iv
    %sunkaddr3 = inttoptr i64 %sunkaddr2 to i32*
    store i32 %v, i32* %sunkaddr3
| | | | | | | | | | | llvm-svn: 168768
* X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions when the destination register is wider than the memory load. | Manman Ren | 2012-11-27 | 1 | -0/+39
| | | | | | | | | | | These load instructions load from m32 or m64 and set the upper bits to zero, while the folded instructions may accept m128. rdar://12721174 llvm-svn: 168710
* Revert accidental commit. | Craig Topper | 2012-11-27 | 1 | -0/+42
| | | | llvm-svn: 168687
* Make PrintReg constructor explicit to prevent weird implicit conversions from accidentally being triggered. | Craig Topper | 2012-11-27 | 1 | -42/+0
| | | | | | llvm-svn: 168686
* Add test cases for r168417. | Craig Topper | 2012-11-27 | 1 | -0/+20
| | | | llvm-svn: 168681
* llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Loosen expression corresponding to r168627. | NAKAMURA Takumi | 2012-11-27 | 1 | -1/+1
| | | | | | Win32 and *bsd were affected. llvm-svn: 168651
* Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary. | Chad Rosier | 2012-11-26 | 2 | -25/+19
| | | | | | | | | | | | | | | This pass was conservative in that it always reserved the FP to enable dynamic stack realignment, which allowed the RA to use aligned spills for vector registers. This happens even when spills were not necessary. The RA has since been improved to use unaligned spills when necessary. The new behavior is to realign the stack if the frame pointer was already reserved for some other reason, but don't reserve the frame pointer just because a function contains vector virtual registers. Part of rdar://12719844 llvm-svn: 168627
* Normalize splat 256bit vectors with 8 elements. | Jakub Staszak | 2012-11-26 | 1 | -7/+7
| | | | llvm-svn: 168600
* Intel OCL built-ins calling conventions now support MacOS 32-bit. | Elena Demikhovsky | 2012-11-20 | 1 | -36/+58
| | | | llvm-svn: 168359
* Handle mixed normal and early-clobber defs on inline asm. | Jakob Stoklund Olesen | 2012-11-19 | 1 | -0/+7
| | | | | | PR14376. llvm-svn: 168320
* llvm/test/CodeGen/X86/hipe-cc*.ll: Add explicit -mcpu, or they don't expect to pass on Atom. | NAKAMURA Takumi | 2012-11-16 | 2 | -2/+2
| | | | | | llvm-svn: 168171
* Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris. | Duncan Sands | 2012-11-16 | 2 | -0/+164
| | | | llvm-svn: 168166
* Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types. | Craig Topper | 2012-11-16 | 1 | -0/+144
| | | | | | llvm-svn: 168141
* Make sure to not get AVX code on an AVX-capable host. Revealed in r167967. | Jakub Staszak | 2012-11-14 | 4 | -8/+8
| | | | llvm-svn: 167989
* llvm/test/CodeGen/X86/memset.ll: FileCheck-ize, and add another case on +avx. | NAKAMURA Takumi | 2012-11-14 | 1 | -2/+23
| | | | llvm-svn: 167975
* Force CPU in test so we don't accidentally get AVX code on an AVX-capable host. | Benjamin Kramer | 2012-11-14 | 1 | -2/+2
| | | | llvm-svn: 167973
* X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes. | Benjamin Kramer | 2012-11-14 | 4 | -22/+79
| | | | | | | | | | | | | | | | | | The stack realignment code was fixed to work when there is stack realignment and a dynamic alloca is present so this shouldn't cause correctness issues anymore. Note that this also enables generation of AVX instructions for memset under the assumptions:
    - Unaligned loads/stores are always fast on CPUs supporting AVX
    - AVX is not slower than SSE
| | | | | | | | | | | | | | | | | | We may need some tweaked heuristics if one of those assumptions turns out not to be true. Effectively reverts r58317. Part of PR2962. llvm-svn: 167967
* Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333. | Rafael Espindola | 2012-11-14 | 1 | -0/+12
| | | | llvm-svn: 167912
* Revert "Use the 'count' attribute instead of the 'upper_bound' attribute." temporarily as it is breaking the gdb bots. | Eric Christopher | 2012-11-13 | 1 | -2/+2
| | | | | | | | This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126. llvm-svn: 167886
* X86: when constructing VZEXT_LOAD from other loads, make sure its output chain is correctly set up. | Manman Ren | 2012-11-13 | 1 | -0/+51
| | | | | | | | | | | As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://12684358 llvm-svn: 167859
* Use the 'count' attribute instead of the 'upper_bound' attribute. | Bill Wendling | 2012-11-13 | 1 | -2/+2
| | | | | | | | | If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the same for both of them because we use the 'upper_bound' attribute. Instead use the 'count' attribute, which gives the correct number of elements in the array. <rdar://problem/12566646> llvm-svn: 167806
* Fix test case added in patch fixing PR14314 | Michael Liao | 2012-11-12 | 1 | -4/+4
| | | | llvm-svn: 167769
* Fix PR14314 | Michael Liao | 2012-11-12 | 2 | -4/+17
| | | | | | | - Fix operand order for atomic sub, where the minuend is the value loaded from memory and the subtrahend is the parameter specified. llvm-svn: 167718
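The operand order described here can be illustrated with a tiny arithmetic model. It is a sketch only: `atomic_sub_result` is a hypothetical name, the 32-bit width is an assumption made for the example, and nothing here models atomicity or the actual X86 lowering.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit operand width for this example


def atomic_sub_result(loaded, operand):
    """Value written back by an atomic subtract, wrapped to 32 bits.

    loaded is the value read from memory (the minuend);
    operand is the parameter passed by the caller (the subtrahend).
    """
    return (loaded - operand) & MASK32
```

Swapping the two arguments computes `operand - loaded` instead, which is the kind of mistake the fix addresses.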
* Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support. | Craig Topper | 2012-11-10 | 1 | -5/+47
| | | | llvm-svn: 167652
* Add support of RTM from TSX extension | Michael Liao | 2012-11-08 | 1 | -0/+30
| | | | | | | | - Add RTM code generation support through 3 X86 intrinsics: xbegin()/xend() to start/end a transaction region, and xabort() to abort a transaction region llvm-svn: 167573
* misched: Heuristics based on the machine model. | Andrew Trick | 2012-11-07 | 1 | -0/+230
| | | | | | | | | | | | | | misched is disabled by default. With -enable-misched, these heuristics balance the schedule to simultaneously avoid saturating processor resources, expose ILP, and minimize register pressure. I've been analyzing the performance of these heuristics on everything in the llvm test suite in addition to a few other benchmarks. I would like each heuristic check to be verified by a unit test, but I'm still trying to figure out the best way to do that. The heuristics are still in considerable flux, but as they are refined we should be rigorous about unit testing the improvements. llvm-svn: 167527
* test/CodeGen/X86/fp-fast.ll: Add +avx. | NAKAMURA Takumi | 2012-11-01 | 1 | -1/+1
| | | | llvm-svn: 167207
* Add a few more simple fast-math constant propagations and cancellations. | Owen Anderson | 2012-11-01 | 1 | -0/+20
| | | | llvm-svn: 167200
* (For X86) Enhancement to add-carry/sub-borrow (adc/sbb) optimization. | Shuxin Yang | 2012-10-31 | 2 | -1/+13
| | | | | | | | | | The adc/sbb optimization is able to convert the following expressions into a single adc/sbb instruction:
    (ult) ... = x + 1 // where ult is the unsigned-less-than comparison
    (ult) ... = x - 1
| | | | | | | | | | This change flips the "x >u y" (i.e. ugt comparison) in order to expose the adc/sbb opportunity. llvm-svn: 167180
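The identity behind this change can be checked with a short model. This is a sketch under assumptions, not the X86 backend code: values are treated as 32-bit unsigned integers, and `ult`, `ugt`, `add_with_carry` and `sub_with_borrow` are illustrative names. It shows that "x >u y" is the same test as "y <u x", so a ugt comparison can feed the same carry-based pattern once its operands are swapped.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit width for this example


def ult(a, b):
    """Unsigned less-than on 32-bit values."""
    return (a & MASK32) < (b & MASK32)


def ugt(a, b):
    """Unsigned greater-than, expressed by swapping the operands of ult."""
    return ult(b, a)


def add_with_carry(x, a, b):
    """Model of 'cmp a, b; adc x, 0': adds 1 to x when a <u b."""
    carry = 1 if ult(a, b) else 0
    return (x + carry) & MASK32


def sub_with_borrow(x, a, b):
    """Model of 'cmp a, b; sbb x, 0': subtracts 1 from x when a <u b."""
    borrow = 1 if ult(a, b) else 0
    return (x - borrow) & MASK32
```

For example, a conditional increment guarded by `a >u b` can be computed as `add_with_carry(x, b, a)`.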
* X86 SSE: update rsqrtss and rcpss to use two source operands and the first source operand is tied to the destination operand. | Manman Ren | 2012-10-30 | 1 | -0/+36
| | | | | | | | | | | | This is to accurately model the corresponding instructions where the upper bits are unmodified. rdar://12558838 PR14221 llvm-svn: 167064
* X86 MMX: optimize transfer from mmx to i32 | Manman Ren | 2012-10-30 | 1 | -0/+14
| | | | | | | | | We used to generate a store (movq) + a load. Now we use movd. rdar://9946746 llvm-svn: 167056
* Re-commit r166971. | Jakub Staszak | 2012-10-30 | 1 | -5/+9
| | | | | | I reverted it too quickly, when buildbots didn't have a chance to test it with chapni's fix (-mattr=+avx). llvm-svn: 166985
* Revert r166971. It causes buildbot failure. To be investigated. | Jakub Staszak | 2012-10-29 | 1 | -9/+5
| | | | llvm-svn: 166979