summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Optimization for replacing LEA with MOV at frame index elimination timeZvi Rackover2016-09-261-10/+10
| | | | | | | | | | | | | | | Summary: Replace a LEA instruction of the form 'lea (%esp), %ebx' --> 'mov %esp, %ebx' MOV is preferable over LEA because usually there are more issue-slots available to execute MOVs than LEAs. Latest processors also support zero-latency MOVs. Fixes pr29022. Reviewers: hfinkel, delena, igorb, myatsina, mkuper Differential Revision: https://reviews.llvm.org/D24705 llvm-svn: 282385
* [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using ↵Simon Pilgrim2016-07-191-11/+7
| | | | | | | | | | | | | | | | | | generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. A companion clang patch is at D22105 Differential Revision: https://reviews.llvm.org/D22106 llvm-svn: 275981
* Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setccMichael Kuperstein2016-07-071-16/+16
| | | | | | | | | | | xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD) which was not appreciated by fast regalloc on 32-bit. llvm-svn: 274802
* Revert r274692 to check whether this is what breaks windows selfhost.Michael Kuperstein2016-07-071-16/+16
| | | | llvm-svn: 274771
* [X86] Transform setcc + movzbl into xorl + setccMichael Kuperstein2016-07-061-16/+16
| | | | | | | | | | | xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. Differential Revision: http://reviews.llvm.org/D21774 llvm-svn: 274692
* [x86, SSE] update packed FP compare tests for direct translation from ↵Sanjay Patel2016-06-151-13/+36
| | | | | | | | | | | | | builtin to IR The clang side of this was r272840: http://reviews.llvm.org/rL272840 A follow-up step would be to auto-upgrade and remove these LLVM intrinsics completely. Differential Revision: http://reviews.llvm.org/D21269 llvm-svn: 272841
* [X86][SSE] Updated storeu fast-isel tests to match clang builtin testsSimon Pilgrim2016-05-301-3/+2
| | | | | | Since rL271214 the headers have no longer used the storeu intrinsic llvm-svn: 271222
* [x86] avoid printing unnecessary sign bits of hex immediates in asm comments ↵Sanjay Patel2016-05-281-6/+6
| | | | | | | | | | | (PR20347) It would be better to check the valid/expected size of the immediate operand, but this is generally better than what we print right now. Differential Revision: http://reviews.llvm.org/D20385 llvm-svn: 271114
* [X86][SSE] Use storeu intrinsics for _mm_storeu_ps testSimon Pilgrim2016-05-251-2/+3
| | | | llvm-svn: 270680
* [X86][SSE] Added fast-isel tests to sync with clang/test/CodeGen/sse-builtins.cSimon Pilgrim2016-05-191-0/+2280
llvm-svn: 270081
OpenPOWER on IntegriCloud