| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 224730
|
| |
| |
Currently, when ctpop is supported for scalar types, the expansion of
@llvm.ctpop.vXiY uses vector element extractions, insertions and individual
calls to @llvm.ctpop.iY. When it is not, the scalar calls are expanded with
bit-math operations.
Local Haswell measurements show that we can improve vector @llvm.ctpop.vXiY
expansion in some cases by using a vector parallel bit twiddling
approach, based on:
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v >> 8);
v = v + (v >> 16);
v = v & 0x0000003F;
(from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel)
When scalar ctpop isn't supported, the approach above performs better for
v2i64, v4i32, v4i64 and v8i32 (see numbers below). And even when scalar ctpop
is supported, this approach performs ~2x better for v8i32.
Here, x86_64 implies -march=corei7-avx without ctpop and x86_64h includes ctpop
support with -march=core-avx2.
== [x86_64h - new]
v8i32: 0.661685
v4i32: 0.514678
v4i64: 0.652009
v2i64: 0.324289
== [x86_64h - old]
v8i32: 1.29578
v4i32: 0.528807
v4i64: 0.65981
v2i64: 0.330707
== [x86_64 - new]
v8i32: 1.003
v4i32: 0.656273
v4i64: 1.11711
v2i64: 0.754064
== [x86_64 - old]
v8i32: 2.34886
v4i32: 1.72053
v4i64: 1.41086
v2i64: 1.0244
More work for other vector types will come next.
llvm-svn: 224725
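The bit-twiddling sequence quoted in r224725 can be tried on a single 32-bit lane. The following is a minimal C sketch of that sequence, not the code the patch adds to the legalizer; the function names `popcount32_parallel` and `popcount32_naive` are made up for illustration, and the naive loop exists only as a reference to compare against.

```c
#include <stdint.h>

/* Parallel population count of one 32-bit value, following the
 * sequence quoted in the commit message. */
uint32_t popcount32_parallel(uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);                 /* sums of 2-bit groups */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* sums of 4-bit groups */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* one count per byte */
    v = v + (v >> 8);                                 /* fold bytes together */
    v = v + (v >> 16);
    return v & 0x0000003Fu;                           /* result fits in 6 bits */
}

/* Reference implementation: test one bit at a time. */
uint32_t popcount32_naive(uint32_t v) {
    uint32_t n = 0;
    while (v != 0) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}
```

A vector expansion would apply the same shifts, masks and adds to every lane at once, which is presumably why it can beat per-element extraction on the types measured above.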
|
| |
|
|
| |
llvm-svn: 224722
|
| |
|
|
|
|
| |
Patch by Ramkumar Ramachandra <artagnon@gmail.com>
llvm-svn: 224720
|
| |
|
|
|
|
|
|
|
| |
generate instructions.
Fixes PR21978.
Related to <rdar://problem/18310086>
llvm-svn: 224717
|
| |
|
|
|
|
| |
intrinsics, encoding tests for AVX-512F and skx instructions.
llvm-svn: 224707
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch pattern matches code such as:
neg w8, w8
mul w8, w9, w8
to
mneg w8, w8, w9
Review: http://reviews.llvm.org/D6754
llvm-svn: 224706
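The arithmetic behind the neg + mul to mneg rewrite in r224706 is that multiplying by a negated operand equals negating the product. A small C sketch of the two shapes follows; the function names are hypothetical, unsigned arithmetic is used so that wraparound is well defined, and whether a given compiler actually selects mneg for either function is not something this sketch asserts.

```c
#include <stdint.h>

/* Negate one operand, then multiply: the neg + mul shape. */
uint32_t mul_of_neg(uint32_t a, uint32_t b) {
    uint32_t negated = 0u - a; /* corresponds to: neg */
    return b * negated;        /* corresponds to: mul */
}

/* Multiply, then negate the product: what a single
 * multiply-negate computes, -(n * m). */
uint32_t neg_of_mul(uint32_t a, uint32_t b) {
    return 0u - (a * b);
}
```

Both functions return the same value for every input modulo 2^32, which is what makes the single-instruction form a valid replacement.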
|
| |
|
|
|
|
|
| |
In recent times asan and valgrind have found way more memory management bugs
in llvm than the special purpose leak detector.
llvm-svn: 224703
|
| |
|
|
|
|
| |
Clean up some style related things in the StackProtector CodeGen. NFC.
llvm-svn: 224693
|
| |
|
|
|
|
| |
from patterns for the 32-bit version.
llvm-svn: 224692
|
| |
|
|
|
|
|
|
|
| |
Extend the existing code which handles this for zext. This makes this
more useful for targets with ZeroOrNegativeOne BooleanContent and
obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne)
since the constant will now be shrunk to i1.
llvm-svn: 224691
|
| |
|
|
| |
llvm-svn: 224687
|
| |
|
|
|
|
| |
printing of the alias instead of the real instruction.
llvm-svn: 224686
|
| |
|
|
| |
llvm-svn: 224685
|
| |
|
|
|
|
| |
call/jump in Intel syntax.
llvm-svn: 224684
|
| |
|
|
|
|
| |
Use range-based for loop and constify the iterators. NFC.
llvm-svn: 224683
|
| |
| |
The ARM ARM states:
LDM/LDMIA/LDMFD:
The SP can be in the list. However, ARM deprecates using these instructions
with SP in the list.
ARM deprecates using these instructions with both the LR and the PC in the
list.
LDMDA/LDMFA/LDMDB/LDMEA/LDMIB/LDMED:
The SP can be in the list. However, instructions that include the SP in the
list are deprecated.
Instructions that include both the LR and the PC in the list are deprecated.
POP:
The SP can only be in the list before ARMv7. ARM deprecates any use of ARM
instructions that include the SP, and the value of the SP after such an
instruction is UNKNOWN.
ARM deprecates the use of this instruction with both the LR and the PC in
the list.
Attempt to diagnose use of deprecated forms of these instructions. This mirrors
the previous changes to diagnose use of the deprecated forms of STM in ARM mode.
llvm-svn: 224682
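The two register-list conditions that recur in the ARM ARM text quoted in r224682 (SP in the list, or both LR and PC in the list) can be sketched as a check on a 16-bit register mask. This is an illustrative assumption about how such a check might look, not the assembler's actual diagnostic code; `reglist_has` and `ldm_list_is_deprecated` are hypothetical names, and bit N of the mask is assumed to stand for register rN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register numbers for the ARM core registers of interest. */
enum { REG_SP = 13, REG_LR = 14, REG_PC = 15 };

/* True if register `reg` is present in the 16-bit register list. */
bool reglist_has(uint16_t list, int reg) {
    return ((list >> reg) & 1u) != 0;
}

/* True if the list matches either deprecated form quoted above:
 * SP in the list, or both LR and PC in the list. */
bool ldm_list_is_deprecated(uint16_t list) {
    bool has_sp = reglist_has(list, REG_SP);
    bool has_lr_and_pc = reglist_has(list, REG_LR) && reglist_has(list, REG_PC);
    return has_sp || has_lr_and_pc;
}
```

A real diagnostic would also need to distinguish the individual instructions and architecture versions, since the quoted text treats POP before ARMv7 differently.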
|
| |
|
|
| |
llvm-svn: 224678
|
| |
|
|
|
|
|
|
|
| |
(X & INT_MIN) == 0 ? X ^ INT_MIN : X into X | INT_MIN
(X & INT_MIN) != 0 ? X ^ INT_MIN : X into X & INT_MAX
This fixes PR21993.
llvm-svn: 224676
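The two folds listed in r224676 can be checked by hand on 32-bit values. The sketch below writes each select form next to its folded form; the macro and function names are invented for illustration, and `uint32_t` bit patterns stand in for INT_MIN and INT_MAX so that no signed overflow is involved.

```c
#include <stdint.h>

#define SIGN_BIT 0x80000000u /* bit pattern of INT_MIN for 32 bits */
#define MAX_BITS 0x7FFFFFFFu /* bit pattern of INT_MAX for 32 bits */

/* (X & INT_MIN) == 0 ? X ^ INT_MIN : X */
uint32_t set_sign_select(uint32_t x) {
    return (x & SIGN_BIT) == 0 ? (x ^ SIGN_BIT) : x;
}

/* Folded form: X | INT_MIN */
uint32_t set_sign_folded(uint32_t x) {
    return x | SIGN_BIT;
}

/* (X & INT_MIN) != 0 ? X ^ INT_MIN : X */
uint32_t clear_sign_select(uint32_t x) {
    return (x & SIGN_BIT) != 0 ? (x ^ SIGN_BIT) : x;
}

/* Folded form: X & INT_MAX */
uint32_t clear_sign_folded(uint32_t x) {
    return x & MAX_BITS;
}
```

In the first pair the XOR only runs when the sign bit is clear, so it always sets the bit; in the second it only runs when the bit is set, so it always clears it.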
|
| |
|
|
|
|
|
| |
getScalarSizeInBits returns zero when the comparison operands are not
integral. No functionality change intended.
llvm-svn: 224675
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 224673
|
| |
|
|
|
|
|
|
|
| |
(X & INT_MIN) ? X & INT_MAX : X into X & INT_MAX
(X & INT_MIN) ? X : X & INT_MAX into X
(X & INT_MIN) ? X | INT_MIN : X into X
(X & INT_MIN) ? X : X | INT_MIN into X | INT_MIN
llvm-svn: 224669
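The four folds in r224669 all rely on the condition already telling us the state of the sign bit. As a rough check, the C sketch below evaluates each select and compares it with the folded result for one input; `sign_test_folds_hold` and the two macros are hypothetical names, and 32-bit unsigned patterns are assumed in place of INT_MIN and INT_MAX.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIGN_MASK 0x80000000u /* bit pattern of INT_MIN for 32 bits */
#define MAX_MASK  0x7FFFFFFFu /* bit pattern of INT_MAX for 32 bits */

/* Returns true if all four select forms agree with their folded forms for x. */
bool sign_test_folds_hold(uint32_t x) {
    bool sign_set = (x & SIGN_MASK) != 0;

    uint32_t s1 = sign_set ? (x & MAX_MASK) : x;  /* folds to X & INT_MAX */
    uint32_t s2 = sign_set ? x : (x & MAX_MASK);  /* folds to X */
    uint32_t s3 = sign_set ? (x | SIGN_MASK) : x; /* folds to X */
    uint32_t s4 = sign_set ? x : (x | SIGN_MASK); /* folds to X | INT_MIN */

    return s1 == (x & MAX_MASK)
        && s2 == x
        && s3 == x
        && s4 == (x | SIGN_MASK);
}
```

In each case one arm of the select is a no-op given the condition, so the select collapses to either X itself or the single masking operation.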
|
| |
|
|
|
|
|
|
|
|
|
|
| |
much of the glory of clang-format, and now any time I touch it I risk
introducing formatting changes as part of a functional commit.
Also, clang-format is *way* better at formatting my code than I am.
Most of this is a huge improvement although I reverted a couple of
places where I hit a clang-format bug with lambdas that has been filed
but not (fully) fixed.
llvm-svn: 224666
|
| |
|
|
|
|
|
|
|
| |
We must not add kill flags when reading a vreg with some undefined
subregisters, if subreg liveness tracking is enabled. This is because
the register allocator may reuse these undefined subregisters for other
values which are not killed.
llvm-svn: 224664
|
| |
|
|
|
|
|
|
| |
- Use more const modifiers
- Use references for things that can't be nullptr
- Improve some variable names
llvm-svn: 224663
|
| |
|
|
| |
llvm-svn: 224655
|
| |
|
|
| |
llvm-svn: 224650
|
| |
|
|
| |
llvm-svn: 224648
|
| |
|
|
|
|
|
| |
The codegen failed on 128-bit types on AVX2.
I added patterns in the td files and tests.
llvm-svn: 224647
|
| |
|
|
|
|
|
| |
If the condition is used for something else, this increases
the number of instructions.
llvm-svn: 224646
|
| |
|
|
|
|
| |
No functionality change.
llvm-svn: 224635
|
| |
|
|
|
|
| |
-private-headers.
llvm-svn: 224627
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is intended to be used for a family of personality functions that
have similar IR preparation requirements. Typically when interoperating
with MSVC personality functions, bits of functionality need to be
outlined from the main function into helper functions. There is also
usually more than one landing pad per invoke, which does not match the
LLVM IR landingpad representation.
None of this is implemented yet. This change just adds a new enum that
is active for *-windows-msvc and delegates to the EH removal preparation
pass. No functionality change for other targets.
llvm-svn: 224625
|
| |
|
|
|
|
|
|
|
|
|
| |
destination (PR14221)
This is a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).
That patch started to fix PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ), but it was not completed.
Differential Revision: http://reviews.llvm.org/D6330
llvm-svn: 224624
|
| |
|
|
|
|
|
| |
The constant bus restrictions only apply to VALU instructions. This
enables SIFoldOperands to fold immediates into SALU instructions.
llvm-svn: 224623
|
| |
|
|
|
|
|
|
| |
mubuf instructions now define the soffset field using the SCSrc_32
register class which indicates that only SGPRs and inline constants
are allowed.
llvm-svn: 224622
|
| |
|
|
| |
llvm-svn: 224621
|
| |
|
|
|
|
| |
-private-headers.
llvm-svn: 224616
|
| |
|
|
| |
llvm-svn: 224612
|
| |
| |
Add a path to DAGCombiner::MergeConsecutiveStores()
to combine multiple scalar stores when the store operands
are extracted vector elements. This is a partial fix for
PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ).
For the new test case, codegen improves from:
vmovss %xmm0, (%rdi)
vextractps $1, %xmm0, 4(%rdi)
vextractps $2, %xmm0, 8(%rdi)
vextractps $3, %xmm0, 12(%rdi)
vextractf128 $1, %ymm0, %xmm0
vmovss %xmm0, 16(%rdi)
vextractps $1, %xmm0, 20(%rdi)
vextractps $2, %xmm0, 24(%rdi)
vextractps $3, %xmm0, 28(%rdi)
vzeroupper
retq
To:
vmovups %ymm0, (%rdi)
vzeroupper
retq
Patch reviewed by Nadav Rotem.
Differential Revision: http://reviews.llvm.org/D6698
llvm-svn: 224611
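The input shape that r224611 targets, scalar stores of extracted vector elements to consecutive addresses, can be written in C with the GCC/Clang `vector_size` extension. This is only a sketch of source that might produce such a store sequence; the type name `v8f` and the function `store_elements` are made up for illustration, the extension is compiler-specific, and the instructions a given compiler emits depend on target flags and version.

```c
/* Eight packed floats, using the GCC/Clang vector extension. */
typedef float v8f __attribute__((vector_size(32)));

/* Extract each element and store it to consecutive slots of dst.
 * The vector is passed by pointer to avoid by-value ABI differences. */
void store_elements(float *dst, const v8f *src) {
    v8f v = *src;
    dst[0] = v[0];
    dst[1] = v[1];
    dst[2] = v[2];
    dst[3] = v[3];
    dst[4] = v[4];
    dst[5] = v[5];
    dst[6] = v[6];
    dst[7] = v[7];
}
```

Because the eight stores cover one contiguous 32-byte region in element order, they are equivalent to a single unaligned vector store, which is the merged form shown in the commit message.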
|
| |
|
|
| |
llvm-svn: 224610
|
| |
|
|
| |
llvm-svn: 224609
|
| |
|
|
| |
llvm-svn: 224608
|
| |
|
|
|
|
| |
-private-headers.
llvm-svn: 224607
|
| |
|
|
| |
llvm-svn: 224604
|
| |
|
|
| |
llvm-svn: 224599
|
| |
|
|
|
|
| |
register class.
llvm-svn: 224598
|
| |
|
|
|
|
|
|
|
|
| |
dsymutil needs access to DWARF-specific information; the small DIContext
wrapper isn't sufficient. Other DWARF consumers might want to use it too
(I'm looking at you, lldb).
Differential Revision: http://reviews.llvm.org/D6694
llvm-svn: 224594
|
| |
|
|
|
|
| |
Found by the Clang static analyzer.
llvm-svn: 224590
|
| |
|
|
|
|
| |
Found by the Clang static analyzer.
llvm-svn: 224589
|