Commit message | Author | Age | Files | Lines

to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
llvm-svn: 147342

on its own without disabling SSE4.2 or SSE4A.
llvm-svn: 147339

for v16i16 and v32i8.
llvm-svn: 147337

llvm-svn: 147336

Add the same assert on a similar code path.
llvm-svn: 147335

floating-point types. PR11674.
llvm-svn: 147323

Matching the MOVLP mask for AVX (256-bit vectors) was wrong.
The failure was detected by conformance tests.
llvm-svn: 147308

llvm-svn: 147289

consistency. Add comments and an assert for BMI instructions to PerformXorCombine, since the enabling of the combine is conditional on it but the function itself isn't.
llvm-svn: 147287

llvm-svn: 147269

x86-specific reloc_coff_secrel32 with a generic FK_SecRel_4.
llvm-svn: 147252

LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding, since the fix-ups necessary are just as complex for
either promoted type.
We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]
llvm-svn: 147251

inspection earlier.
llvm-svn: 147250

llvm-svn: 147247

'bsf' instructions here.
This one is actually debatable to my eyes. It's not clear that any chip
implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless
EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding.
Still, this restores the old behavior with 'tzcnt' enabled for now.
llvm-svn: 147246

X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:
(sizeof(x)*8 - 1) ^ __builtin_clz(x)
which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.
The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly, due to the dagcombine getting confused.
Also, while here, fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.
These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improved on in this
type of code.
llvm-svn: 147244

llvm-svn: 147238

loadRegFromStackSlot.
llvm-svn: 147235

llvm-svn: 147234

llvm-svn: 147232

llvm-svn: 147231

ARM targets with NEON units have access to aligned vector loads and
stores that are potentially faster than unaligned operations.
Add support for spilling the callee-saved NEON registers to an aligned
stack area using 16-byte-aligned NEON loads and stores.
This feature is off by default, controlled by an -align-neon-spills
command line option.
llvm-svn: 147211

My change r146949 added register clobbers to the eh_sjlj_dispatchsetup pseudo
instruction, but on Thumb1 some of those registers cannot be used. This
caused massive failures on the testsuite when compiling for Thumb1. While
fixing that, I noticed that the eh_sjlj_setjmp instruction has a "nofp"
variant, and I realized that dispatchsetup needs the same thing, so I have
added that as well.
llvm-svn: 147204

llvm-svn: 147192

rdar://10558523
llvm-svn: 147189

Noticed by inspection; I don't have a testcase for this.
llvm-svn: 147188

llvm-svn: 147184

Fixes PR11214.
llvm-svn: 147180

The value from the operands isn't right yet, but we weren't encoding it at
all previously. The parser needs to twiddle the values when building the
instruction.
Partial for: rdar://10558523
llvm-svn: 147170

llvm-svn: 147169

llvm-svn: 147158

reporting it. It does need some tests...
llvm-svn: 147154

Just treat it as if the shift wasn't there at all, for 'as' compatibility.
rdar://10604767
llvm-svn: 147153

llvm-svn: 147152

llvm-svn: 147151

llvm-svn: 147150

llvm-svn: 147133

llvm-svn: 147132

llvm-svn: 147129

llvm-svn: 147126

llvm-svn: 147124

llvm-svn: 147121

llvm-svn: 147119

ELF relocations.
Patch by Jack Carter.
llvm-svn: 147118

llvm-svn: 147117

llvm-svn: 147115

llvm-svn: 147109

llvm-svn: 147104

llvm-svn: 147103

llvm-svn: 147102