llvm-svn: 337200
This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325
We're trading branches and compares for loads and logic ops.
This makes the code smaller and hopefully faster in most cases.
The 24-byte test shows an interesting construct: we load the trailing scalar
elements into vector registers and generate the same pcmpeq+movmsk code that
we expected for a pair of full vector elements (see the 32- and 64-byte tests).
Differential Revision: https://reviews.llvm.org/D41714
llvm-svn: 321934
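The trade described above can be illustrated with a hand-written C++ analogy. This is only a sketch of the idea, not the pass's actual output: the names `load64`, `equal16_branchy`, and `equal16_merged` are hypothetical, and a 16-byte compare is assumed for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read 8 bytes with no alignment assumption.
static std::uint64_t load64(const unsigned char *p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// One load pair per block: compare, branch, then compare again.
bool equal16_branchy(const unsigned char *a, const unsigned char *b) {
  if (load64(a) != load64(b))
    return false;
  return load64(a + 8) == load64(b + 8);
}

// Two load pairs in one block: XOR each pair, OR the differences,
// and test the combined result once. No early-exit branch is needed.
bool equal16_merged(const unsigned char *a, const unsigned char *b) {
  std::uint64_t d0 = load64(a) ^ load64(b);
  std::uint64_t d1 = load64(a + 8) ^ load64(b + 8);
  return (d0 | d1) == 0;
}
```

Both functions answer the same question as `memcmp(a, b, 16) == 0`; the second form always performs both load pairs but replaces the intermediate compare and branch with logic ops.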
loads per block; NFC
The preference only applies to 'memcmp() == 0' expansion, so try to make that clearer.
x86 will likely benefit by increasing the default value from '1' to '2' as seen in PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325
...so that is the planned follow-up to this clean-up step.
llvm-svn: 321756
llvm-svn: 320960
llvm-svn: 320619
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490
Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.
llvm-svn: 317318