| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
Enabling HasAVX512{DQ,BW,VL} predicates.
Adding VK2, VK4, VK32, VK64 masked register classes.
Adding new types (v64i8, v32i16) to VR512.
Extending calling conventions for new types (v64i8, v32i16)
Patch by Zinovy Nis <zinovy.y.nis@intel.com>
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>
llvm-svn: 213545
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic for expanding atomics that aren't natively supported in
terms of cmpxchg loops is much simpler to express at the IR level. It
also allows the normal optimisations and CodeGen improvements to help
out with atomics, instead of using a limited set of possible
instructions..
rdar://problem/13496295
llvm-svn: 212119
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for a new builtin instruction called
__builtin_ia32_rdpmc.
Builtin '__builtin_ia32_rdpmc' is defined as a 'GCC builtin'; on X86, it can
be used to read performance monitoring counters. It takes as input the index
of the performance counter to read, and returns the value of the specified
performance counter as a 64-bit number.
Calls to this new builtin will map to instruction RDPMC.
The index in input to the builtin call is moved to register %ECX. The result
of the builtin call is the value of the specified performance counter (RDPMC
would return that quantity in registers RDX:RAX).
This patch:
- Adds builtin int_x86_rdpmc as a GCCBuiltin;
- Adds a new x86 DAG node called 'RDPMC_DAG';
- Teaches how to lower this new builtin;
- Adds an ISel pattern to select instruction RDPMC;
- Fixes the definition of instruction RDPMC adding %RAX and %RDX as
implicit definitions, and adding %ECX as implicit use;
- Adds a LLVM test to verify that the new builtin is correctly selected.
llvm-svn: 212049
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to Intel Software Optimization Manual
on Silvermont INC or DEC instructions require
an additional uop to merge the flags.
As a result, a branch instruction depending
on an INC or a DEC instruction incurs a 1 cycle penalty.
Differential Revision: http://reviews.llvm.org/D3990
llvm-svn: 210466
|
|
|
|
|
|
|
|
|
|
| |
Instructions TZCNT (requires BMI1) and LZCNT (requires LZCNT), always
provide the operand size as output if the input operand is zero.
We can take advantage of this knowledge during instruction selection
stage in order to simplify a few corner case.
llvm-svn: 209159
|
|
|
|
|
|
|
|
|
|
|
| |
In AT&T syntax, we should probably print the full "movl" or "movw". TableGen
used to ignore these aliases because it was miscounting the number of operands.
This fixes the issue.
This will be tested when the TableGen "should I print this Alias"
heuristic is fixed (very soon).
llvm-svn: 208963
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, TableGen assumed that every aliased operand consumed precisely 1
MachineInstr slot (this was reasonable because until a couple of days ago,
nothing more complicated was eligible for printing).
This allows a couple more ARM64 aliases to print so we can remove the special
code.
On the X86 side, I've gone for explicit AT&T size specifiers as the default, so
turned off a few of the aliases that would have just started printing.
llvm-svn: 208880
|
|
|
|
|
|
|
| |
To get at least one use of the change (and some actual tests) in with its
commit, I've enabled the AArch64 & ARM64 NEON mov aliases.
llvm-svn: 208867
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
correctly verifies that both READCYCLECOUNTER and the two new intrinsics
work fine for both 64bit and 32bit Subtargets.
llvm-svn: 207127
|
|
|
|
|
|
|
| |
Evidently tablegen doesn't infer this from the HasBMI2 predicate on the BZHI
instructions. This should fix the recent bot failures.
llvm-svn: 206885
|
|
|
|
| |
llvm-svn: 206880
|
|
|
|
|
|
|
|
| |
suggested by Ben Kramer in review of r206738.
Thanks again Ben!
llvm-svn: 206879
|
|
|
|
| |
llvm-svn: 202348
|
|
|
|
| |
llvm-svn: 202347
|
|
|
|
|
|
| |
instructions. Patch by Florian Lukas with some additional instructions fixed by me. Fixes PR18975.
llvm-svn: 202345
|
|
|
|
|
|
| |
32-bit mode counterparts for cases where there is also a OpSize16 instruction.
llvm-svn: 201550
|
|
|
|
|
|
| |
instruction with 0xf2/f3/66 were in front of them, but don't themselves have a prefix. For now this doesn't change any bbehavior, but plan to use it to fix some bugs in the disassembler.
llvm-svn: 201538
|
|
|
|
| |
llvm-svn: 201463
|
|
|
|
|
|
|
|
|
| |
A simple register copy on X86 is just 3 bytes, whereas movabsq is a 10 byte
instruction. Marking movabsq as not beeing cheap will allow LICM to move it
out of the loop and it also prevents unnecessary rematerializations if the
value is needed in more than one register.
llvm-svn: 201377
|
|
|
|
|
|
|
|
|
|
| |
Original commits messages:
Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code.
Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information.
llvm-svn: 201065
|
|
|
|
|
|
|
|
| |
r201059 appears to cause a crash in a bootstrapped build of clang. Craig
isn't available to look at it right now, so I'm reverting it while he
investigates.
llvm-svn: 201064
|
|
|
|
|
|
| |
field of modrm byte as a don't care value. Will allow for simplification of disassembler code.
llvm-svn: 201059
|
|
|
|
|
|
| |
instead of DAG combine. This weakens the ability to fold loads with them because we aren't able to match patterns that load the same thing twice. But maybe we should fix that if we care. The peephole optimizer will be able to fold some loads in its absense.
llvm-svn: 200824
|
|
|
|
|
|
| |
meaning no 0x66 prefix in any mode. Rename Opsize16->OpSize32 and OpSize->OpSize16. The classes now refer to their operand size rather than the mode in which they need a 0x66 prefix. Hopefully can merge REX_W into this as OpSize64.
llvm-svn: 200626
|
|
|
|
|
|
| |
in TSFlags.
llvm-svn: 200624
|
|
|
|
|
|
|
|
|
| |
These should end up (in ELF) as R_X86_64_32S relocs, not R_X86_64_32.
Kill the horrid and incomplete special case and FIXME in
EncodeInstruction() and set things up so it can infer the signedness
from the ImmType just like it can the size and whether it's PC-relative.
llvm-svn: 200495
|
|
|
|
| |
llvm-svn: 199808
|
|
|
|
| |
llvm-svn: 199807
|
|
|
|
| |
llvm-svn: 199806
|
|
|
|
| |
llvm-svn: 199805
|
|
|
|
| |
llvm-svn: 199804
|
|
|
|
| |
llvm-svn: 199803
|
|
|
|
|
|
|
|
| |
The disassembler has a special case for 'L' vs. 'W' in its heuristic for
checking for 32-bit and 16-bit equivalents. We could expand the heuristic,
but better just to be consistent in using the 'L' suffix.
llvm-svn: 199652
|
|
|
|
|
|
| |
encoded and disassembled with a segment override prefix. Fixes PR16962.
llvm-svn: 199364
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This finishes the job started in r198756, and creates separate opcodes for
64-bit vs. 32-bit versions of the rest of the RET instructions too.
LRETL/LRETQ are interesting... I can't see any justification for their
existence in the SDM. There should be no 'LRETL' in 64-bit mode, and no
need for a REX.W prefix for LRETQ. But this is what GAS does, and my
Sandybridge CPU and an Opteron 6376 concur when tested as follows:
asm __volatile__("pushq $0x1234\nmovq $0x33,%rax\nsalq $32,%rax\norq $1f,%rax\npushq %rax\nlretl $8\n1:");
asm __volatile__("pushq $1234\npushq $0x33\npushq $1f\nlretq $8\n1:");
asm __volatile__("pushq $0x33\npushq $1f\nlretq\n1:");
asm __volatile__("pushq $0x1234\npushq $0x33\npushq $1f\nlretq $8\n1:");
cf. PR8592 and commit r118903, which added LRETQ. I only added LRETIQ to
match it.
I don't quite understand how the Intel syntax parsing for ret
instructions is working, despite r154468 allegedly fixing it. Aren't the
explicitly sized 'retw', 'retd' and 'retq' supposed to work? I have at
least made the 'lretq' work with (and indeed *require*) the 'q'.
llvm-svn: 199106
|
|
|
|
|
|
|
|
| |
They do *different* things to %esp, so they are not equivalent.
Rename PUSHi8 to PUSH32i8 and add the missing PUSH16i8.
llvm-svn: 198761
|
|
|
|
|
|
|
|
|
|
| |
It seems there is no separate instruction class for having AdSize *and*
OpSize bits set, which is required in order to disambiguate between all
these instructions. So add that to the disassembler.
Hm, perhaps we do need an AdSize16 bit after all?
llvm-svn: 198759
|
|
|
|
|
|
|
|
|
|
| |
I couldn't see how to do this sanely without splitting RETQ from RETL.
Eric says: "sad about the inability to roundtrip them now, but...".
I have no idea what that means, but perhaps it wants preserving in the
commit comment.
llvm-svn: 198756
|
|
|
|
| |
llvm-svn: 198755
|
|
|
|
| |
llvm-svn: 198754
|
|
|
|
| |
llvm-svn: 198753
|
|
|
|
|
|
|
|
|
| |
This fixes the bulk of 16-bit output, and the corresponding test case
x86-16.s now looks mostly like the x86-32.s test case that it was
originally based on. A few irrelevant instructions have been dropped,
and there are still some corner cases to be fixed in subsequent patches.
llvm-svn: 198752
|
|
|
|
|
|
|
|
|
|
|
| |
This is not really expected to work right yet. Mostly because we will
still emit the OpSize (0x66) prefix in all the wrong places, along with
a number of other corner cases. Those will all be fixed in the subsequent
commits.
Patch from David Woodhouse.
llvm-svn: 198584
|
|
|
|
| |
llvm-svn: 198547
|
|
|
|
|
|
| |
instructions to go through to the disassembler tables without resorting to string matches. Apply flag to all _REV instructions.
llvm-svn: 198543
|
|
|
|
|
|
| |
corresponding 32-bit versions with the same encodings Not64BitMode. Remove hack from tablegen disassembler table emitter. Fix bad test.
llvm-svn: 198530
|
|
|
|
| |
llvm-svn: 198336
|
|
|
|
|
|
|
| |
Printing rounding control.
Enncoding for EVEX_RC (rounding control).
llvm-svn: 198277
|
|
|
|
|
|
|
|
|
|
|
| |
That's what it actually means, and with 16-bit support it's going to be
a little more relevant since in a few corner cases we may actually want
to distinguish between 16-bit and 32-bit mode (for example the bare 'push'
aliases to pushw/pushl etc.)
Patch by David Woodhouse
llvm-svn: 197768
|
|
|
|
|
|
|
|
|
| |
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
I replaced FSETCCss, FSETCCsd with one node type FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.
llvm-svn: 197384
|