| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Defs are seen before uses, so a def without the kill flag doesn't necessarily
mean that the register is not killed on that instruction. It may be killed
in a later use operand.
llvm-svn: 217689
|
|
|
|
|
|
| |
Thanks to David Blakie for the in-depth review!
llvm-svn: 217682
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D5046
llvm-svn: 217681
|
|
|
|
| |
llvm-svn: 217680
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D5004
llvm-svn: 217678
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D5003
llvm-svn: 217676
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D5211
llvm-svn: 217675
|
|
|
|
|
|
| |
Cross-class copies being expensive is actually a trait of the microarchitecture, but as I haven't yet seen an example of a microarchitecture where they're cheap it seems best to just enable this by default, covering the non-mcpu build case.
llvm-svn: 217674
|
|
|
|
| |
llvm-svn: 217669
|
|
|
|
|
|
| |
make_unique.
llvm-svn: 217655
|
|
|
|
|
|
|
| |
The register numbers start at 0, so if only 1 register
was used, this was reported as 0.
llvm-svn: 217636
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inline asm may specify 'U' and 'X' constraints to print a 'u' for an
update-form memory reference, or an 'x' for an indexed-form memory
reference. However, these are really only useful in GCC internal code
generation. In inline asm the operand of the memory constraint is
typically just a register containing the address, so 'U' and 'X' make
no sense.
This patch quietly accepts 'U' and 'X' in inline asm patterns, but
otherwise does nothing. If we ever unexpectedly see a non-register,
we'll assert and sort it out afterwards.
I've added a new test for these constraints; the test case should be
used for other asm-constraints changes down the road.
llvm-svn: 217622
|
|
|
|
| |
llvm-svn: 217611
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r189189 implemented AVX512 unpack by essentially performing a 256-bit unpack
between the low and the high 256 bits of src1 into the low part of the
destination and another unpack of the low and high 256 bits of src2 into the
high part of the destination.
I don't think that's how unpack works. AVX512 unpack simply has more 128-bit
lanes but other than it works the same way as AVX. So in each 128-bit lane,
we're always interleaving certain parts of both operands rather different
parts of one of the operands.
E.g. for this:
__v16sf a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
__v16sf b = { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 };
__v16sf c = __builtin_shufflevector(a, b, 0, 8, 1, 9, 4, 12, 5, 13, 16,
24, 17, 25, 20, 28, 21, 29);
we generated punpcklps (notice how the elements of a and b are not interleaved
in the shuffle). In turn, c was set to this:
0 16 1 17 4 20 5 21 8 24 9 25 12 28 13 29
Obviously this should have just returned the mask vector of the shuffle
vector.
I mostly reverted this change and made sure the original AVX code worked
for 512-bit vectors as well.
Also updated the tests because they matched the logic from the code.
llvm-svn: 217602
|
|
|
|
| |
llvm-svn: 217600
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactored the R600_LDS_1A2D class a bit to get it to actually work.
It seemed to be previously unused and broken.
We also have to disable the conversion to the noret variant for now in
R600ISelLowering because the getLDSNoRetOp method only handles 1A1D LDS ops.
Someone can feel free to modify the AMDGPU::getLDSNoRetOp method to
work for more than 1A1D variants of LDS operations. It's being left as a
future TODO for now.
Signed-off-by: Aaron Watry <awatry at gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217596
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217594
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217593
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217592
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217591
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217590
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was only present for SI before.
Cayman may still be missing, but I am unable to test that currently.
v2: Don't create atomicrmw max tests in separate file
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217589
|
|
|
|
|
|
|
| |
The lost chain resulting in earlier side effecting nodes
being deleted.
llvm-svn: 217561
|
|
|
|
|
|
|
|
|
|
|
| |
Need to convert the 64 element offset into bytes, not just the element
size like the normal case instructions.
Noticed by inspection. This can't be hit now because
st64 instructions aren't emitted during instruction selection,
and the post-RA scheduler isn't enabled.
llvm-svn: 217560
|
|
|
|
| |
llvm-svn: 217553
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this a DataLayoutPass can be reused for multiple modules.
Once we have doInitialization/doFinalization, it doesn't seem necessary to pass
a Module to the constructor.
Overall this change seems in line with the idea of making DataLayout a required
part of Module. With it the only way of having a DataLayout used is to add it
to the Module.
llvm-svn: 217548
|
|
|
|
|
|
|
|
|
|
| |
The increase of the interleave factor to 4 has side-effects
like performance losses eg. due to reminder loops being executed
more frequently and may increase code size. It requires more
analysis and careful heuristic tuning. Expect double digit gains
in small benchmarks like lowercase.c and losses in puzzle.c.
llvm-svn: 217540
|
|
|
|
|
|
|
|
|
|
|
| |
names controlling this variable.
"Unroll" is not the appropriate name for this variable. Clang already uses
the term "interleave" in pragmas and metadata for this.
Differential Revision: http://reviews.llvm.org/D5066
llvm-svn: 217528
|
|
|
|
|
|
| |
experimental support)
llvm-svn: 217518
|
|
|
|
|
|
| |
release build.
llvm-svn: 217505
|
|
|
|
|
|
|
|
|
|
| |
This adds target specific support for using the PBQP register allocator on the
AArch64, for the A57 cpu.
By default, the PBQP allocator is not used, unless explicitely required
on the command line with "-aarch64-pbqp".
llvm-svn: 217504
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
using static relocation model and small code model.
Summary: currently we generate GOT based relocations for weak symbol
references regardless of the underlying relocation model. This should
be change so that in static relocation model we use a constant pool
load instead.
Patch from: Keith Walker
Reviewers: Renato Golin, Tim Northover
llvm-svn: 217503
|
|
|
|
|
|
|
| |
This change gives tblgen the information needed to fill in the
HexagonRegEncodingTable.
llvm-svn: 217500
|
|
|
|
|
|
|
|
|
| |
The only Thumb-1 multi-store capable of using LR is the PUSH instruction, which
translates to STMDB, so we shouldn't convert STMIAs.
Patch by Sergey Dmitrouk.
llvm-svn: 217498
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MipsCallingConv.td
Summary: No functional change
Reviewers: echristo, vmedic
Reviewed By: echristo, vmedic
Subscribers: echristo, llvm-commits
Differential Revision: http://reviews.llvm.org/D5266
llvm-svn: 217494
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MipsCC::numIntArgRegs()
Summary: No functional change.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5265
llvm-svn: 217485
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instrumentation code.
Summary: [asan-assembly-instrumentation] Added CFI directives to the generated instrumentation code.
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5189
llvm-svn: 217482
|
|
|
|
|
|
|
| |
This ensures the inline assembly register constraints are properly recognised in
TargetLowering::getRegForInlineAsmConstraint.
llvm-svn: 217479
|
|
|
|
|
|
|
|
|
| |
This commit adds aliases for the sync instruction (synciobdma,
syncs, syncw, syncws) which are used by the Octeon CPU.
Reviewed by D. Sanders
llvm-svn: 217477
|
|
|
|
| |
llvm-svn: 217473
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a first pass at a scheduling model for Jaguar.
It's structured largely on the existing SandyBridge and SLM sched models.
Using this model, in addition to turning on the PostRA scheduler, results in
some perf wins on internal and 3rd party benchmarks. There's not much difference
in LLVM's test-suite benchmarking subset of tests.
Differential Revision: http://reviews.llvm.org/D5229
llvm-svn: 217457
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This directive is used to reset the assembler options to their initial values.
Assembly programmers use it in conjunction with the ".set mipsX" directives.
This patch depends on the .set push/pop directive (http://reviews.llvm.org/D4821).
Contains work done by Matheus Almeida.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D4957
llvm-svn: 217438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MipsSubtarget::getGPRSizeInBytes()
Summary:
The GPR size is more a property of the subtarget than that of the ABI so move
this information to the MipsSubtarget.
No functional change.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5009
llvm-svn: 217436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In AT&T annotation for both x86_64 and x32 calls should be printed as
callq in assembly. It's only a matter of correct mnemonic, object output
is ok.
Test Plan: trivial test added
Reviewers: nadav, dschuff, craig.topper
Subscribers: llvm-commits, zinovy.nis
Differential Revision: http://reviews.llvm.org/D5213
llvm-svn: 217435
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use a MipsSubtarget reference instead.
No functional change.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5008
llvm-svn: 217434
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These directives are used to save the current assembler options (in the case of ".set push") and restore the previously saved options (in the case of ".set pop").
Contains work done by Matheus Almeida.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D4821
llvm-svn: 217432
|
|
|
|
|
|
|
|
| |
This patch is to permit a negative offset usage for a non frame access.
Patch by Igor Oblakov.
llvm-svn: 217431
|
|
|
|
|
|
|
|
| |
When compiling without SSE2, isTruncStoreLegal(F64, F32) would return Legal, whereas with SSE2 it would return Expand. And since the Target doesn't seem to actually handle a truncstore for double -> float, it would just output a store of a full double in the space for a float hence overwriting other bits on the stack.
Patch by Luqman Aden!
llvm-svn: 217410
|
|
|
|
| |
llvm-svn: 217381
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assert in scheduler from an inserted copy_to_regclass from
a constant.
This only seems to break sometimes when a constant initializer
address is forced into VGPRs in a non-entry block. No test
since the only case I've managed to hit only happens with a future
patch, and that case will also not be a problem once scalar instructions
are used in non-entry blocks.
llvm-svn: 217380
|