| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 272416
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Adds a MachineFunctionPass that scans the body to find calls, and
update the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21180
llvm-svn: 272414
|
| |
|
|
| |
llvm-svn: 272411
|
| |
|
|
|
|
|
| |
Split up the test cases into two inputs as per post-commit review comments from
Renato. NFC.
llvm-svn: 272408
|
| |
|
|
|
|
|
|
|
| |
The costs are somewhat hand-wavy, but should be much closer to the truth
than what we get from BasicTTI.
Differential Revision: http://reviews.llvm.org/D21156
llvm-svn: 272406
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.
When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.
The analysis is split in two parts, RegUsageInfoCollector is the
MachineFunction Pass that runs post-RA and collect the list of
clobbered registers to produce a register mask.
An immutable pass, RegisterUsageInfo, stores the RegMask produced by
RegUsageInfoCollector, and keep them available. A future tranformation
pass will use this information to update every call-sites after
instruction selection.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D20769
llvm-svn: 272403
|
| |
|
|
| |
llvm-svn: 272398
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Somehow, the codegen logic for these sequences has gone completely untested
until now (note the 2 compare instructions generated per test).
There's also an *Intel* AVX optimization opportunity exposed in these cases
and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP
predicates were added, so a single comparison should always be sufficient,
and operand commutation should never be necessary.
llvm-svn: 272397
|
| |
|
|
| |
llvm-svn: 272396
|
| |
|
|
|
|
|
|
| |
This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.
llvm-svn: 272395
|
| |
|
|
|
|
| |
(PSLLDQ/PSRLDQ/PALIGNR)
llvm-svn: 272392
|
| |
|
|
|
|
|
| |
This reverts commit r272385. This commit broke the build. I'm temporarily
reverting to investigate.
llvm-svn: 272391
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.
Differential Revision: http://reviews.llvm.org/D20873
llvm-svn: 272385
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in AMDGPUOperand.
Summary:
sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...).
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20968
llvm-svn: 272384
|
| |
|
|
|
|
|
|
|
|
| |
Memory operand is new for AVX512 (SSE/AVX2 didn't support it).
Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations).
Regenerated VPALIGNR test now that the shuffle comments work
llvm-svn: 272383
|
| |
|
|
| |
llvm-svn: 272371
|
| |
|
|
|
|
| |
shuffles. Previously we were printing the mask operands as the register names.
llvm-svn: 272367
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds the struct field offset array in struct StructInfo.
Updates test struct_field_count_basic.ll.
Reviewers: aizatsky
Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka
Differential Revision: http://reviews.llvm.org/D21192
llvm-svn: 272362
|
| |
|
|
|
|
|
|
|
|
| |
The test case is not great espicially because it is still cumbersome to
run the regalloc pass with run-pass. (We miss a bunch of initiliazier to
be properly implemented.)
Related to llvm.org/PR27983
llvm-svn: 272360
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we could run only one machine pass with the run-pass option.
With that patch, we can now specify several passes with several run-pass
options (or just one option with a list of comma separated passes) and
llc will build the related pipeline.
This is great to test the interaction of two passes that are not
necessarily next to each other in the pipeline, or play with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.
Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!
llvm-svn: 272356
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds ClInstrumentFastpath option to control fastpath instrumentation.
Avoids the load/store instrumentation for the cache fragmentation tool.
Renames cache_frag_basic.ll to working_set_slow.ll for slowpath
instrumentation test.
Adds the __esan_init check in struct_field_count_basic.ll.
Reviewers: aizatsky
Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka
Differential Revision: http://reviews.llvm.org/D21079
llvm-svn: 272355
|
| |
|
|
|
|
| |
Fixes verifier errors after SIShrinkInstructions.
llvm-svn: 272351
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.
Reviewers: scchan, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19994
llvm-svn: 272349
|
| |
|
|
|
|
|
|
| |
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.
llvm-svn: 272345
|
| |
|
|
|
|
|
|
|
|
| |
This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.
There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.
llvm-svn: 272344
|
| |
|
|
| |
llvm-svn: 272343
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We failed to unpoison uninteresting allocas on return as unpoisoning is part of
main instrumentation which skips such allocas.
Added check -asan-instrument-allocas for dynamic allocas. If instrumentation of
dynamic allocas is disabled it will not will not be unpoisoned.
PR27453
Reviewers: kcc, eugenis
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21207
llvm-svn: 272341
|
| |
|
|
|
|
|
| |
Update a test as we're now going to emit it for easier reading of
generated assembly as well.
llvm-svn: 272339
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch, we used argument/global stratified attributes in
order to note that a value could have come from either dereferencing a
global/arg, or from the assignment from a global/arg.
Now, AttrUnknown is placed on sets when we see a dereference, instead of
the global/arg attributes. This allows us to be more aggressive in the
future when we see global/arg attributes without AttrUnknown.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21110
llvm-svn: 272335
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We still want to unpoison full stack even in use-after-return as it can be disabled at runtime.
PR27453
Reviewers: eugenis, kcc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21202
llvm-svn: 272334
|
| |
|
|
|
|
|
|
| |
Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.
Differential revision: http://reviews.llvm.org/D21045
llvm-svn: 272321
|
| |
|
|
| |
llvm-svn: 272319
|
| |
|
|
|
|
| |
Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ
llvm-svn: 272308
|
| |
|
|
| |
llvm-svn: 272307
|
| |
|
|
|
|
|
|
| |
shifts
512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget
llvm-svn: 272300
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently clang emits these instructions via inline (volatile) asm in
the CUDA headers. Switching to intrinsics will let the optimizer reason
across calls to these intrinsics.
Reviewers: tra
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D21160
llvm-svn: 272298
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21090
llvm-svn: 272294
|
| |
|
|
|
|
|
|
|
| |
We were using the fast fdiv lowering for all division, implementation of
IEEE754 fdiv is added.
http://reviews.llvm.org/D20557
llvm-svn: 272292
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offseting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.
This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.
Differential Revision: http://reviews.llvm.org/D20932
llvm-svn: 272283
|
| |
|
|
| |
llvm-svn: 272278
|
| |
|
|
|
|
| |
Thanks to Rafael for the suggestion.
llvm-svn: 272277
|
| |
|
|
|
|
|
|
|
| |
Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG.
Reviewers: arsenm
Differential Revision: http://reviews.llvm.org/D17898
llvm-svn: 272272
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reapplies commit r271930, r271915, r271923. They hit a bug in
Thumb which is fixed in r272258 now.
The original message:
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
llvm-svn: 272267
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables use of the 'S' constraint for inline ASM operands on
SystemZ, which allows for a memory reference with a signed 20-bit
immediate displacement. This patch includes corresponding documentation
and test case updates.
I've changed the 'T' constraint to match the new behavior for 'S', as
'T' also uses a long displacement (though index constraints are still
not implemented). I also changed 'm' to match the behavior for 'S' as
this will allow for a wider range of displacements for 'm', though
correct me if that's not the right decision.
Author: colpell
Differential Revision: http://reviews.llvm.org/D21097
llvm-svn: 272266
|
| |
|
|
| |
llvm-svn: 272264
|
| |
|
|
|
|
|
| |
Fixes a crash in the backend during an LTO build of rtld(1) in
FreeBSD.
llvm-svn: 272262
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D11798
llvm-svn: 272259
|
| |
|
|
|
|
|
|
| |
This is complement patch to D21060.
Differential Revision: http://reviews.llvm.org/D21174
llvm-svn: 272257
|
| |
|
|
|
|
|
|
| |
SELNEZ.* and CMP.condn.fmt instructions
Differential Revision: http://reviews.llvm.org/D20862
llvm-svn: 272256
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D21129
llvm-svn: 272255
|