| Commit message (Collapse) | Author | Age | Files | Lines | 
| ... |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Current runtime unrolling invalidates parent loop saying that it might have changed
after the inner loop has changed, but it doesn't bother to do the same to its parents.
With patch rL329047, SCEV becomes much smarter about calculation of exit counts for
outer loops. We might need to invalidate not only the immediate parent, but also
any of its parents as well.
There is no clear evidence that there is some miscompile happening because of this
(at least I don't have such test), but the common sense says that the current code
is wrong.
Differential Revision: https://reviews.llvm.org/D45940
Reviewed By: chandlerc
llvm-svn: 330577
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
In the function `simplifyOneLoop` we optimistically assume that changes in the
inner loop only affect this very loop and have no impact on its parents. In fact,
after rL329047 has been merged, we can now calculate exit counts for outer
loops which may depend on inner loops. Thus, we need to invalidate all parents
when we do something to a loop.
There is an evidence of incorrect behavior of `simplifyOneLoop`: when we insert
`SE->verify()` check in the end of this funciton, it fails on a bunch of existing
test, in particular:
    LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll
    LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll
    LLVM :: Transforms/LoopUnroll/peel-loop.ll
    LLVM :: Transforms/LoopUnroll/peel-loop2.ll
Note that previously we have fixed issues of this variety, see rL328483.
This patch makes this function invalidate the outermost loop properly.
Differential Revision: https://reviews.llvm.org/D45937
Reviewed By: chandlerc
llvm-svn: 330576
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
store instructions.
Reviewers: fhahn, rengolin, javed.absar, SjoerdMeijer, t.p.northover, echristo, evandro, huntergr
Reviewed By: rengolin
Subscribers: tschuett, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D45681
llvm-svn: 330565
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
The condition this was asserting doesn't actually hold. I've added
comments to explain why, removed the assert, and added a fun test case
reduced out of 403.gcc.
llvm-svn: 330564
 | 
| | 
| 
| 
|  | 
llvm-svn: 330563
 | 
| | 
| 
| 
|  | 
llvm-svn: 330560
 | 
| | 
| 
| 
|  | 
llvm-svn: 330558
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: Wrap LLVMDIBuilderCreateAutoVariable, LLVMDIBuilderCreateParameterVariable, LLVMDIBuilderCreateExpression, and move and correct LLVMDIBuilderInsertDeclareBefore and LLVMDIBuilderInsertDeclareAtEnd from the Go bindings to the C bindings.
Reviewers: harlanhaskins, whitequark, deadalnix
Reviewed By: harlanhaskins, whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45928
llvm-svn: 330555
 | 
| | 
| 
| 
| 
| 
|  | 
Fixed a lot of the default classes which were being completely overridden.
llvm-svn: 330554
 | 
| | 
| 
| 
|  | 
llvm-svn: 330553
 | 
| | 
| 
| 
|  | 
llvm-svn: 330552
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This is the last step in getting constant pattern matchers to allow
undef elements in constant vectors.
I'm adding a dedicated m_ZeroInt() function and building m_Zero() from
that. In most cases, calling code can be updated to use m_ZeroInt()
directly when there's no need to match pointers, but I'm leaving that
efficiency optimization as a follow-up step because it's not always
clear when that's ok.
There are just enough icmp folds in InstSimplify that can be used for 
integer or pointer types, that we probably still want a generic m_Zero()
for those cases. Otherwise, we could eliminate it (and possibly add a
m_NullPtr() as an alias for isa<ConstantPointerNull>()).
We're conservatively returning a full zero vector (zeroinitializer) in
InstSimplify/InstCombine on some of these folds (see diffs in InstSimplify),
but I'm not sure if that's actually necessary in all cases. We may be 
able to propagate an undef lane instead. One test where this happens is 
marked with 'TODO'.
 
llvm-svn: 330550
 | 
| | 
| 
| 
|  | 
llvm-svn: 330549
 | 
| | 
| 
| 
|  | 
llvm-svn: 330548
 | 
| | 
| 
| 
| 
| 
|  | 
to remove unnecessary instrw overrides.
llvm-svn: 330546
 | 
| | 
| 
| 
|  | 
llvm-svn: 330545
 | 
| | 
| 
| 
| 
| 
|  | 
This also fixes some of the ReadAfterLd issues due to InstRW.
llvm-svn: 330544
 | 
| | 
| 
| 
| 
| 
|  | 
instrw overrides.
llvm-svn: 330542
 | 
| | 
| 
| 
| 
| 
|  | 
latency.
llvm-svn: 330541
 | 
| | 
| 
| 
| 
| 
| 
|  | 
When a prefix is passed, we need to print a colon a space after it, not
just the prefix.
llvm-svn: 330535
 | 
| | 
| 
| 
| 
| 
|  | 
This matches the other FENCE instructions.
llvm-svn: 330533
 | 
| | 
| 
| 
| 
| 
|  | 
OpSizeFixed.
llvm-svn: 330532
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
'data32' in 16-bit mode. Hack the asm parser to convert 'data32' to 'data16' in 16-bit mode.
Improve the error messages to match GNU assembler.
This also allows us to remove the hack from the disassembler table building.
llvm-svn: 330531
 | 
| | 
| 
| 
| 
| 
|  | 
scheduler models.
llvm-svn: 330527
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Several tools prefix the error/warning/note output with the name of the
tool. One such tool is LLD for example. This commit adds as an optional
'Prefix' argument to the convenience helpers.
llvm-svn: 330526
 | 
| | 
| 
| 
|  | 
llvm-svn: 330525
 | 
| | 
| 
| 
| 
| 
|  | 
models.
llvm-svn: 330523
 | 
| | 
| 
| 
| 
| 
|  | 
VPERM2I128/VINSERTI128
llvm-svn: 330522
 | 
| | 
| 
| 
| 
| 
|  | 
pack/unpack instruction instrw overrides from scheduler models.
llvm-svn: 330521
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
PMULHW/PMULHUW.
Ultimately I want to use this to remove the intrinsics for these instructions.
llvm-svn: 330520
 | 
| | 
| 
| 
|  | 
llvm-svn: 330517
 | 
| | 
| 
| 
| 
| 
|  | 
overrides.
llvm-svn: 330514
 | 
| | 
| 
| 
| 
| 
|  | 
D45629
llvm-svn: 330513
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
from scheduler models.
The required the default skylake schedules to be updated - these were being completely overriden by the InstRW and the existing values not used at all.
llvm-svn: 330510
 | 
| | 
| 
| 
| 
| 
|  | 
scheduler models.
llvm-svn: 330508
 | 
| | 
| 
| 
|  | 
llvm-svn: 330505
 | 
| | 
| 
| 
|  | 
llvm-svn: 330503
 | 
| | 
| 
| 
|  | 
llvm-svn: 330501
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Vectorized loops with abs() returns incorrect results on POWER9. This patch fixes it.
For example the following code returns negative result if input values are negative though it sums up the absolute value of the inputs.
int vpx_satd_c(const int16_t *coeff, int length) {
  int satd = 0;
  for (int i = 0; i < length; ++i) satd += abs(coeff[i]);
  return satd;
}
This problem causes test failures for libvpx.
For vector absolute and vector absolute difference on POWER9, LLVM generates VABSDUW (Vector Absolute Difference Unsigned Word) instruction or variants.
Since these instructions are for unsigned integers, we need adjustment for signed integers.
For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000). Otherwise, abs(sub(-1, 0)) returns 0xFFFFFFFF(=-1) instead of 1. For abs(a), we generate VABSDUW(a+0x80000000, 0x80000000).
Differential Revision: https://reviews.llvm.org/D45522
llvm-svn: 330497
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
In certain cases, the compiler might try to merge __stack_chk_guard with
another global variable.  (Or someone could theoretically define
__stack_chk_guard as an alias.)  In that case, make sure we don't crash.
Differential Revision: https://reviews.llvm.org/D45746
llvm-svn: 330495
 | 
| | 
| 
| 
|  | 
llvm-svn: 330489
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
When creating a call to storeStrong in ObjCARCContract, ensure the call
gets the correct funclet token, otherwise WinEHPrepare will turn the
call (and all subsequent instructions) into unreachable.
We already have logic to do this for the ARC autorelease elision marker;
factor that out into a common function that's used for both. These are
the only two places in this transform that create call instructions.
Differential Revision: https://reviews.llvm.org/D45857
llvm-svn: 330487
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes.
This unearthed a couple of things that are also handled in this patch:
(1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic
(2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015.
(3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops.
Differential Revision: https://reviews.llvm.org/D45629
llvm-svn: 330480
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
Support the dynamic shadow memory offset (the default case for user
space now) and static non-zero shadow memory offset
(-hwasan-mapping-offset option). Keeping the the latter case around
for functionality and performance comparison tests (and mostly for
-hwasan-mapping-offset=0 case).
The implementation is stripped down ASan one, picking only the relevant
parts in the following assumptions: shadow scale is fixed, the shadow
memory is dynamic, it is accessed via ifunc global, shadow memory address
rematerialization is suppressed.
Keep zero-based shadow memory for kernel (-hwasan-kernel option) and
calls instreumented case (-hwasan-instrument-with-calls option), which
essentially means that the generated code is not changed in these cases.
Reviewers: eugenis
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D45840
llvm-svn: 330475
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The callback used to create an ORE for the legacy PI pass caches the allocated
object in a unique_ptr in the runOnModule function, and returns a reference to
that object. Under certian circumstances we can end up holding onto that
reference after the OREs destruction. Rather then allowing the new and legacy
passes to create ORE object in diffrent ways, create the ORE at the point of
use.
Differential Revision: https://reviews.llvm.org/D43219
llvm-svn: 330473
 | 
| | 
| 
| 
|  | 
llvm-svn: 330472
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
There was some unfortunate interaction between VSPLAT and BITCAST
related to the selection of constant vectors (coming from selecting
shuffles). Introduce VSPLATW that always splats a 32-bit word, and
can have arbitrary result type (to avoid BITCASTs of VSPLAT).
Clean up the previous selection of BITCAST/VSPLAT.
llvm-svn: 330471
 | 
| | 
| 
| 
| 
| 
|  | 
NFCI.
llvm-svn: 330470
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Fixed slots have negative values, and TRI::stackSlot2Index and
TRI::index2StackSlot do not handle negative numbers.
llvm-svn: 330468
 | 
| | 
| 
| 
|  | 
llvm-svn: 330465
 |