| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
Reverting
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than 16/64
bytes, depending on the ABI.
as it broke clang-with-lto-ubuntu builder.
llvm-svn: 302555
|
| |
|
|
| |
llvm-svn: 302554
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
--This line, and those below, will be igored--
A utils/vscode
A utils/vscode/README
A utils/vscode/tablegen
A utils/vscode/tablegen/.vscode
A utils/vscode/tablegen/.vscode/launch.json
A utils/vscode/tablegen/CHANGELOG.md
A utils/vscode/tablegen/README.md
A utils/vscode/tablegen/language-configuration.json
A utils/vscode/tablegen/package.json
A utils/vscode/tablegen/syntaxes
A utils/vscode/tablegen/syntaxes/TableGen.tmLanguage
A utils/vscode/tablegen/vsc-extension-quickstart.md
llvm-svn: 302553
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way we currently define congruency for two PHIExpression(s) is:
1) The operands to the phi functions are congruent
2) The PHIs are defined in the same BasicBlock.
NewGVN works under the assumption that phi operands are in predecessor
order, or at least in some consistent order. OTOH, is valid IR:
patatino:
%meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ]
%banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ]
br label %end
and the in-memory representations of the two SSA registers have an
inconsistent order. This violation of NewGVN assumptions results into
two PHIs found congruent when they're not. While we think it's useful
to have always a consistent order enforced, let's fix this in NewGVN
sorting uses in predecessor order before creating a PHI expression.
Differential Revision: https://reviews.llvm.org/D32990
llvm-svn: 302552
|
| |
|
|
|
|
|
|
| |
The description says it returns the number of words needed to represent the results. But the way it was coded it always returns (lhsWords + rhsWords) or (lhsWords + rhsWords - 1). But the result could be even smaller than that and it wouldn't tell you.
No one uses the result today so rather than try to fix it, just remove it.
llvm-svn: 302551
|
| |
|
|
| |
llvm-svn: 302550
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds more patterns that a reasonable person might write that can be compiled to BZHI.
This adds support for
(~0U >> (32 - b)) & a;
and
a << (32 - b) >> (32 - b);
This was inspired by the code in APInt::clearUnusedBits.
This can pass an index of 32 to the bzhi instruction which a quick test of Haswell hardware shows will not mask any bits. Though the description text in the Intel manual says the "index is saturated to OperandSize-1". The pseudocode in the same manual indicates no bits will be zeroed for this case.
I think this is still missing cases where the subtract portion is an 8-bit operation.
Differential Revision: https://reviews.llvm.org/D32616
llvm-svn: 302549
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment says to avoid the case where zero bits are shifted into the truncated value,
but the code checks that the shift is smaller than the truncated value instead of the
number of bits added by the sign extension. Fixing this allows a shift by more than the
value size to be introduced, which is undefined behavior, so the shift is capped at the
value size minus one, which has the expected behavior of filling the value with the sign
bit.
Patch by Jacob Young!
Differential Revision: https://reviews.llvm.org/D32285
llvm-svn: 302548
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Modified MipsABIInfo::classifyArgumentType so that it now coerces aggregate
structures only if the size of said aggregate is less than 16/64 bytes,
depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
llvm-svn: 302547
|
| |
|
|
|
|
|
|
|
| |
for scalar masked instructions only the lower bit of the mask is relevant. so for constant masks we should either do an unmasked operation or no operation, depending on the value of the lower bit.
This patch handles cases where the lower bit is '1'.
Differential Revision: https://reviews.llvm.org/D32805
llvm-svn: 302546
|
| |
|
|
|
|
| |
rdar://32074504
llvm-svn: 302545
|
| |
|
|
|
|
| |
This re-lands r302483. It was not the cause of PR32977.
llvm-svn: 302544
|
| |
|
|
|
|
| |
This re-lands commit r302461. It was not the cause of PR32977.
llvm-svn: 302543
|
| |
|
|
|
|
| |
That test update was for r302469, which was reverted in r302533 due to PR32977.
llvm-svn: 302542
|
| |
|
|
| |
llvm-svn: 302541
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
`__builtin_available`
This commit allows us to use the macOS/iOS/tvOS/watchOS platform names in
`@available`/`__builtin_available`.
rdar://32067795
Differential Revision: https://reviews.llvm.org/D33000
llvm-svn: 302540
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used
for determining those two booleans.
The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.
Differential Revision: https://reviews.llvm.org/D32762
llvm-svn: 302539
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change optimizes several aspects of the checksum used for chunk headers.
First, there is no point in checking the weak symbol `computeHardwareCRC32`
everytime, it will either be there or not when we start, so check it once
during initialization and set the checksum type accordingly.
Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
not optimized out, while not necessary. So I reshuffled that part of the code,
which duplicates a tiny bit of code, but ends up in a much cleaner assembly
(and faster as we avoid an extraneous load and some calls).
The following code is the checksum at the end of `scudoMalloc` for x86_64 with
full SSE 4.2, before:
```
mov rax, 0FFFFFFFFFFFFFFh
shl r10, 38h
mov edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and r14, rax
lea rsi, [r13-10h]
movzx eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or r14, r10
mov rbx, r14
xor bx, bx
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov rsi, rbx
mov edi, eax
call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov r14w, ax
mov rax, r13
mov [r13-10h], r14
```
After:
```
mov rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea rcx, [rbx-10h]
mov rdx, 0FFFFFFFFFFFFFFh
and r14, rdx
shl r9, 38h
or r14, r9
crc32 eax, rcx
mov rdx, r14
xor dx, dx
mov eax, eax
crc32 eax, rdx
mov r14w, ax
mov rax, rbx
mov [rbx-10h], r14
```
Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
|
| |
|
|
| |
llvm-svn: 302537
|
| |
|
|
| |
llvm-svn: 302536
|
| |
|
|
|
|
| |
is not defined, then ARMGenRegisterBank.inc is not table generated and the inclusion of this header causes the build to fail.
llvm-svn: 302535
|
| |
|
|
| |
llvm-svn: 302534
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DISubprogram"
This caused PR32977.
Original commit message:
> Make it illegal for two Functions to point to the same DISubprogram
>
> As recently discussed on llvm-dev [1], this patch makes it illegal for
> two Functions to point to the same DISubprogram and updates
> FunctionCloner to also clone the debug info of a function to conform
> to the new requirement. To simplify the implementation it also factors
> out the creation of inlineAt locations from the Inliner into a
> general-purpose utility in DILocation.
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
> <rdar://problem/31926379>
>
> Differential Revision: https://reviews.llvm.org/D32975
llvm-svn: 302533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In first order recurrence vectorization, when the previous value is a phi node, we need to
set the insertion point to the first non-phi node.
We can have the previous value being a phi node, due to the generation of new
IVs as part of trunc optimization [1].
[1] https://reviews.llvm.org/rL294967
Reviewers: mssimpso, mkuper
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32969
llvm-svn: 302532
|
| |
|
|
|
|
| |
MSVC 2015); NFC.
llvm-svn: 302531
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This should significantly improve darwin lsan performance in cases where root regions are not used.
Reviewers: alekseyshl, kubamracek
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32966
llvm-svn: 302530
|
| |
|
|
|
|
|
|
| |
The modified tests should test the masked intrinsics.
Currently the mask is constant, which with a future patch (https://reviews.llvm.org/D32805) will cause the intrinsics to be replaced with an unmasked version.
This patch changes the constant mask to be a variable one.
llvm-svn: 302529
|
| |
|
|
|
|
|
|
| |
This is a bit easier to read and a lot faster in some cases. A version
of many-sections.s with linker scripts goes from 0m41.232s to
0m19.575s.
llvm-svn: 302528
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.
This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.
The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
affects all targets that use frame pseudo instructions and touched many
files although the changes are uniform.
- Access to frame properties are implemented using special instructions
rather than calls getOperand(N).getImm(). For X86 and ARM such
replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
instruction. These involve proper instruction initialization and
methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
frame parts initialized inside frame instruction pair and outside it.
The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.
Differential Revision: https://reviews.llvm.org/D32394
llvm-svn: 302527
|
| |
|
|
|
|
| |
This reverts commit rL302512. This broke the mips buildbots.
llvm-svn: 302526
|
| |
|
|
|
|
|
|
|
|
| |
Similar to what we do for vXi8 ASHR(X, 7), use SSE42's PCMPGTQ to splat the sign instead of using the PSRAD+PSHUFD.
Avoiding bitcasts this improves combines that utilize computeNumSignBits, permits memory folding and reduces pipe pressure. Although it does require a second register, given that this is a (cheap) zero register the impact is minimal.
Differential Revision: https://reviews.llvm.org/D32973
llvm-svn: 302525
|
| |
|
|
|
|
| |
This reverts commit r302520 because it break the unit tests.
llvm-svn: 302524
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
TypeScript uses triple slash directives of the form:
/// <reference path="..."/>
For various non-source instructions that should not be wrapped.
Reference:
https://www.typescriptlang.org/docs/handbook/triple-slash-directives.html
Reviewers: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D32997
llvm-svn: 302523
|
| |
|
|
| |
llvm-svn: 302522
|
| |
|
|
|
|
|
|
| |
https://reviews.llvm.org/D31867
Patch by Johannes Altmanninger!
llvm-svn: 302521
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no other explanation about why this only started happening
now, even though it crashes on old code (supposedly reachable from
here).
The only common factor between the failing bots is that they use GCC
(4.9 and 5.3) to compile Clang, while the others use Clang 3.8, but the
failure is while building the tests, as an assertion, on Clang.
Commenting it out for now in hope the bots will go back green, but we
should keep looking for the real cause, and update bugzilla.
llvm-svn: 302520
|
| |
|
|
| |
llvm-svn: 302519
|
| |
|
|
|
|
|
| |
This patch reinstates r299930, reverted in r299956, as a separate diagnostic
option (-Wunused-template).
llvm-svn: 302518
|
| |
|
|
|
|
| |
Adapt to changes made in r302499.
llvm-svn: 302517
|
| |
|
|
|
|
| |
Adapt to changes made in r302499.
llvm-svn: 302516
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases).
Reviewers: grosser, Meinersbur, bollu
Reviewed By: bollu
Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D32961
llvm-svn: 302515
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
- This change allows targets to opt-in to using them instead of the log2
shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
factored out into LoopUtils, and now have a unified interface for generating
reductions regardless of the preference of the target. LoopUtils now uses TTI
to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.
Differential Revision: https://reviews.llvm.org/D30086
llvm-svn: 302514
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: Eugene.Zelenko, dschuff, craig.topper
Reviewed By: craig.topper
Subscribers: ahatanak, aaboud, DavidKreitzer, llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D32543
Patch by AndreiGrischenko <andrei.l.grischenko@intel.com>
llvm-svn: 302513
|
| |
|
|
|
|
|
|
|
| |
This patch adds support for recognizing patterns to match
DINS instruction.
Differential Revision: https://reviews.llvm.org/D31465
llvm-svn: 302512
|
| |
|
|
|
|
|
| |
Remove the code selecting G_FADD - now that TableGen can handle more
opcodes, it's not needed anymore.
llvm-svn: 302511
|
| |
|
|
|
|
| |
avoid creating the min APInt until we're sure we need it. Use inplace shift operations.
llvm-svn: 302510
|
| |
|
|
| |
llvm-svn: 302509
|
| |
|
|
|
|
| |
compiler should be able to generate slightly better code for the former. NFC
llvm-svn: 302508
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FunctionDecl when instantiating the exception specification.
This fixes the bug: https://bugs.llvm.org/show_bug.cgi?id=32638
int main()
{
[](auto x) noexcept(noexcept(x)) { } (0);
}
In the above code, prior to this patch, when substituting into the noexcept expression, i.e. transforming the DeclRefExpr that represents 'x' - clang attempts to capture 'x' because Sema's CurContext is still pointing to the pattern FunctionDecl (i.e. the templated-decl set in FinishTemplateArgumentDeduction) which does not match the substituted 'x's DeclContext, which leads to an attempt to capture and an assertion failure.
We fix this by adjusting Sema's CurContext to point to the substituted FunctionDecl under which the noexcept specifier's argument should be transformed, and so the ParmVarDecl that 'x' refers to has the same declcontext and no capture is attempted.
I briefly investigated whether the SwitchContext should occur right after VisitMethodDecl creates the new substituted FunctionDecl, instead of only during instantiating the exception specification - but seeing no other code that seemed to rely on that, I decided to leave it just for the duration of the exception specification instantiation.
llvm-svn: 302507
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were sometimes doing a function->pointer conversion in
Sema::CheckPlaceholderExpr, which isn't the job of CheckPlaceholderExpr.
So, when we saw typeof(OverloadedFunctionName), where
OverloadedFunctionName referenced a name with only one function that
could have its address taken, we'd give back a function pointer type
instead of a function type. This is incorrect.
I kept the logic for doing the function pointer conversion in
resolveAndFixAddressOfOnlyViableOverloadCandidate because it was more
consistent with existing ResolveAndFix* methods.
llvm-svn: 302506
|