| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gcc supports target armv7ve which is armv7-a with virtualization
extensions. This change adds support for this in llvm for gcc
compatibility.
Also remove redundant FeatureHWDiv, FeatureHWDivARM for a few models as
this is specified automatically by FeatureVirtualization.
Patch by Manoj Gupta.
Differential Revision: https://reviews.llvm.org/D29472
llvm-svn: 294661
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28690
llvm-svn: 294636
|
| |
|
|
| |
llvm-svn: 294635
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This requires that we communicate to X86InstrInfo::optimizeCompareInstr
that the second operand is neither a register nor an immediate. The way we
do that is by setting CmpMask to zero.
Note that there were already instructions where the second operand was not a
register nor an immediate, namely X86::SUB*rm, so also set CmpMask to zero
for those instructions. This seems like a latent bug, but I was unable to
trigger it.
Differential Revision: https://reviews.llvm.org/D28621
llvm-svn: 294634
|
| |
|
|
|
|
|
|
| |
using switch statement
Differential Revision: https://reviews.llvm.org/D29741
llvm-svn: 294627
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by
Mikael Holmen. Handle non-canonicalized xor not operation
correctly (was assuming operand 0 was always the non-constant operand)
and check that the negated condition is also in the same block as the
original and/or instruction (as is done for and/or operands already)
before proceeding with optimization.
Reviewers: bogner, MatzeB, qcolombet
Subscribers: mcrosier, uabelho, llvm-commits
Differential Revision: https://reviews.llvm.org/D29680
llvm-svn: 294605
|
| |
|
|
| |
llvm-svn: 294598
|
| |
|
|
|
|
|
|
| |
protection was applied to a function"
this reverts revision r294590 as it broke some buildbots.
llvm-svn: 294593
|
| |
|
|
|
|
| |
If some of the trailing or leading bytes of a load combine pattern are zeroes we can combine the pattern to a load + zext and shift. Currently we don't support it, so the tests check the current codegen without load combine. This change will make the patch to support this kind of combine a bit more clear.
llvm-svn: 294591
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
applied to a function
Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which function have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function.
This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled.
Patch by: James Henderson
Differential Revision: https://reviews.llvm.org/D29023
llvm-svn: 294590
|
| |
|
|
|
|
|
|
|
|
| |
math.
In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants.
Differential Revision: https://reviews.llvm.org/D29756
llvm-svn: 294588
|
| |
|
|
|
|
| |
Test for expected codegen for nr reciprocal cases with/without FMA
llvm-svn: 294587
|
| |
|
|
|
|
|
| |
Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets
because we don't support libcalls yet.
llvm-svn: 294584
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable folding patterns which load the value from non-zero offset:
i8 *a = ...
i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
=>
i32 val = *((i32*)(a+4))
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D29394
llvm-svn: 294582
|
| |
|
|
|
|
|
|
|
|
|
|
| |
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into a UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register.
This patch attempts to break the register dependency by either always zeroing the vector before hand or (if we're inserting to the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))) which lowers to (V)MOVD and performs a similar function. Additionally (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.
On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros to avoid the vector zeroing at the cost of a scalar zero extension, which can probably be brought over to the other cases in a future patch in some cases (load folding etc.)
Differential Revision: https://reviews.llvm.org/D29720
llvm-svn: 294581
|
| |
|
|
| |
llvm-svn: 294565
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch does the following.
1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero
2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1)
3. Adds the clzero feature under znver1 architecture.
4. The custom inserter is added in Lowering.
5. A testcase is added to check the intrinsic.
6. The clzero instruction is added to assembler test.
Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me.
Differential revision: https://reviews.llvm.org/D29385
llvm-svn: 294558
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Functions that have a dynamic alloca require a base register which is defined to
be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be
the same register. Use a different callee save register for swifterror instead:
X21 on AArch64
R8 on ARM
rdar://30433803
llvm-svn: 294551
|
| |
|
|
|
|
| |
There's no instruction to implement it.
llvm-svn: 294531
|
| |
|
|
|
|
|
|
| |
It'll usually be immediately legalized back to a libcall, but occasionally
something can be done with it so we'd just as well enable that flexibility from
the start.
llvm-svn: 294530
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'this returns'
We mark X0 as preserved by a call that passes the returned parameter.
x0 = ...
fun(x0) // no implicit def of x0
This no longer is valid if we pass the parameter in a different register then
the returned value as is the case with a swiftself parameter (passed in x20).
x20 = ...
fun(x20) // there should be an implict def of x8
rdar://30425845
llvm-svn: 294527
|
| |
|
|
| |
llvm-svn: 294523
|
| |
|
|
|
|
|
| |
Hopefully this'll be nuked by tablegen pretty soon, but until then it's
reasonably important for supporting C++ operator new[].
llvm-svn: 294520
|
| |
|
|
|
|
|
|
| |
AArch64 has specific instructions to multiply two numbers at double the width
and produce the high part of the result. These can be used to implement LLVM's
mul.with.overflow instructions fairly simply. Helps with C++ operator new[].
llvm-svn: 294519
|
| |
|
|
| |
llvm-svn: 294499
|
| |
|
|
|
|
|
| |
The AAPCS ABI is substantially more complicated so that's coming in a separate
patch. For now we can generate correct code for iOS though.
llvm-svn: 294493
|
| |
|
|
|
|
|
| |
Because we need to preserve the memory access being performed we need a
separate instruction to represent this.
llvm-svn: 294492
|
| |
|
|
| |
llvm-svn: 294462
|
| |
|
|
| |
llvm-svn: 294455
|
| |
|
|
|
|
|
| |
I forgot to remove the neonfp target feature from the test, which means we'd
have trouble selecting VADDS on targets that have neonfp enabled by default.
llvm-svn: 294451
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28760#fb670e28
llvm-svn: 294449
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add a register bank for floating point values and select simple instructions
using them (add, copies from GPR).
This assumes that the hardware can cope with a single precision add (VADDS)
instruction, so the legalizer will treat G_FADD as legal and the instruction
selector will refuse to select if the hardware doesn't support it. In the future
we'll want to be more careful about this, and legalize to libcalls if we have to
use soft float.
llvm-svn: 294442
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch checks the number of operands in the resulting
instruction instead of just the alias, then skips over
tied operands when generating the printing method.
This allows us to generate the preferred assembly syntax
for the AArch64 'ins' instruction, which should always be
displayed as 'mov' according to the ARMARM.
Several unit tests have changed as a result, but only to
reflect the preferred disassembly.
Some other InstAlias patterns (movk/bic/orr) needed a
slight adjustment to stop them becoming the default
and breaking other unit tests.
Patch by Graham Hunter.
Differential Revision: https://reviews.llvm.org/D29219
llvm-svn: 294437
|
| |
|
|
|
|
|
|
|
| |
There are about 3 underlying bugs causing the tests to fail.
On top of that, some tests just we're 'generic' enough. i.e. 32-bit
registers.
llvm-svn: 294434
|
| |
|
|
| |
llvm-svn: 294408
|
| |
|
|
|
|
| |
the feature flag is set.
llvm-svn: 294407
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: As per title.
Reviewers: mkuper, spatel, bkramer, RKSimon, zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29528
llvm-svn: 294394
|
| |
|
|
| |
llvm-svn: 294365
|
| |
|
|
|
|
| |
With particular interest in cases where we don't make use of implicit zeroing or fail to break register dependencies
llvm-svn: 294363
|
| |
|
|
|
|
|
|
|
| |
They are currently modelled incorrectly (as calls, which clobber
registers, confusing e.g. Machine Copy Propagation).
Reverting until we figure out the proper solution.
llvm-svn: 294348
|
| |
|
|
|
|
|
| |
Turns out no-one actually cares about this one (at least) in tree so we can
just drop it entirely.
llvm-svn: 294345
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change allows usage of store instruction for implicit null check.
Memory Aliasing Analisys is not used and change conservatively supposes
that any store and load may access the same memory. As a result
re-ordering of store-store, store-load and load-store is prohibited.
Patch by Serguei Katkov!
Reviewers: reames, sanjoy
Reviewed By: sanjoy
Subscribers: atrick, llvm-commits
Differential Revision: https://reviews.llvm.org/D29400
llvm-svn: 294338
|
| |
|
|
|
|
|
|
|
|
| |
Adds the vnot extended mnemonic for the vnor instruction.
Committing on behalf of brunoalr (Bruno Rosa).
Differential Revision: https://reviews.llvm.org/D29225
llvm-svn: 294330
|
| |
|
|
|
|
|
|
| |
lane masks.
Differential revision: https://reviews.llvm.org/D29442
llvm-svn: 294324
|
| |
|
|
|
|
|
|
|
|
| |
- Map A2_zxtb to A2_andir.
- Map PS_call_nr J2_call.
- Map A2_tfr[t|f][new] to A2_padd[t|f][new].
Patch by Colin LeMahieu.
llvm-svn: 294320
|
| |
|
|
|
|
|
|
| |
Currently we don't support these nodes, so the tests check the current codegen without load combine. This change makes the review of the change to support these nodes more clear.
Separated from https://reviews.llvm.org/D29591 review.
llvm-svn: 294305
|
| |
|
|
|
|
| |
well as intrinsics
llvm-svn: 294300
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When constructing global address literals while targeting the RWPI
relocation model. LLVM currently only uses literal pools. If MOVW/MOVT
instructions are available we can use these instead. Beside being more
efficient it allows -arm-execute-only to work with
-relocation-model=RWPI as well.
When we generate MOVW/MOVT for global addresses when targeting the RWPI
relocation model, we need to use base relative relocations. This patch
does the needed plumbing in MC to generate these for MOVW/MOVT.
Differential Revision: https://reviews.llvm.org/D29487
Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d
llvm-svn: 294298
|
| |
|
|
|
|
| |
Exposes some poor codegen with identity shuffle due to bad interaction with insert_subvector(extract_subvector) / concat_subvectors
llvm-svn: 294296
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r294186.
On an internal test, this triggers an out-of-memory error on PPC,
presumably because there is another dagcombine that does the exact
opposite triggering and endless loop consuming more and more memory.
Chandler has started at creating a reduced test case and we'll attach it
as soon as possible.
llvm-svn: 294288
|