path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* [ARM] Add support for armv7ve triple in llvm (PR31358). | George Burgess IV | 2017-02-09 | 2 | -0/+27
  GCC supports the target armv7ve, which is armv7-a with virtualization extensions. This change adds support for it in LLVM for GCC compatibility. It also removes the redundant FeatureHWDiv and FeatureHWDivARM for a few models, as these are specified automatically by FeatureVirtualization.
  Patch by Manoj Gupta.
  Differential Revision: https://reviews.llvm.org/D29472
  llvm-svn: 294661
* X86: Introduce relocImm-based patterns for cmp. | Peter Collingbourne | 2017-02-09 | 1 | -0/+39
  Differential Revision: https://reviews.llvm.org/D28690
  llvm-svn: 294636
* AMDGPU: Add pass to expand memcpy/memmove/memset | Matt Arsenault | 2017-02-09 | 1 | -0/+117
  llvm-svn: 294635
* X86: Teach X86InstrInfo::analyzeCompare to recognize compares of symbols. | Peter Collingbourne | 2017-02-09 | 1 | -1/+1
  This requires that we communicate to X86InstrInfo::optimizeCompareInstr that the second operand is neither a register nor an immediate. We do that by setting CmpMask to zero.
  Note that there were already instructions where the second operand was neither a register nor an immediate, namely X86::SUB*rm, so also set CmpMask to zero for those instructions. This seems like a latent bug, but I was unable to trigger it.
  Differential Revision: https://reviews.llvm.org/D28621
  llvm-svn: 294634
* [AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement | Konstantin Zhuravlyov | 2017-02-09 | 1 | -3/+3
  Differential Revision: https://reviews.llvm.org/D29741
  llvm-svn: 294627
* [SelectionDAG] Fix bugs in inverted condition splitting code. | Geoff Berry | 2017-02-09 | 1 | -4/+66
  Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle a non-canonicalized xor 'not' operation correctly (the code was assuming operand 0 was always the non-constant operand), and check that the negated condition is also in the same block as the original and/or instruction (as is already done for and/or operands) before proceeding with the optimization.
  Reviewers: bogner, MatzeB, qcolombet
  Subscribers: mcrosier, uabelho, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29680
  llvm-svn: 294605
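The first of the two fixes above can be illustrated with a small sketch. This is not the SelectionDAGBuilder code: the `Operand` type and the `match_not` function are hypothetical names invented here, and the only point shown is that a 'not' written as an xor with an all-ones constant may carry that constant in either operand position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an operand of a two-operand xor. */
typedef struct {
    int is_all_ones; /* nonzero if the operand is the constant -1 */
    int id;          /* label for a non-constant value */
} Operand;

/* If (lhs xor rhs) is a logical 'not', return the operand being negated;
   otherwise return NULL. Both operand orders are checked. */
const Operand *match_not(const Operand *lhs, const Operand *rhs) {
    if (rhs->is_all_ones && !lhs->is_all_ones)
        return lhs; /* canonical form: x ^ -1 */
    if (lhs->is_all_ones && !rhs->is_all_ones)
        return rhs; /* non-canonical form: -1 ^ x */
    return NULL;
}

/* Small self-check covering both operand orders. */
void check_match_not(void) {
    const Operand x = {0, 7};
    const Operand ones = {1, 0};
    assert(match_not(&x, &ones) == &x);
    assert(match_not(&ones, &x) == &x);
    assert(match_not(&x, &x) == NULL);
}
```

A matcher that only inspected operand 0 would return the constant rather than `x` in the second case, which is the kind of mistake the commit message describes.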
* [X86][BMI2] Regenerate mulx tests | Simon Pilgrim | 2017-02-09 | 2 | -14/+27
  llvm-svn: 294598
* Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" | David Bozier | 2017-02-09 | 1 | -87/+0
  This reverts revision r294590 as it broke some buildbots.
  llvm-svn: 294593
* Add DAGCombiner load combine tests for partially available values | Artur Pilipenko | 2017-02-09 | 5 | -2/+817
  If some of the trailing or leading bytes of a load combine pattern are zeroes, we can combine the pattern to a load + zext and shift. Currently we don't support this, so the tests check the current codegen without load combine. This change will make the patch that supports this kind of combine a bit clearer.
  llvm-svn: 294591
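As a rough illustration of the "load + zext and shift" form mentioned above, the sketch below compares a byte-wise pattern whose two low bytes are zero with a single 16-bit load that is zero-extended and shifted. The function names are made up for this example, and the two forms agree only on a little-endian host, which is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pattern whose two low bytes are zero: only bytes 2 and 3 of the
   32-bit result come from memory. */
uint32_t partial_bytewise(const uint8_t *a) {
    return ((uint32_t)a[0] << 16) | ((uint32_t)a[1] << 24);
}

/* Equivalent form: one 16-bit load, zero-extended, then shifted.
   Matches partial_bytewise only on a little-endian host. */
uint32_t partial_combined(const uint8_t *a) {
    uint16_t half;
    memcpy(&half, a, sizeof half);
    return (uint32_t)half << 16;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int partial_host_is_little_endian(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Self-check: the byte-wise result is fixed; the combined form is only
   compared when the host is little-endian. */
void check_partial(void) {
    const uint8_t buf[2] = {0x34, 0x12};
    assert(partial_bytewise(buf) == 0x12340000u);
    if (partial_host_is_little_endian())
        assert(partial_combined(buf) == partial_bytewise(buf));
}
```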
* [Stack Protection] Add diagnostic information for why stack protection was applied to a function | David Bozier | 2017-02-09 | 1 | -0/+87
  Stack Smash Protection (SSP) is not completely free, so in hot code its overhead can cause performance issues. By adding diagnostic information for which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function.
  This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled.
  Patch by: James Henderson
  Differential Revision: https://reviews.llvm.org/D29023
  llvm-svn: 294590
* [X86][btver2] PR31902: Fix a crash in combineOrCmpEqZeroToCtlzSrl under fast math. | Pierre Gousseau | 2017-02-09 | 1 | -0/+31
  In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants.
  Differential Revision: https://reviews.llvm.org/D29756
  llvm-svn: 294588
* [X86][SSE] Added extra FMA/NO-FMA reciprocal test cases for D26855 | Simon Pilgrim | 2017-02-09 | 2 | -12/+338
  Test for expected codegen for nr reciprocal cases with/without FMA
  llvm-svn: 294587
* [ARM] GlobalISel: Lower single precision FP args | Diana Picus | 2017-02-09 | 2 | -2/+71
  Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets because we don't support libcalls yet.
  llvm-svn: 294584
* [DAGCombiner] Support non-zero offset in load combine | Artur Pilipenko | 2017-02-09 | 5 | -309/+96
  Enable folding patterns which load the value from a non-zero offset:
    i8 *a = ...
    i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
  =>
    i32 val = *((i32*)(a+4))
  Reviewed By: RKSimon
  Differential Revision: https://reviews.llvm.org/D29394
  llvm-svn: 294582
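The fold above can be checked at the source level with a short sketch. It is only an illustration of the equivalence, not of the DAGCombiner implementation: the function names are invented here, `memcpy` stands in for the pointer cast in the commit message to avoid alignment and aliasing problems, and the two forms are equal only on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte form: assembles a 32-bit value from a[4..7],
   least significant byte first. */
uint32_t load_bytewise(const uint8_t *a) {
    return (uint32_t)a[4] | ((uint32_t)a[5] << 8) |
           ((uint32_t)a[6] << 16) | ((uint32_t)a[7] << 24);
}

/* Combined form: a single 32-bit load at byte offset 4. */
uint32_t load_combined(const uint8_t *a) {
    uint32_t v;
    memcpy(&v, a + 4, sizeof v);
    return v;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Self-check: the two forms are compared only on little-endian hosts. */
void check_load_combine(void) {
    const uint8_t buf[8] = {0, 1, 2, 3, 0x78, 0x56, 0x34, 0x12};
    assert(load_bytewise(buf) == 0x12345678u);
    if (host_is_little_endian())
        assert(load_combined(buf) == load_bytewise(buf));
}
```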
* [X86][SSE] Attempt to break register dependencies during lowerBuildVector | Simon Pilgrim | 2017-02-09 | 11 | -70/+70
  LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register.
  This patch attempts to break the register dependency by either always zeroing the vector beforehand or (if we're inserting into the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.
  On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, to avoid the vector zeroing at the cost of a scalar zero extension. This can probably be brought over to the other cases in a future patch in some cases (load folding etc.).
  Differential Revision: https://reviews.llvm.org/D29720
  llvm-svn: 294581
* Add new tests for EXTRACT_VECTOR_ELT (vector of packed i8/16/i32/i64/ps/pd data) | Igor Breger | 2017-02-09 | 1 | -1/+427
  llvm-svn: 294565
* [X86] Clzero intrinsic and its addition under znver1 | Craig Topper | 2017-02-09 | 1 | -0/+23
  This patch does the following.
  1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero
  2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1)
  3. Adds the clzero feature under znver1 architecture.
  4. The custom inserter is added in Lowering.
  5. A testcase is added to check the intrinsic.
  6. The clzero instruction is added to assembler test.
  Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me.
  Differential revision: https://reviews.llvm.org/D29385
  llvm-svn: 294558
* SwiftCC: swifterror register cannot be as the base register | Arnold Schwaighofer | 2017-02-09 | 2 | -161/+161
  Functions that have a dynamic alloca require a base register, which is defined to be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be the same register. Use a different callee save register for swifterror instead:
    X21 on AArch64
    R8 on ARM
  rdar://30433803
  llvm-svn: 294551
* GlobalISel: legalize G_FPOW to a libcall on AArch64. | Tim Northover | 2017-02-08 | 1 | -0/+35
  There's no instruction to implement it.
  llvm-svn: 294531
* GlobalISel: translate @llvm.pow intrinsic to G_FPOW. | Tim Northover | 2017-02-08 | 1 | -0/+11
  It'll usually be immediately legalized back to a libcall, but occasionally something can be done with it, so we might as well enable that flexibility from the start.
  llvm-svn: 294530
* [ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns' | Arnold Schwaighofer | 2017-02-08 | 2 | -0/+35
  We mark X0 as preserved by a call that passes the returned parameter:
    x0 = ...
    fun(x0) // no implicit def of x0
  This is no longer valid if we pass the parameter in a different register than the returned value, as is the case with a swiftself parameter (passed in x20):
    x20 = ...
    fun(x20) // there should be an implicit def of x8
  rdar://30425845
  llvm-svn: 294527
* Revert r294437 as it broke an asan buildbot. | Amara Emerson | 2017-02-08 | 20 | -174/+174
  llvm-svn: 294523
* GlobalISel: select G_[SU]MULH on AArch64. | Tim Northover | 2017-02-08 | 1 | -0/+30
  Hopefully this'll be nuked by tablegen pretty soon, but until then it's reasonably important for supporting C++ operator new[].
  llvm-svn: 294520
* GlobalISel: expand mul-with-overflow into mul-hi on AArch64. | Tim Northover | 2017-02-08 | 1 | -0/+20
  AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply.
  Helps with C++ operator new[].
  llvm-svn: 294519
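The idea behind this expansion can be sketched for the unsigned case: the low half of the product is the result, and the multiply overflowed exactly when the high half is non-zero. The code below is a source-level model at 32 bits with invented function names; it says nothing about which AArch64 instructions are selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* High 32 bits of the full 64-bit product: a stand-in for a
   "multiply high" instruction. */
uint32_t umulh32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Unsigned multiply with overflow: stores the wrapped low half in
   *result and returns true when the high half is non-zero. */
bool umul_with_overflow32(uint32_t a, uint32_t b, uint32_t *result) {
    *result = (uint32_t)((uint64_t)a * b); /* low 32 bits */
    return umulh32(a, b) != 0;
}

/* Self-check: 65535 * 65537 = 2^32 - 1 fits, 65536 * 65536 = 2^32 does not. */
void check_umul_overflow(void) {
    uint32_t r;
    assert(!umul_with_overflow32(65535u, 65537u, &r));
    assert(r == 0xFFFFFFFFu);
    assert(umul_with_overflow32(65536u, 65536u, &r));
    assert(r == 0u);
}
```

The link to `operator new[]` is that an array allocation has to multiply an element count by an element size and detect when that product does not fit.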
* [X86][SSE] Regenerate scalar integer conversions to float tests | Simon Pilgrim | 2017-02-08 | 1 | -100/+661
  llvm-svn: 294499
* GlobalISel: select G_VASTART on iOS AArch64. | Tim Northover | 2017-02-08 | 1 | -0/+16
  The AAPCS ABI is substantially more complicated so that's coming in a separate patch. For now we can generate correct code for iOS though.
  llvm-svn: 294493
* GlobalISel: translate @llvm.va_start intrinsic. | Tim Northover | 2017-02-08 | 1 | -0/+13
  Because we need to preserve the memory access being performed we need a separate instruction to represent this.
  llvm-svn: 294492
* [x86] add AVX512vl target for more coverage; NFC | Sanjay Patel | 2017-02-08 | 1 | -47/+122
  llvm-svn: 294462
* Add test case for pr31890. NFC | Amaury Sechet | 2017-02-08 | 1 | -0/+58
  llvm-svn: 294455
* Fix test to work on swift/cyclone too | Diana Picus | 2017-02-08 | 1 | -1/+1
  I forgot to remove the neonfp target feature from the test, which means we'd have trouble selecting VADDS on targets that have neonfp enabled by default.
  llvm-svn: 294451
* [AMDGPU] Add target information that is required by tools to metadata | Konstantin Zhuravlyov | 2017-02-08 | 4 | -7/+11
  Differential Revision: https://reviews.llvm.org/D28760#fb670e28
  llvm-svn: 294449
* [ARM] GlobalISel: Add FPR reg bank | Diana Picus | 2017-02-08 | 3 | -0/+92
  Add a register bank for floating point values and select simple instructions using them (add, copies from GPR).
  This assumes that the hardware can cope with a single precision add (VADDS) instruction, so the legalizer will treat G_FADD as legal and the instruction selector will refuse to select if the hardware doesn't support it. In the future we'll want to be more careful about this, and legalize to libcalls if we have to use soft float.
  llvm-svn: 294442
* [AArch64][TableGen] Skip tied result operands for InstAlias | Amara Emerson | 2017-02-08 | 20 | -174/+174
  This patch checks the number of operands in the resulting instruction instead of just the alias, then skips over tied operands when generating the printing method.
  This allows us to generate the preferred assembly syntax for the AArch64 'ins' instruction, which should always be displayed as 'mov' according to the ARMARM.
  Several unit tests have changed as a result, but only to reflect the preferred disassembly. Some other InstAlias patterns (movk/bic/orr) needed a slight adjustment to stop them becoming the default and breaking other unit tests.
  Patch by Graham Hunter.
  Differential Revision: https://reviews.llvm.org/D29219
  llvm-svn: 294437
* [AVR] XFAIL a set of failing CodeGen tests | Dylan McKay | 2017-02-08 | 10 | -0/+33
  There are about 3 underlying bugs causing the tests to fail.
  On top of that, some tests just weren't 'generic' enough, i.e. 32-bit registers.
  llvm-svn: 294434
* AMDGPU: Enable InferAddressSpaces | Matt Arsenault | 2017-02-08 | 2 | -22/+22
  llvm-svn: 294408
* [X86] Add test for clflushopt intrinsic and only enable it to be selected if the feature flag is set. | Craig Topper | 2017-02-08 | 1 | -0/+13
  llvm-svn: 294407
* [DAGCombiner] Push truncate through adde when the carry isn't used. | Amaury Sechet | 2017-02-08 | 2 | -9/+5
  Summary: As per title.
  Reviewers: mkuper, spatel, bkramer, RKSimon, zvi
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29528
  llvm-svn: 294394
* [X86][SSE] Add SSE2 build vector insertion tests | Simon Pilgrim | 2017-02-07 | 1 | -105/+285
  llvm-svn: 294365
* [X86][SSE] Add additional v4i32/v8i16/v16i8 build vector insertion tests | Simon Pilgrim | 2017-02-07 | 1 | -0/+272
  With particular interest in cases where we don't make use of implicit zeroing or fail to break register dependencies
  llvm-svn: 294363
* [X86] Disable conditional tail calls (PR31257) | Hans Wennborg | 2017-02-07 | 3 | -140/+4
  They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation).
  Reverting until we figure out the proper solution.
  llvm-svn: 294348
* GlobalISel: translate @llvm.va_end intrinsic. | Tim Northover | 2017-02-07 | 1 | -0/+10
  Turns out no-one actually cares about this one (at least) in tree so we can just drop it entirely.
  llvm-svn: 294345
* [ImplicitNullCheck] Extend Implicit Null Check scope by using stores | Sanjoy Das | 2017-02-07 | 3 | -14/+689
  Summary: This change allows the use of store instructions for implicit null checks.
  Memory aliasing analysis is not used; the change conservatively assumes that any store and load may access the same memory. As a result, re-ordering of store-store, store-load and load-store is prohibited.
  Patch by Serguei Katkov!
  Reviewers: reames, sanjoy
  Reviewed By: sanjoy
  Subscribers: atrick, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29400
  llvm-svn: 294338
* [PowerPC][Altivec] Add vnot extended mnemonic | Nemanja Ivanovic | 2017-02-07 | 1 | -20/+20
  Adds the vnot extended mnemonic for the vnor instruction.
  Committing on behalf of brunoalr (Bruno Rosa).
  Differential Revision: https://reviews.llvm.org/D29225
  llvm-svn: 294330
* [AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track lane masks. | Alexander Timofeev | 2017-02-07 | 1 | -0/+49
  Differential revision: https://reviews.llvm.org/D29442
  llvm-svn: 294324
* [Hexagon] Remove encoding bits from mapped instructions | Krzysztof Parzyszek | 2017-02-07 | 2 | -0/+172
  - Map A2_zxtb to A2_andir.
  - Map PS_call_nr to J2_call.
  - Map A2_tfr[t|f][new] to A2_padd[t|f][new].
  Patch by Colin LeMahieu.
  llvm-svn: 294320
* Add DAGCombiner load combine tests for {a|s}ext, {a|z|s}ext load nodes | Artur Pilipenko | 2017-02-07 | 5 | -4/+786
  Currently we don't support these nodes, so the tests check the current codegen without load combine. This change makes the review of the patch that supports these nodes clearer.
  Separated from the https://reviews.llvm.org/D29591 review.
  llvm-svn: 294305
* [X86][SSE] Generalized integer absolute tests to test canonical pattern as well as intrinsics | Simon Pilgrim | 2017-02-07 | 1 | -6/+31
  llvm-svn: 294300
* [ARM] Make RWPI use movw/movt when available | Christof Douma | 2017-02-07 | 1 | -19/+125
  When constructing global address literals while targeting the RWPI relocation model, LLVM currently only uses literal pools. If MOVW/MOVT instructions are available we can use these instead. Besides being more efficient, this allows -arm-execute-only to work with -relocation-model=RWPI as well.
  When we generate MOVW/MOVT for global addresses when targeting the RWPI relocation model, we need to use base relative relocations. This patch does the needed plumbing in MC to generate these for MOVW/MOVT.
  Differential Revision: https://reviews.llvm.org/D29487
  Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d
  llvm-svn: 294298
* [X86][SSE] Added 256-bit vector tests cases | Simon Pilgrim | 2017-02-07 | 1 | -0/+1151
  Exposes some poor codegen with identity shuffle due to bad interaction with insert_subvector(extract_subvector) / concat_subvectors
  llvm-svn: 294296
* Revert "[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry)" | Daniel Jasper | 2017-02-07 | 1 | -6/+7
  This reverts commit r294186.
  On an internal test, this triggers an out-of-memory error on PPC, presumably because there is another dagcombine that does the exact opposite, triggering an endless loop consuming more and more memory. Chandler has started creating a reduced test case and we'll attach it as soon as possible.
  llvm-svn: 294288