summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] Reapply: Use __rt_div functions for divrem on WindowsMartin Storsjo2016-10-072-35/+23
| | | | | | | | | | | | | | | | | | | | | | | | Reapplying r283383 after revert in r283442. The additional fix is a getting rid of a stray space in a function name, in the refactoring part of the commit. This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D25332 llvm-svn: 283550
* [ARM]: Add Cortex-R52 target to LLVMJaved Absar2016-10-071-0/+34
| | | | | | | This patch adds Cortex-R52, the new ARM real-time processor, to LLVM. Cortex-R52 implements the ARMv8-R architecture. llvm-svn: 283542
* [X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS ↵Simon Pilgrim2016-10-071-0/+57
| | | | | | | | | | | | | | commutation MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors. This patch inserts a vreg copy mi to ensure that the registers are correct. Fix for PR30607 Differential Revision: https://reviews.llvm.org/D25280 llvm-svn: 283539
* AMDGPU: Fix use-after-free in SIOptimizeExecMaskingNicolai Haehnle2016-10-071-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: There was a bug with sequences like s_mov_b64 s[0:1], exec s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill> ... s_mov_b64_term exec, s[2:3] because s[2:3] was defined and used in the same instruction, ending up with SaveExecInst inside OtherUseInsts. Note that the test case also exposes an unrelated bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028 Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25306 llvm-svn: 283528
* AMDGPU: Change check prefix in testMatt Arsenault2016-10-071-247/+247
| | | | llvm-svn: 283521
* [WebAssemby] Implement block signatures.Dan Gohman2016-10-063-116/+118
| | | | | | | | | Per spec changes, this implements block signatures, and adds just enough logic to produce correct block signatures at the ends of functions. Differential Revision: https://reviews.llvm.org/D25144 llvm-svn: 283503
* [WebAssembly] Remove loop's bottom label.Dan Gohman2016-10-061-17/+31
| | | | | | | | Per spec changes, loop constructs no longer have a bottom label. https://reviews.llvm.org/D25118 llvm-svn: 283502
* [WebAssembly] Remove the output operand from stores.Dan Gohman2016-10-0615-139/+139
| | | | | | | | | Per spec changes, store instructions in WebAssembly no longer have a return value. Update the instruction descriptions. Differential Revision: https://reviews.llvm.org/D25122 llvm-svn: 283501
* Handle *_EXTEND_VECTOR_INREG during Integer LegalizationPirama Arumuga Nainar2016-10-061-0/+137
| | | | | | | | | | | | | | | | Summary: These nodes need legalization for 3-element vectors. This commit handles the legalization and adds tests for zext and sext. This fixes PR30614. Reviewers: RKSimon, srhines Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25268 llvm-svn: 283496
* [X86] Preserve BasePtr for LEA64_32rMichael Kuperstein2016-10-061-0/+44
| | | | | | | | | | | | When replacing FrameIndex with BasePtr, we must preserve BasePtr for LEA64_32r since BasePtr is used later for stack adjustment if it is the same as StackPtr. Patch by H.J Lu <hjl.tools@gmail.com> Differential Revision: https://reviews.llvm.org/D23575 llvm-svn: 283486
* [X86][SSE] Add f16/f80/f128 vector sitofp test casesSimon Pilgrim2016-10-061-0/+206
| | | | | | As discussed on D23808 llvm-svn: 283485
* [DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputsMichael Kuperstein2016-10-064-1025/+691
| | | | | | | | | | | | | | | | This generalizes the build_vector -> vector_shuffle combine to support any number of inputs. The idea is to create a binary tree of shuffles, where the first layer performs pairwise shuffles of the input vectors placing each input element into the correct lane, and the rest of the tree blends these shuffles together. This doesn't try to be smart and create any sort of "optimal" shuffles. The assumption is that even a "poor" shuffle sequence is better than extracting and inserting the elements one by one. Differential Revision: https://reviews.llvm.org/D24683 llvm-svn: 283480
* BranchRelaxation: Support expanding unconditional branchesMatt Arsenault2016-10-064-13/+737
| | | | | | | AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464
* [Hexagon] Avoid replacing full regs with subregisters in tied operandsKrzysztof Parzyszek2016-10-061-0/+23
| | | | | | Doing so will result in the two-address pass generating incorrect code. llvm-svn: 283463
* BranchRelaxation: Account for function alignmentMatt Arsenault2016-10-061-0/+29
| | | | llvm-svn: 283462
* Move AArch64BranchRelaxation to generic codeMatt Arsenault2016-10-063-3/+3
| | | | llvm-svn: 283459
* Revert "[ARM] Use __rt_div functions for divrem on Windows"Diana Picus2016-10-062-23/+35
| | | | | | | | | | This reverts commit r283383 because it broke some of the bots: undefined reference to ` __aeabi_uldivmod' It affected (at least) clang-cmake-armv7-a15-selfhost, clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt. llvm-svn: 283442
* Add test-cases which demontrate pr30561Zvi Rackover2016-10-061-0/+26
| | | | llvm-svn: 283436
* [ARM] Constant pool promotion - fix alignment calculationJames Molloy2016-10-061-0/+10
| | | | | | | | | Global variables are GlobalValues, so they have explicit alignment. Querying DataLayout for the alignment was incorrect. Testcase added. llvm-svn: 283423
* [ARM] Improve testcase for r283323James Molloy2016-10-061-2/+1
| | | | | | | | | We can work around a shortcoming of FileCheck by using {{\[}} to match a square bracket before a [[ sequence. Thanks to Eli Friedman for the heads up! llvm-svn: 283422
* [AMDGPU] Promote uniform i16 bitreverse intrinsic to i32Konstantin Zhuravlyov2016-10-061-416/+600
| | | | | | Differential Revision: https://reviews.llvm.org/D25121 llvm-svn: 283415
* [DAG] add tests to show missing checks for SDNode FMFSanjay Patel2016-10-051-4/+111
| | | | | | The AVX attribute is added to remove noise caused by SSE's destructive insts. llvm-svn: 283410
* Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace.Adrian Prantl2016-10-051-49/+41
| | | | | | | | | | | | | This came out of a discussion in https://reviews.llvm.org/D25285. There used to be various other llvm.dbg.* nodes, but we don't support upgrading them and we want to reserve the namespace for future uses. This also removes an entirely obsolete and bitrotted testcase for PR7662. Reapplies 283390 with a forgotten testcase. llvm-svn: 283400
* Revert "Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace."Adrian Prantl2016-10-051-41/+49
| | | | | | Forgot to add a testcase in r283390. llvm-svn: 283399
* [DAG] change test to use 'unsafe' function attribute instead of global settingSanjay Patel2016-10-051-4/+11
| | | | | | But we have node-level FMF, so the next step is to fix this at the instruction/node-level. llvm-svn: 283393
* Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace.Adrian Prantl2016-10-051-49/+41
| | | | | | | | | | | This came out of a discussion in https://reviews.llvm.org/D25285. There used to be various other llvm.dbg.* nodes, but we don't support upgrading them and we want to reserve the namespace for future uses. This also removes an entirely obsolete and bitrotted testcase for PR7662. llvm-svn: 283390
* [ARM] Use __rt_div functions for divrem on WindowsMartin Storsjo2016-10-052-35/+23
| | | | | | | | | | | | | | | | | | | | This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D24076 llvm-svn: 283383
* [Sparc] Implement UMUL_LOHI and SMUL_LOHI instead of MULHS/MULHU/MUL.James Y Knight2016-10-051-4/+2
| | | | | | | This is what the instruction-set actually provides, and the default expansions of the others into the lohi opcodes are good. llvm-svn: 283381
* Improve the debug-info test created in r274263.Yunzhong Gao2016-10-051-3/+20
| | | | | | | | | | | | | This patch is related to r274263 or Phabricator/D21818. This patch aims to improve the test case added in the previous commit to verify specifically that the stack protector pass is adding the debug line info as intended. Before, the test only verified that the verifier pass does not crash. The current approach is to generate the assembly output and then look for the .loc directive. Differential Revision: https://reviews.llvm.org/D25290 llvm-svn: 283374
* [RDF] Fix live def propagation through basic blockKrzysztof Parzyszek2016-10-051-0/+214
| | | | llvm-svn: 283371
* [DAG] Teach computeKnownBits and ComputeNumSignBits in SelectionDAG to look ↵Bjorn Pettersson2016-10-053-3/+23
| | | | | | | | | | | | | | | | | through EXTRACT_VECTOR_ELT. Summary: Both computeKnownBits and ComputeNumSignBits can now do a simple look-through of EXTRACT_VECTOR_ELT. It will compute the result based on the known bits (or known sign bits) for the vector that the element is extracted from. Reviewers: bogner, tstellarAMD, mkuper Subscribers: wdng, RKSimon, jyknight, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D25007 llvm-svn: 283347
* Test commit permission. NFCBjorn Pettersson2016-10-051-1/+1
| | | | llvm-svn: 283346
* Revert r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions ↵Hans Wennborg2016-10-051-25/+2
| | | | | | | | | (PR26302)" This is suspected to cause a miscompile in Chromium. Reverting while investigating. llvm-svn: 283329
* [Thumb] Don't try and emit LDRH/LDRB from the constant poolJames Molloy2016-10-051-0/+22
| | | | | | | | This is not a valid encoding - these instructions cannot do PC-relative addressing. The underlying problem here is of whitelist in ARMISelDAGToDAG that unwraps ARMISD::Wrappers during addressing-mode selection. This didn't realise TargetConstantPool was actually possible, so didn't handle it. llvm-svn: 283323
* Test commit permissionOren Ben Simhon2016-10-051-1/+1
| | | | llvm-svn: 283319
* Fix machine operand traversal in ScheduleDAGInstrs::fixupKillsKrzysztof Parzyszek2016-10-051-0/+37
| | | | llvm-svn: 283315
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-0518-340/+47
| | | | | | | | | | This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-0418-47/+340
| | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274
* [Target] move reciprocal estimate settings from TargetOptions to TargetLoweringSanjay Patel2016-10-044-204/+297
| | | | | | | | | | | | | | | | | | | | | | | | | | The motivation for the change is that we can't have pseudo-global settings for codegen living in TargetOptions because that doesn't work with LTO. Ideally, these reciprocal attributes will be moved to the instruction-level via FMF, metadata, or something else. But making them function attributes is at least an improvement over the current state. The ingredients of this patch are: Remove the reciprocal estimate command-line debug option. Add TargetRecip to TargetLowering. Remove TargetRecip from TargetOptions. Clean up the TargetRecip implementation to work with this new scheme. Set the default reciprocal settings in TargetLoweringBase (everything is off). Update the PowerPC defaults, users, and tests. Update the x86 defaults, users, and tests. Note that if this patch needs to be reverted, the related clang patch checked in at r283251 should be reverted too. Differential Revision: https://reviews.llvm.org/D24816 llvm-svn: 283252
* AArch64: Macrofusion: Split features, add missing combinations.Matthias Braun2016-10-041-1/+1
| | | | | | | | | | | | | | | | AArch64InstrInfo::shouldScheduleAdjacent() determines whether two instruction can benefit from macroop fusion on apple CPUs. The list turned out to be incomplete: - the "rr" variants of the instructions were missing - even the "rs" variants can have shift value == 0 and behave like the "rr" variants This also splits the MacropFusion target feature into ArithmeticBccFusion and ArithmeticCbzFusion. Differential Revision: https://reviews.llvm.org/D25142 llvm-svn: 283243
* [Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register setNemanja Ivanovic2016-10-0414-33/+259
| | | | | | | | | | | | | This patch corresponds to review: The newly added VSX D-Form (register + offset) memory ops target the upper half of the VSX register set. The existing ones target the lower half. In order to unify these and have the ability to target all the VSX registers using D-Form operations, this patch defines Pseudo-ops for the loads/stores which are expanded post-RA. The expansion then choses the correct opcode based on the register that was allocated for the operation. llvm-svn: 283212
* [mips][fastisel] Consider soft-float an unsupported floating point modeSimon Dardis2016-10-041-0/+11
| | | | | | | | | | | Treat soft-float as unsupported for fast-isel. Additionally, ensure we check that lowering f32 arguments also considers the case of soft-float mode. Reviewers: ehostunreach, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24505 llvm-svn: 283209
* Fix a test case failure on Apple PPC.Nemanja Ivanovic2016-10-041-4/+4
| | | | llvm-svn: 283191
* [Power9] Part-word VSX integer scalar loads/stores and sign extend instructionsNemanja Ivanovic2016-10-0415-238/+1307
| | | | | | | | | | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D23155 This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes: Int to Fp conversions of 1 or 2-byte values loaded from memory Building vectors of 1 or 2-byte integers with values loaded from memory Storing individual 1 or 2-byte elements from integer vectors This patch implements all of those uses. llvm-svn: 283190
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-0417-271/+47
| | | | | | | | This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-0417-47/+271
| | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164
* Set some tests to an unknown vendor and OSMatthias Braun2016-10-038-11/+11
| | | | | | | | This avoids llc using the hosts OS/vendor as defaults and triggering unwanted behaviour in the tests. This should deal with the buildbot breakages on windows after r283140. llvm-svn: 283149
* [RDF] Fix liveness propagation through shadowsKrzysztof Parzyszek2016-10-031-0/+73
| | | | | | | | Each shadow only represents data flow that is restricted to its reaching def. Propagating more than that could lead to spurious register liveness, resulting in extra (incorrectly) block live-ins. llvm-svn: 283143
* X86: Do not produce GOT relocations on windowsMatthias Braun2016-10-031-0/+27
| | | | | | | | | | Windows has no GOT relocations the way elf/darwin has. Some people use x86_64-pc-win32-macho to build EFI firmware; Do not produce GOT relocations for this target. Differential Revision: https://reviews.llvm.org/D24627 llvm-svn: 283140
* [AMDGPU] Sign extend AShr when promoting (instead of zero extending)Konstantin Zhuravlyov2016-10-031-8/+8
| | | | llvm-svn: 283130
OpenPOWER on IntegriCloud