summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [mips][microMIPS] Add CodeGen support for ADD, ADDIU*, ADDU* and DADD* ↵Zlatko Buljan2016-04-081-8/+317
| | | | | | | | instructions Differential Revision: http://reviews.llvm.org/D16454 llvm-svn: 265772
* [IFUNC] Fix ifunc-asm.ll testDmitry Polukhin2016-04-081-0/+15
| | | | | | | It seems that llc cannot be called used in assembler tests so test that checks asm for particular target needs to be moved to codegen. llvm-svn: 265770
* Do not select EhPad BB in MachineBlockPlacement when there is regular BB to ↵Amaury Sechet2016-04-075-63/+104
| | | | | | | | | | | | | | | | | schedule Summary: EHPad BB are not entered the classic way and therefor do not need to be placed after their predecessors. This patch make sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate. EHPad are scheduled in reverse probability order in order to have them flow into each others naturally. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17625 llvm-svn: 265726
* AMDGPU/SI: Implement atomic load/store for i32 and i64Jan Vesely2016-04-071-0/+178
| | | | | | | | | | Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709
* AMDGPU/SI: Add latency for export instructionsTom Stellard2016-04-071-5/+4
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18599 llvm-svn: 265708
* [X86][SSE] Added bitmask pattern shuffle testsSimon Pilgrim2016-04-072-0/+195
| | | | | | Based on OR(AND(MASK,V0),AND(~MASK,V1)) style patterns llvm-svn: 265697
* [X86]: Fix for PR27251.Kevin B. Smith2016-04-071-2/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D18850 llvm-svn: 265690
* [SystemZ] Implement conditional returnsUlrich Weigand2016-04-0773-729/+742
| | | | | | | | | | | | | | | | | | Return is now considered a predicable instruction, and is converted to a newly-added CondReturn (which maps to BCR to %r14) instruction by the if conversion pass. Also, fused compare-and-branch transform knows about conditional returns, emitting the proper fused instructions for them. This transform triggers on a *lot* of tests, hence the huge diffstat. The changes are mostly jX to br %r14 -> bXr %r14. Author: koriakin Differential Revision: http://reviews.llvm.org/D17339 llvm-svn: 265689
* [PPC] Enable transformations in PPCPassConfig::addIRPasses at O2Ehsan Amiri2016-04-0729-345/+171
| | | | | | | | | | | | | | http://reviews.llvm.org/D18562 A large number of testcases has been modified so they pass after this test. One testcase is deleted, because I realized even after undoing the original change that was committed with this testcase, the testcase still passes. So I removed it. The change to one other testcase (test/CodeGen/PowerPC/pr25802.ll) is an arbitrary change to keep it passing. Given the original intention of the testcase, and the fact that fixing it will require some time to change the testcase, we concluded that this quick change will be enough. llvm-svn: 265683
* [X86][SSE] Add support for VZEXT constant foldingSimon Pilgrim2016-04-073-6/+3
| | | | llvm-svn: 265646
* [X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.Ahmed Bougacha2016-04-071-24/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-apply r265450 which caused PR27245 and was reverted in r265559 because of a wrong generalization: the fetch_and_add->add_and_fetch combine only works in specific, but pretty common, cases: (icmp slt x, 0) -> (icmp sle (add x, 1), 0) (icmp sge x, 0) -> (icmp sgt (add x, 1), 0) (icmp sle x, 0) -> (icmp slt (sub x, 1), 0) (icmp sgt x, 0) -> (icmp sge (sub x, 1), 0) Original Message: We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265636
* [X86] Refresh and tweak EFLAGS reuse tests. NFC.Ahmed Bougacha2016-04-071-51/+82
| | | | | | The non-1 and EQ/NE tests were misguided. llvm-svn: 265635
* Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵Hans Wennborg2016-04-078-17/+75
| | | | | | | | | | | | | | | | eliminateCallFramePseudoInstr (PR27140)" Third time's the charm? The previous attempt (r265345) caused ASan test failures on X86, as broken CFI caused stack traces to not work. This version of the patch makes sure not to merge with stack adjustments that have CFI, and to not add merged instructions' offests to the CFI about to be generated. This is already covered by the lit tests; I just got the expectations wrong previously. llvm-svn: 265623
* [PPC] Use VSX/FP Facility integer load when an integer load's only users are ↵Ehsan Amiri2016-04-061-0/+83
| | | | | | | | | | | conversion to FP http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as integer we should use VSX or Floating Point Facility integer loads and avoid extra direct move llvm-svn: 265593
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-0680-510/+430
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* Revert r265450 "[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC."Hans Wennborg2016-04-061-5/+15
| | | | | | It caused ASan 32-bit tests to hang (PR27245). llvm-svn: 265559
* Revert "Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵Hans Wennborg2016-04-068-77/+19
| | | | | | | | | eliminateCallFramePseudoInstr (PR27140)"" It seems to be causing ASan tests to crash, probably due to miscompiling the run-time somehow. llvm-svn: 265551
* Recommit r265309 after fixed an invalid memory reference bug happenedWei Mi2016-04-065-518/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when DenseMap growed and moved memory. I verified it fixed the bootstrap problem on x86_64-linux-gnu but I cannot verify whether it fixes the bootstrap error on clang-ppc64be-linux. I will watch the build-bot result closely. Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265547
* [ppc64] Temporary disable sibling call optimization on ppc64 due to breaking ↵Chuang-Yu Cheng2016-04-062-7/+7
| | | | | | | | | | | test case r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528
* [ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abiChuang-Yu Cheng2016-04-063-1/+239
| | | | | | | | | | | | | | | | | This patch enable sibling call optimization on ppc64 ELFv1/ELFv2 abi, and add a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom's (tjablin) help, he contributed a lot to this patch. Thanks Hal and Kit's invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506
* Lower @llvm.experimental.deoptimize as a noreturn callSanjoy Das2016-04-061-6/+0
| | | | | | | | | | | | | | | | | | | | | | | While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower ``` %val = call @llvm.experimental.deoptimize ret %val ``` to effectively ``` call @__llvm_deoptimize() unreachable ``` instead. llvm-svn: 265502
* [DebugInfo] Fix tests so that each subprogram belongs to a CU.Davide Italiano2016-04-052-4/+5
| | | | llvm-svn: 265490
* Swift Calling Convention: swiftcc for ARM.Manman Ren2016-04-052-0/+201
| | | | | | Differential Revision: http://reviews.llvm.org/D18769 llvm-svn: 265482
* Faster stack-protector for Android/AArch64.Evgeniy Stepanov2016-04-052-0/+44
| | | | | | | Bionic has a defined thread-local location for the stack protector cookie. Emit a direct load instead of going through __stack_chk_guard. llvm-svn: 265481
* Swift Calling Convention: add swiftcc.Manman Ren2016-04-051-0/+201
| | | | | | Differential Revision: http://reviews.llvm.org/D17863 llvm-svn: 265480
* [X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.Ahmed Bougacha2016-04-051-15/+5
| | | | | | | | | | | | | | | | | | | | We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265450
* [X86] Add tests for ATOMIC_LOAD_OP EFLAGS reuse. NFC.Ahmed Bougacha2016-04-051-0/+159
| | | | llvm-svn: 265448
* [AArch64][Test] Do not override the suffixes for test cases.Quentin Colombet2016-04-052-3/+1
| | | | llvm-svn: 265441
* [x86] regenerate checksSanjay Patel2016-04-053-43/+86
| | | | | | | | | | | utils/update_test_checks.py was improved with: http://reviews.llvm.org/rL265414 to include the first line of the function (expected to be a comment line). This ensures that nothing bad has happened before the first actual line of checked asm. It also matches the existing behavior of the old script. llvm-svn: 265416
* WebAssembly: fix cfg-stackify testJF Bastien2016-04-051-10/+10
| | | | | | It was broken by reshuffling induced by r265397 'Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge'. llvm-svn: 265415
* [lanai] LanaiSetflagAluCombiner more conservativeJacques Pienaar2016-04-051-3/+47
| | | | | | | | | | | | Summary: LanaiSetflagAluCombiner could previously combine instructions across basic building blocks even when not legal. Make the LanaiSetflagAluCombiner more conservative to avoid this. Reviewers: eliben Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18746 llvm-svn: 265411
* [AMDGPU] Emit linkonce and linkonce_odr symbolsKonstantin Zhuravlyov2016-04-051-0/+56
| | | | | | Differential Revision: http://reviews.llvm.org/D18726 llvm-svn: 265408
* Add missing test for the "Don't delete empty preheaders" added in r265397Chuang-Yu Cheng2016-04-051-0/+39
| | | | | Author: Tom Jablin (tjablin) llvm-svn: 265402
* Don't delete empty preheaders in CodeGenPrepare if it would create a ↵Chuang-Yu Cheng2016-04-0514-34/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | critical edge Presently, CodeGenPrepare deletes all nearly empty (only phi and branch) basic blocks. This pass can delete loop preheaders which frequently creates critical edges. A preheader can be a convenient place to spill registers to the stack. If the entrance to a loop body is a critical edge, then spills may occur in the loop body rather than immediately before it. This patch protects loop preheaders from deletion in CodeGenPrepare even if they are nearly empty. Since the patch alters the CFG, it affects a large number of test cases. In most cases, the changes are merely cosmetic (basic blocks have different names or instruction orders change slightly). I am somewhat concerned about the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not deleted, then the MIPS backend does not take advantage of a branch delay slot. Consequently, I would like some close review by a MIPS expert. The patch also partially subsumes D16893 from George Burgess IV. George correctly notes that CodeGenPrepare does not actually preserve the dominator tree. I think the dominator tree was usually not valid when CodeGenPrepare ran, but I am using LoopInfo to mark preheaders, so the dominator tree is now always valid before CodeGenPrepare. Author: Tom Jablin (tjablin) Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng http://reviews.llvm.org/D16984 llvm-svn: 265397
* [mips] MIPSR6 Compact jump supportSimon Dardis2016-04-055-75/+152
| | | | | | | | | | | | | | | This patch adds support for compact jumps similiar to the previous compact branch support for MIPSR6. Unlike compact branches, compact jumps do not have a forbidden slot. As MipsInstrInfo::getEquivalentCompactForm can determine the correct expansion for jumps and branches for both microMIPS and MIPSR6, remove the unnecessary distinction in the delay slot filler. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders llvm-svn: 265390
* [NVPTX] Handle ldg created from sign-/zero-extended loadJustin Holewinski2016-04-051-0/+57
| | | | | | | | | | Reviewers: jingyue Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D18053 llvm-svn: 265389
* Don't fold double constant to an integer if dest type not integralTeresa Johnson2016-04-041-0/+12
| | | | | | | | | | | | | | Summary: I encountered this issue when constant folding during inlining tried to fold away a bitcast of a double to an x86_mmx, which is not an integral type. The test case exposes the same issue with a smaller code snippet during early CSE. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18528 llvm-svn: 265367
* test: Always treat .mir files as tests even outside of CodeGen/MIRMatthias Braun2016-04-044-3/+3
| | | | | | | | | We missed a handful of .mir tests that existed outside the test/CodeGen/MIR directory. Also fix the three powerpc .mir tests that nobody noticed were broken. llvm-svn: 265350
* Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵Hans Wennborg2016-04-048-18/+77
| | | | | | | | | | | | | | | | | | | | eliminateCallFramePseudoInstr (PR27140)" The original commit miscompiled things on 32-bit Windows, e.g. a Clang boostrap. It turns out that mergeSPUpdates() was a bit too generous in what it interpreted as a stack adjustment, causing the following code: addl $12, %esp leal -4(%ebp), %esp To be "optimized" into simply: addl $8, %esp This commit tightens up mergeSPUpdates() and includes a new test (test14 in movtopush.ll) for this situation. llvm-svn: 265345
* Beef up some dllexport tests.Sean Silva2016-04-041-1/+23
| | | | | | | | | | | | | | | | Adds some dllexport tests to verify that: - Variables in bss are exported appropriately - Non-dllexport symbols aliased to dllexport symbols are not exported - Symbols declared as dllexport but are not defined are not exported We plan to enable dllimport/dllexport support for the PS4, and these additional tests are for points we noticed in our internal testing. Patch by Warren Ristow! Differential Revision: http://reviews.llvm.org/D18682 llvm-svn: 265333
* ARM, AArch64, X86: Check preserved registers for tail calls.Matthias Braun2016-04-042-0/+46
| | | | | | | | | | | | | | | | We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserver_mostcc or cxx_fast_tls. It was explicitely handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329
* Revert r265309 and r265312 because they caused some errors I need to ↵Wei Mi2016-04-045-200/+518
| | | | | | investigate. llvm-svn: 265317
* Add MachineFunctionProperty checks for AllVRegsAllocated for target passesDerek Schuff2016-04-041-0/+1
| | | | | | | | | | | | | | Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313
* Replace analyzeSiblingValues with new algorithm to fix its compileWei Mi2016-04-045-518/+200
| | | | | | | | | | | | | | | | | | | | | | | | | time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265309
* [SystemZ] Support ATOMIC_FENCEUlrich Weigand2016-04-042-0/+29
| | | | | | | | | | | A cross-thread sequentially consistent fence should be lowered into z/Architecture's BCR serialization instruction, instead of causing a fatal error in the back-end. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18644 llvm-svn: 265292
* [SystemZ] Support llvm.frameaddress/llvm.returnaddress intrinsicsUlrich Weigand2016-04-042-0/+43
| | | | | | | | | | | Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which previously would cause the back-end to crash. Currently, only a frame count of zero is supported. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18514 llvm-svn: 265291
* AVX-512: Truncating store for i1 vectorsElena Demikhovsky2016-04-042-416/+170
| | | | | | | | | Implemented truncstore for KNL and skylake-avx512. Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes. Differential Revision: http://reviews.llvm.org/D18740 llvm-svn: 265283
* [X86][SSE] Refreshed MOVMSK sign bit testsSimon Pilgrim2016-04-031-26/+48
| | | | llvm-svn: 265267
* AVX-512: Load and Extended Load for i1 vectorsElena Demikhovsky2016-04-034-906/+139
| | | | | | | | | | Implemented load+{sign|zero}_extend for i1 vectors Fixed failures in i1 vector load. Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX. Differential Revision: http://reviews.llvm.org/D18737 llvm-svn: 265259
* [mips][microMIPS] Revert commits r264245 and r264248.Zoran Jovanovic2016-04-027-567/+9
| | | | | | | Commit r264245 was the reason for failing tests in LLVM test suite. Commit r264248 depends on the first one. llvm-svn: 265249
OpenPOWER on IntegriCloud