summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Revert r265817Colin LeMahieu2016-04-088-13/+13
| | | | | | lld tests need to be addressed. llvm-svn: 265822
* [llvm-objdump] Printing hex instead of dec by defaultColin LeMahieu2016-04-088-13/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D18770 llvm-svn: 265817
* [SystemZ] Support conditional sibling calls via BRCLUlrich Weigand2016-04-081-0/+369
| | | | | | | | | | | | This adds a conditional variant of CallJG instruction, CallBRCL. It can be used for conditional sibling calls. Unfortunately, due to IfCvt limitations, it only really works well for functions without arguments. Author: koriakin Differential Revision: http://reviews.llvm.org/D18864 llvm-svn: 265814
* [AArch64] Add a test case for the default mapping of RegBankSelect.Quentin Colombet2016-04-081-0/+49
| | | | llvm-svn: 265811
* [MIR] Teach the parser how to deal with register banks.Quentin Colombet2016-04-081-1/+1
| | | | llvm-svn: 265802
* [ARM] Enable SMLAW[B|T] and SMLUW[B|T] instruction selectionSam Parker2016-04-082-26/+105
| | | | | | | | | | Added ISelDAGToDAG functions to enable selection of the smlawb, smlawt, smulwb and smulwt instructions for the ARM backend. Also updated the smul CodeGen test and removed the smulw one. Differential Revision: http://reviews.llvm.org/D18892 llvm-svn: 265793
* Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug ↵Hans Wennborg2016-04-085-199/+518
| | | | | | | | | | | | | happened" It caused PR27275: "ARM: Bad machine code: Using an undefined physical register" Also reverting the following commits that were landed on top: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" llvm-svn: 265790
* [X86][SSE] Added 32-bit tests for vector lzcnt/tzcnt testsSimon Pilgrim2016-04-082-0/+607
| | | | | | v2i64 tests are particularly bad on 32-bit targets. llvm-svn: 265789
* CXX_FAST_TLS calling convention: performance improvement for PPC64Chuang-Yu Cheng2016-04-081-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781
* [mips][microMIPS] Add CodeGen support for ADD, ADDIU*, ADDU* and DADD* ↵Zlatko Buljan2016-04-081-8/+317
| | | | | | | | instructions Differential Revision: http://reviews.llvm.org/D16454 llvm-svn: 265772
* [IFUNC] Fix ifunc-asm.ll testDmitry Polukhin2016-04-081-0/+15
| | | | | | | It seems that llc cannot be called used in assembler tests so test that checks asm for particular target needs to be moved to codegen. llvm-svn: 265770
* Do not select EhPad BB in MachineBlockPlacement when there is regular BB to ↵Amaury Sechet2016-04-075-63/+104
| | | | | | | | | | | | | | | | | schedule Summary: EHPad BB are not entered the classic way and therefor do not need to be placed after their predecessors. This patch make sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate. EHPad are scheduled in reverse probability order in order to have them flow into each others naturally. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17625 llvm-svn: 265726
* AMDGPU/SI: Implement atomic load/store for i32 and i64Jan Vesely2016-04-071-0/+178
| | | | | | | | | | Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709
* AMDGPU/SI: Add latency for export instructionsTom Stellard2016-04-071-5/+4
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18599 llvm-svn: 265708
* [X86][SSE] Added bitmask pattern shuffle testsSimon Pilgrim2016-04-072-0/+195
| | | | | | Based on OR(AND(MASK,V0),AND(~MASK,V1)) style patterns llvm-svn: 265697
* [X86]: Fix for PR27251.Kevin B. Smith2016-04-071-2/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D18850 llvm-svn: 265690
* [SystemZ] Implement conditional returnsUlrich Weigand2016-04-0773-729/+742
| | | | | | | | | | | | | | | | | | Return is now considered a predicable instruction, and is converted to a newly-added CondReturn (which maps to BCR to %r14) instruction by the if conversion pass. Also, fused compare-and-branch transform knows about conditional returns, emitting the proper fused instructions for them. This transform triggers on a *lot* of tests, hence the huge diffstat. The changes are mostly jX to br %r14 -> bXr %r14. Author: koriakin Differential Revision: http://reviews.llvm.org/D17339 llvm-svn: 265689
* [PPC] Enable transformations in PPCPassConfig::addIRPasses at O2Ehsan Amiri2016-04-0729-345/+171
| | | | | | | | | | | | | | http://reviews.llvm.org/D18562 A large number of testcases has been modified so they pass after this test. One testcase is deleted, because I realized even after undoing the original change that was committed with this testcase, the testcase still passes. So I removed it. The change to one other testcase (test/CodeGen/PowerPC/pr25802.ll) is an arbitrary change to keep it passing. Given the original intention of the testcase, and the fact that fixing it will require some time to change the testcase, we concluded that this quick change will be enough. llvm-svn: 265683
* [X86][SSE] Add support for VZEXT constant foldingSimon Pilgrim2016-04-073-6/+3
| | | | llvm-svn: 265646
* [X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.Ahmed Bougacha2016-04-071-24/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-apply r265450 which caused PR27245 and was reverted in r265559 because of a wrong generalization: the fetch_and_add->add_and_fetch combine only works in specific, but pretty common, cases: (icmp slt x, 0) -> (icmp sle (add x, 1), 0) (icmp sge x, 0) -> (icmp sgt (add x, 1), 0) (icmp sle x, 0) -> (icmp slt (sub x, 1), 0) (icmp sgt x, 0) -> (icmp sge (sub x, 1), 0) Original Message: We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265636
* [X86] Refresh and tweak EFLAGS reuse tests. NFC.Ahmed Bougacha2016-04-071-51/+82
| | | | | | The non-1 and EQ/NE tests were misguided. llvm-svn: 265635
* Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵Hans Wennborg2016-04-078-17/+75
| | | | | | | | | | | | | | | | eliminateCallFramePseudoInstr (PR27140)" Third time's the charm? The previous attempt (r265345) caused ASan test failures on X86, as broken CFI caused stack traces to not work. This version of the patch makes sure not to merge with stack adjustments that have CFI, and to not add merged instructions' offests to the CFI about to be generated. This is already covered by the lit tests; I just got the expectations wrong previously. llvm-svn: 265623
* [PPC] Use VSX/FP Facility integer load when an integer load's only users are ↵Ehsan Amiri2016-04-061-0/+83
| | | | | | | | | | | conversion to FP http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as integer we should use VSX or Floating Point Facility integer loads and avoid extra direct move llvm-svn: 265593
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-0680-510/+430
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* Revert r265450 "[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC."Hans Wennborg2016-04-061-5/+15
| | | | | | It caused ASan 32-bit tests to hang (PR27245). llvm-svn: 265559
* Revert "Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵Hans Wennborg2016-04-068-77/+19
| | | | | | | | | eliminateCallFramePseudoInstr (PR27140)"" It seems to be causing ASan tests to crash, probably due to miscompiling the run-time somehow. llvm-svn: 265551
* Recommit r265309 after fixed an invalid memory reference bug happenedWei Mi2016-04-065-518/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when DenseMap growed and moved memory. I verified it fixed the bootstrap problem on x86_64-linux-gnu but I cannot verify whether it fixes the bootstrap error on clang-ppc64be-linux. I will watch the build-bot result closely. Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265547
* [ppc64] Temporary disable sibling call optimization on ppc64 due to breaking ↵Chuang-Yu Cheng2016-04-062-7/+7
| | | | | | | | | | | test case r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528
* [ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abiChuang-Yu Cheng2016-04-063-1/+239
| | | | | | | | | | | | | | | | | This patch enable sibling call optimization on ppc64 ELFv1/ELFv2 abi, and add a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom's (tjablin) help, he contributed a lot to this patch. Thanks Hal and Kit's invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506
* Lower @llvm.experimental.deoptimize as a noreturn callSanjoy Das2016-04-061-6/+0
| | | | | | | | | | | | | | | | | | | | | | | While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower ``` %val = call @llvm.experimental.deoptimize ret %val ``` to effectively ``` call @__llvm_deoptimize() unreachable ``` instead. llvm-svn: 265502
* [DebugInfo] Fix tests so that each subprogram belongs to a CU.Davide Italiano2016-04-052-4/+5
| | | | llvm-svn: 265490
* Swift Calling Convention: swiftcc for ARM.Manman Ren2016-04-052-0/+201
| | | | | | Differential Revision: http://reviews.llvm.org/D18769 llvm-svn: 265482
* Faster stack-protector for Android/AArch64.Evgeniy Stepanov2016-04-052-0/+44
| | | | | | | Bionic has a defined thread-local location for the stack protector cookie. Emit a direct load instead of going through __stack_chk_guard. llvm-svn: 265481
* Swift Calling Convention: add swiftcc.Manman Ren2016-04-051-0/+201
| | | | | | Differential Revision: http://reviews.llvm.org/D17863 llvm-svn: 265480
* [X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.Ahmed Bougacha2016-04-051-15/+5
| | | | | | | | | | | | | | | | | | | | We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265450
* [X86] Add tests for ATOMIC_LOAD_OP EFLAGS reuse. NFC.Ahmed Bougacha2016-04-051-0/+159
| | | | llvm-svn: 265448
* [AArch64][Test] Do not override the suffixes for test cases.Quentin Colombet2016-04-052-3/+1
| | | | llvm-svn: 265441
* [x86] regenerate checksSanjay Patel2016-04-053-43/+86
| | | | | | | | | | | utils/update_test_checks.py was improved with: http://reviews.llvm.org/rL265414 to include the first line of the function (expected to be a comment line). This ensures that nothing bad has happened before the first actual line of checked asm. It also matches the existing behavior of the old script. llvm-svn: 265416
* WebAssembly: fix cfg-stackify testJF Bastien2016-04-051-10/+10
| | | | | | It was broken by reshuffling induced by r265397 'Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge'. llvm-svn: 265415
* [lanai] LanaiSetflagAluCombiner more conservativeJacques Pienaar2016-04-051-3/+47
| | | | | | | | | | | | Summary: LanaiSetflagAluCombiner could previously combine instructions across basic building blocks even when not legal. Make the LanaiSetflagAluCombiner more conservative to avoid this. Reviewers: eliben Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18746 llvm-svn: 265411
* [AMDGPU] Emit linkonce and linkonce_odr symbolsKonstantin Zhuravlyov2016-04-051-0/+56
| | | | | | Differential Revision: http://reviews.llvm.org/D18726 llvm-svn: 265408
* Add missing test for the "Don't delete empty preheaders" added in r265397Chuang-Yu Cheng2016-04-051-0/+39
| | | | | Author: Tom Jablin (tjablin) llvm-svn: 265402
* Don't delete empty preheaders in CodeGenPrepare if it would create a ↵Chuang-Yu Cheng2016-04-0514-34/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | critical edge Presently, CodeGenPrepare deletes all nearly empty (only phi and branch) basic blocks. This pass can delete loop preheaders which frequently creates critical edges. A preheader can be a convenient place to spill registers to the stack. If the entrance to a loop body is a critical edge, then spills may occur in the loop body rather than immediately before it. This patch protects loop preheaders from deletion in CodeGenPrepare even if they are nearly empty. Since the patch alters the CFG, it affects a large number of test cases. In most cases, the changes are merely cosmetic (basic blocks have different names or instruction orders change slightly). I am somewhat concerned about the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not deleted, then the MIPS backend does not take advantage of a branch delay slot. Consequently, I would like some close review by a MIPS expert. The patch also partially subsumes D16893 from George Burgess IV. George correctly notes that CodeGenPrepare does not actually preserve the dominator tree. I think the dominator tree was usually not valid when CodeGenPrepare ran, but I am using LoopInfo to mark preheaders, so the dominator tree is now always valid before CodeGenPrepare. Author: Tom Jablin (tjablin) Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng http://reviews.llvm.org/D16984 llvm-svn: 265397
* [mips] MIPSR6 Compact jump supportSimon Dardis2016-04-055-75/+152
| | | | | | | | | | | | | | | This patch adds support for compact jumps similiar to the previous compact branch support for MIPSR6. Unlike compact branches, compact jumps do not have a forbidden slot. As MipsInstrInfo::getEquivalentCompactForm can determine the correct expansion for jumps and branches for both microMIPS and MIPSR6, remove the unnecessary distinction in the delay slot filler. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders llvm-svn: 265390
* [NVPTX] Handle ldg created from sign-/zero-extended loadJustin Holewinski2016-04-051-0/+57
| | | | | | | | | | Reviewers: jingyue Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D18053 llvm-svn: 265389
* Don't fold double constant to an integer if dest type not integralTeresa Johnson2016-04-041-0/+12
| | | | | | | | | | | | | | Summary: I encountered this issue when constant folding during inlining tried to fold away a bitcast of a double to an x86_mmx, which is not an integral type. The test case exposes the same issue with a smaller code snippet during early CSE. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18528 llvm-svn: 265367
* test: Always treat .mir files as tests even outside of CodeGen/MIRMatthias Braun2016-04-044-3/+3
| | | | | | | | | We missed a handful of .mir tests that existed outside the test/CodeGen/MIR directory. Also fix the three powerpc .mir tests that nobody noticed were broken. llvm-svn: 265350
* Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵Hans Wennborg2016-04-048-18/+77
| | | | | | | | | | | | | | | | | | | | eliminateCallFramePseudoInstr (PR27140)" The original commit miscompiled things on 32-bit Windows, e.g. a Clang boostrap. It turns out that mergeSPUpdates() was a bit too generous in what it interpreted as a stack adjustment, causing the following code: addl $12, %esp leal -4(%ebp), %esp To be "optimized" into simply: addl $8, %esp This commit tightens up mergeSPUpdates() and includes a new test (test14 in movtopush.ll) for this situation. llvm-svn: 265345
* Beef up some dllexport tests.Sean Silva2016-04-041-1/+23
| | | | | | | | | | | | | | | | Adds some dllexport tests to verify that: - Variables in bss are exported appropriately - Non-dllexport symbols aliased to dllexport symbols are not exported - Symbols declared as dllexport but are not defined are not exported We plan to enable dllimport/dllexport support for the PS4, and these additional tests are for points we noticed in our internal testing. Patch by Warren Ristow! Differential Revision: http://reviews.llvm.org/D18682 llvm-svn: 265333
* ARM, AArch64, X86: Check preserved registers for tail calls.Matthias Braun2016-04-042-0/+46
| | | | | | | | | | | | | | | | We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserver_mostcc or cxx_fast_tls. It was explicitely handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329
OpenPOWER on IntegriCloud