summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Use correct register names in inline assemblyMatt Arsenault2017-06-0814-401/+401
| | | | | | Fixes using physical registers in inline asm from clang. llvm-svn: 305004
* [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64Guozhi Wei2017-06-082-14/+33
| | | | | | | | | | In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixed PR32442. Differential Revision: https://reviews.llvm.org/D31407 llvm-svn: 305001
* [AMDGPU] Force qsads instrs to use different dest register than source registersMark Searles2017-06-083-37/+68
| | | | | | | | The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes. Differential Revision: https://reviews.llvm.org/D33783 llvm-svn: 304998
* [Power9] Exploit vector integer extend instructionsZaara Syeda2017-06-081-0/+90
| | | | | | | | | | | | | | This patch adds build vector patterns to exploit the vector integer extend instructions: vextsb2w - Vector Extend Sign Byte To Word vextsb2d - Vector Extend Sign Byte To Doubleword vextsh2w - Vector Extend Sign Halfword To Word vextsh2d - Vector Extend Sign Halfword To Doubleword vextsw2d - Vector Extend Sign Word To Doubleword Differential Revision: https://reviews.llvm.org/D33510 llvm-svn: 304992
* [PowerPC] add memcmp test with nobuiltin attr; NFCSanjay Patel2017-06-081-0/+15
| | | | | | | | | | | In SDAG, we don't expand libcalls with a nobuiltin attribute. It's not clear if that's correct from the existing code comment: "Don't do the check if marked as nobuiltin for some reason." ...adding a test here either way to show that there is currently a different behavior implemented in the CGP-based expansion. llvm-svn: 304991
* [x86] remove unused param from tests; NFCSanjay Patel2017-06-081-10/+10
| | | | llvm-svn: 304989
* [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansionSanjay Patel2017-06-081-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks. Instead of this: _cmp_eq8: movq (%rdi), %rax cmpq (%rsi), %rax je LBB23_1 ## BB#2: ## %res_block movl $1, %ecx jmp LBB23_3 LBB23_1: xorl %ecx, %ecx LBB23_3: ## %endblock xorl %eax, %eax testl %ecx, %ecx sete %al retq We get this: cmp_eq8: movq (%rdi), %rcx xorl %eax, %eax cmpq (%rsi), %rcx sete %al retq And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally. Differential Revision: https://reviews.llvm.org/D34005 llvm-svn: 304987
* Add scheduler classes to integer/float horizontal operations.Andrew V. Tischenko2017-06-081-16/+16
| | | | | | | This patch will close PR32801. Differential Revision: https://reviews.llvm.org/D33203 llvm-svn: 304986
* [x86] add tests for memcmp expansion; NFCSanjay Patel2017-06-081-37/+293
| | | | | | | | | | | | We already had a test to demonstrate PR33325: https://bugs.llvm.org/show_bug.cgi?id=33325 I'm adding tests for general memcmp expansion (see D34005 / D33963) and: https://bugs.llvm.org/show_bug.cgi?id=33329 ...plus non-power-of-2 sizes, so we can see what that looks like currently or if expanded. llvm-svn: 304979
* This patch closes PR28513: an optimization of multiplication by different ↵Andrew V. Tischenko2017-06-084-360/+4253
| | | | | | | | constants. The initial patch was rejected: I fixed the issue and re-apply it. llvm-svn: 304972
* [ARM] GlobalISel: Add more tests. NFCDiana Picus2017-06-081-0/+149
| | | | | | | | Add a couple of tests to increase coverage for the TableGen'erated code, in particular for rules where 2 generic instructions may be combined into a single machine instruction. llvm-svn: 304971
* [Hexagon] Generate 'inbounds' GEPs in HexagonCommonGEPKrzysztof Parzyszek2017-06-071-0/+20
| | | | llvm-svn: 304937
* [CGP] avoid zext/trunc of a memcmp expansion compareSanjay Patel2017-06-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG: _cmp_eq4_zexted_to_i64: movl (%rdi), %ecx movl (%rsi), %edx xorl %eax, %eax cmpq %rdx, %rcx sete %al _cmp_eq4_better: movl (%rdi), %ecx xorl %eax, %eax cmpl (%rsi), %ecx sete %al llvm-svn: 304923
* [mips][dsp] Modify repl.ph to accept signed immediate valuesPetar Jovanovic2017-06-071-1/+11
| | | | | | | | | | | | Changed immediate type for repl.ph from uimm10 to simm10 as per the specs. Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of GNU as. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D33594 llvm-svn: 304918
* [X86] Add test to demonstrate inefficient lowering of v48i8 shuffle.Guy Blank2017-06-071-0/+49
| | | | llvm-svn: 304915
* AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legalTom Stellard2017-06-071-0/+28
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33949 llvm-svn: 304910
* [x86] avoid flipping sign bits for vector icmp by using known bitsSanjay Patel2017-06-072-104/+34
| | | | | | | | | | | | | | | | If we know that both operands of an unsigned integer vector comparison are non-negative, then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality integer vector compare predicate provided by SSE/AVX). We're intentionally not changing the condition code to signed in order to preserve the existing transforms that use min/max/psubus below here. This should solve PR33276: https://bugs.llvm.org/show_bug.cgi?id=33276 Differential Revision: https://reviews.llvm.org/D33862 llvm-svn: 304909
* [PowerPC] Eliminate integer compare instructions - vol. 5Nemanja Ivanovic2017-06-075-6/+567
| | | | | | | | Adds handling for i64 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33720 llvm-svn: 304907
* [mips] do not use FastISel when -mxgot is presentPetar Jovanovic2017-06-071-0/+3
| | | | | | | | | | | | | | | The clang compiler by default uses FastISel when invoked with -O0, which is also the default. In that case, passing of -mxgot does not get honored, i.e. the code path that is to deal with large got is not taken. Clang produces same output regardless of -mxgot being present or not. This change checks whether -mxgot is passed as an option, and turns off FastISel if it is. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D33593 llvm-svn: 304906
* [ARM] GlobalISel: Purge G_SEQUENCEDiana Picus2017-06-075-64/+46
| | | | | | | | | | | | | | | | | According to the commit message from r296921, G_MERGE_VALUES and G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating G_SEQUENCE in the ARM backend and remove the code dealing with it. This boils down to the code breaking up double values for the soft float calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR + VMOVRRD and simplifies the code in the instruction selector. There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but that is part of the target-independent code for translating constant structs. Therefore, it is beyond the scope of this commit. llvm-svn: 304902
* [PowerPC] Eliminate integer compare instructions - vol. 3Nemanja Ivanovic2017-06-079-39/+798
| | | | | | | | Adds handling for i32 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33718 llvm-svn: 304901
* [ARM] GlobalISel: Support G_XORDiana Picus2017-06-074-0/+168
| | | | | | | | | Same as the other binary operators: - legalize to 32 bits - map to GPRs - select to EORrr via TableGen'erated code llvm-svn: 304898
* evert "[mips] Fix test mips64fpldst.ll with machine verifier enabled"Simon Dardis2017-06-078-19/+41
| | | | | | | This reverts commit r301394. It broke some internal buildbots, reverting while the issue is being investigated. llvm-svn: 304896
* [X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combiningSimon Pilgrim2017-06-071-40/+4
| | | | | | We were checking that the index was in range of the destination vector type, not the (larger) source vector type llvm-svn: 304894
* [ARM] GlobalISel: Support G_ORDiana Picus2017-06-074-0/+168
| | | | | | | | | Same as the other binary operators: - legalize to 32 bits - map to GPRs - select ORRrr thanks to TableGen'erated code llvm-svn: 304890
* [ARM] GlobalISel: Support G_ANDDiana Picus2017-06-074-0/+170
| | | | | | | | | This is identical to the support for the other binary operators: - widen to s32 - map into GPR - select ANDrr (via TableGen'erated code) llvm-svn: 304885
* [CGP / PowerPC] use direct compares if there's only one load per block in ↵Sanjay Patel2017-06-071-20/+14
| | | | | | | | | | | | | | | | | memcmp() expansion I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle. I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time limitation in the DAG. We might have to just avoid those special-cases here in CGP. But regardless of that, I think this is a win for the more general cases. http://rise4fun.com/Alive/cbQ Differential Revision: https://reviews.llvm.org/D33963 llvm-svn: 304849
* [PowerPC] auto-generate full checks and increase test coverageSanjay Patel2017-06-061-77/+160
| | | | | | | 3 of the tests were testing exactly the same thing: memcmp(x, y, 16) != 0. I changed that to test 4, 7, and 16 bytes, so we can see how those differ. llvm-svn: 304838
* Added tests for X86InterleavedStore.Evgeny Stupachenko2017-06-061-0/+93
| | | | | | | | | | Reviewers: RKSimon, DavidKreitzer Differential Revision: https://reviews.llvm.org/D33684 Patch by: Aleen Farhana <Farhana.aleen@gmail.com> llvm-svn: 304834
* llc: Add ability to parse mir from stdinMatthias Braun2017-06-061-0/+20
| | | | | | | | - Add -x <language> option to switch between IR and MIR inputs. - Change MIR parser to read from stdin when filename is '-'. - Add a simple mir roundtrip test. llvm-svn: 304825
* Fix PR23384 (part 3 of 3)Evgeny Stupachenko2017-06-067-67/+67
| | | | | | | | | | | | | Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304824
* MIRPrinter: Avoid assert() when printing empty INLINEASM strings.Matthias Braun2017-06-061-0/+12
| | | | | | | | | | | CodeGen uses MO_ExternalSymbol to represent the inline assembly strings. Empty strings for symbol names appear to be invalid. For now just special case the output code to avoid hitting an `assert()` in `printLLVMNameWithoutPrefix()`. This fixes https://llvm.org/PR33317 llvm-svn: 304815
* [mips] Add madd4 subtarget featurePetar Jovanovic2017-06-061-222/+233
| | | | | | | | | | | Addition of a feature and a predicate used to control generation of madd.fmt and similar instructions. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D33400 llvm-svn: 304801
* [X86][AVX1] Split 256-bit vector non-temporal FastISel loads to keep it ↵Simon Pilgrim2017-06-061-6/+30
| | | | | | | | non-temporal (PR32744) Extension to D33728 llvm-svn: 304798
* AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legalTom Stellard2017-06-061-0/+24
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33890 llvm-svn: 304797
* Vivek Pandya2017-06-0676-1422/+1438
| | | | | | | | | | | | [Improve CodeGen Testing] This patch renables MIRPrinter print fields which have value equal to its default. If -simplify-mir option is passed then MIRPrinter will not print such fields. This change also required some lit test cases in CodeGen directory to be changed. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D32304 llvm-svn: 304779
* [x86] Stop this test from dirtying the source tree when run.Chandler Carruth2017-06-061-1/+1
| | | | | | The output isn't used anyways. llvm-svn: 304766
* [x86] Add the test for folding stack spills into pextrw.Chandler Carruth2017-06-061-2/+15
| | | | | | | | This is a negative test as pextrw doesn't write to all 32-bits of the spilled GPR. This fold ended up happening when D32684 was landed and covers the regression that motivated reverting it in r304762. llvm-svn: 304763
* [x86] Revert the X86FoldTablesEmitter due to more miscompiles.Chandler Carruth2017-06-063-16/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In testing, we've found yet another miscompile caused by the new tables. And this one is even less clear how to fix (we could teach it to fold a 16-bit load instead of the 32-bit load it wants, or block folding entirely). Also, the approach to excluding instructions seems increasingly to not scale well. I have left a more detailed analysis on the review log for the original patch (https://reviews.llvm.org/D32684) along with suggested path forward. I will land an additional test case that I wrote which covers the code that was miscompiling (folding into the output of `pextrw`) in a subsequent commit to keep this a pure revert. For each commit reverted here, I've restricted the revert to the non-test code touching the x86 fold table emission until the last commit where I did revert the test updates. This means the *new* test cases added for `insertps` and `xchg` remain untouched (and continue to pass). Reverted commits: r304540: [X86] Don't fold into memory operands into insertps in the ... r304347: [TableGen] Adapt more places to getValueAsString now ... r304163: [X86] Don't fold away the memory operand of an xchg. r304123: Don't capture a temporary std::string in a StringRef. r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..." Original commit was in r304088, and after a string of fixes was reverted previously in r304121 to fix build bots, and then re-landed in r304122. llvm-svn: 304762
* CodeGen: Refactor MIR parsingMatthias Braun2017-06-066-15/+22
| | | | | | | | | | | | When parsing .mir files immediately construct the MachineFunctions and put them into MachineModuleInfo. This allows us to get rid of the delayed construction (and delayed error reporting) through the MachineFunctionInitialzier interface. Differential Revision: https://reviews.llvm.org/D33809 llvm-svn: 304758
* CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCIMatthias Braun2017-06-062-3/+3
| | | | | | | | | | | | - Move ISel (and pre-isel) pass construction into TargetPassConfig - Extract AsmPrinter construction into a helper function Putting the ISel code into TargetPassConfig seems a lot more natural and both changes together make make it easier to build custom pipelines involving .mir in an upcoming commit. This moves MachineModuleInfo to an earlier place in the pass pipeline which shouldn't have any effect. llvm-svn: 304754
* [x86] fix over-specific triple; NFCSanjay Patel2017-06-061-18/+18
| | | | | | | | There's nothing darwin-specific in these tests, and using that setting causes extra phantom diffs when the auto-generated check lines are regenerated today. llvm-svn: 304753
* [InlineSpiller] Don't spill fully undef valuesQuentin Colombet2017-06-051-0/+67
| | | | | | | | | | Althought it is not wrong to spill undef values, it is useless and harms both code size and runtime. Before spilling a value, check that its content actually matters. http://www.llvm.org/PR33311 llvm-svn: 304752
* RenameIndependentSubregs: Fix handling of undef tied operandsMatt Arsenault2017-06-051-0/+69
| | | | | | | | If a tied source operand was undef, it would be replaced but not update the other tied operand, which would end up using different virtual registers. llvm-svn: 304747
* [GlobalISel] IRTranslator: Add MachineMemOperand to target memory intrinsicsVolkan Keles2017-06-051-0/+12
| | | | | | | | | | | | Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders Reviewed By: qcolombet Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33724 llvm-svn: 304743
* [SelectionDAG] Update the dominator after splitting critical edges.Davide Italiano2017-06-051-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running `llc -verify-dom-info` on the attached testcase results in a crash in the verifier, due to a stale dominator tree. i.e. DominatorTree is not up to date! Computed: =============================-------------------------------- Inorder Dominator Tree: [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7} [2] %lor.lhs.false.i61.i.i.i {1,2} [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6} [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5} Actual: =============================-------------------------------- Inorder Dominator Tree: [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9} [2] %lor.lhs.false.i61.i.i.i {1,2} [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8} [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5} [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7} This is because in `SelectionDAGIsel` we split critical edges without updating the corresponding dominator for the function (and we claim in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved). We could either stop preserving the domtree in `getAnalysisUsage` or tell `splitCriticalEdge()` to update it. As the second option is easy to implement, that's the one I chose. Differential Revision: https://reviews.llvm.org/D33800 llvm-svn: 304742
* [X86][SSE41] Non-temporal loads shouldn't be folded if it can be avoided ↵Simon Pilgrim2017-06-051-96/+252
| | | | | | | | | | (PR32743) Missed SSE41 non-temporal load case in previous commit Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304722
* [X86][AVX1] Split 256-bit vector non-temporal loads to keep it non-temporal ↵Simon Pilgrim2017-06-052-100/+202
| | | | | | | | (PR32744) Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304718
* [X86][SSE] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)Simon Pilgrim2017-06-051-65/+148
| | | | | | Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304717
* [ARM] GlobalISel: Constrain callee register on indirect callsDiana Picus2017-06-051-4/+6
| | | | | | | | | | | | | When lowering calls, we generate instructions with machine opcodes rather than generic ones. Therefore, we need to constrain the register classes of the operands. Also enable the machine verifier on the arm-irtranslator.ll test, since that would've caught this issue. Fixes (part of) PR32146. llvm-svn: 304712
OpenPOWER on IntegriCloud