path: root/llvm/test
Commit message | Author | Age | Files | Lines
* [SystemZ] Rework processor feature definitions and add -mcpu=archX support | Ulrich Weigand | 2016-10-31 | 7 | -0/+12
  This patch implements two changes:
  - Move processor feature definition into a new file SystemZFeatures.td, and provide explicit lists of supported and unsupported features for each level of the z/Architecture. This allows specifying unsupported features in the scheduler definition files for each processor.
  - Add optional aliases for the -mcpu processor names according to the level of the z/Architecture, for compatibility with other compilers on the platform. The supported aliases are:
      -mcpu=arch8 equals -mcpu=z10
      -mcpu=arch9 equals -mcpu=z196
      -mcpu=arch10 equals -mcpu=zEC12
      -mcpu=arch11 equals -mcpu=z13
  llvm-svn: 285577
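The alias table in the commit above can be read as a simple name lookup. The following is a minimal sketch of that mapping only; `resolveSystemZCPUAlias` is a hypothetical helper name, and the actual patch expresses the aliases in the SystemZ TableGen processor definitions rather than in code like this.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of LLVM): maps an -mcpu=archN alias to the
// processor name listed in the commit message. Any other name is returned
// unchanged.
std::string resolveSystemZCPUAlias(std::string_view name) {
  if (name == "arch8")  return "z10";
  if (name == "arch9")  return "z196";
  if (name == "arch10") return "zEC12";
  if (name == "arch11") return "z13";
  return std::string(name);
}
```

Under this sketch, `resolveSystemZCPUAlias("arch11")` yields `"z13"`, while a name such as `"z196"` passes through untouched.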
* [SystemZ] Correctly diagnose missing features in AsmParser | Ulrich Weigand | 2016-10-31 | 3 | -482/+482
  Currently, when using an instruction that is not supported on the currently selected architecture, the LLVM assembler is likely to diagnose an "invalid operand" instead of a "missing feature". This is because many operands require a custom parser in order to be processed correctly, and if an instruction is not available according to the current feature set, the generated parser code will also not detect the associated custom operand parsers.
  Fixed by temporarily enabling all features while parsing operands. The missing features will then be correctly detected when actually parsing the instruction itself.
  llvm-svn: 285575
* [SystemZ] Fix encoding of MVCK and .insn ss | Ulrich Weigand | 2016-10-31 | 4 | -17/+16
  LLVM currently treats the first operand of MVCK as if it were a regular base+index+displacement address. However, it is in fact a base+displacement combined with a length register field. While the two might look syntactically similar, there are two semantic differences:
  - %r0 is a valid length register, even though it cannot be used as an index register.
  - In an expression with just a single register like 0(%rX), the register is treated as base with normal addresses, while it is treated as the length register (with an empty base) for MVCK.
  Fixed by adding a new operand parser class BDRAddr and reworking the assembler parser to distinguish between address + length register operands and regular addresses.
  llvm-svn: 285574
* Second attempt at r285517. | Dorit Nuzman | 2016-10-31 | 6 | -3/+196
  llvm-svn: 285568
* Improved cost model for FDIV and FSQRT, by Andrew Tischenko | Alexey Bataev | 2016-10-31 | 1 | -82/+82
  There is a bug describing a poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second in a series of patches dealing with the cost model.
  Differential Revision: https://reviews.llvm.org/D25722
  llvm-svn: 285564
* Add triple to test so it does not fail on windows. | Manuel Klimek | 2016-10-31 | 1 | -1/+1
  llvm-svn: 285560
* Delete .s file that did not test anything, and check in test that works. | Manuel Klimek | 2016-10-31 | 2 | -20/+27
  In D26098, Davide Italiano submitted a .s file instead of the .ll file that was the last stage of the review.
  llvm-svn: 285559
* [AVX-512] Add missing patterns for selecting masked vector extracts that started from shuffles. | Craig Topper | 2016-10-31 | 1 | -0/+229
  llvm-svn: 285546
* [DAG] x | x --> x | Sanjay Patel | 2016-10-30 | 1 | -2/+0
  llvm-svn: 285522
* [DAG] x & x --> x | Sanjay Patel | 2016-10-30 | 1 | -2/+0
  llvm-svn: 285521
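The two folds above rely on AND and OR being idempotent: combining a value with itself returns that value. A minimal sketch of the check follows; `LogicOp`, `ValueId`, and `foldIdempotent` are hypothetical names used only for illustration and do not mirror the DAGCombiner code.

```cpp
#include <optional>

enum class LogicOp { And, Or, Xor };

// Hypothetical stand-in for a DAG operand: two operands denote the same
// value when their ids match.
struct ValueId {
  int id;
};

// Returns the operand to reuse when both inputs of an AND/OR are the same
// value (x & x == x, x | x == x). Returns nullopt when no fold applies.
std::optional<ValueId> foldIdempotent(LogicOp op, ValueId lhs, ValueId rhs) {
  if (lhs.id != rhs.id)
    return std::nullopt;
  if (op == LogicOp::And || op == LogicOp::Or)
    return lhs;
  return std::nullopt; // x ^ x is 0, not x, so XOR is not idempotent.
}
```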
* [x86] add tests for basic logic op folds | Sanjay Patel | 2016-10-30 | 2 | -0/+37
  llvm-svn: 285520
* Revert r285517 due to build failures. | Dorit Nuzman | 2016-10-30 | 6 | -196/+3
  llvm-svn: 285518
* [LoopVectorize] Make interleaved-accesses analysis less conservative about possible pointer-wrap-around concerns, in some cases. | Dorit Nuzman | 2016-10-30 | 6 | -3/+196
  Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative.
  Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled).
  This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold.
  Differential Revision: https://reviews.llvm.org/D25276
  llvm-svn: 285517
* [ThinLTO] Correctly resolve linkonce when importing aliasee | Teresa Johnson | 2016-10-30 | 5 | -20/+99
  Summary: When we have an aliasee that is linkonce, while we can't convert the non-prevailing copies to available_externally, we still need to convert the prevailing copy to weak. If a reference to the aliasee is exported, not converting a copy to weak will result in undefined references when the linkonce is removed in its original module.
  Add a new test and update existing tests.
  Reviewers: mehdi_amini
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26076
  llvm-svn: 285512
* [ValueTracking] recognize more variants of smin/smax | Sanjay Patel | 2016-10-29 | 3 | -39/+15
  Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding smax pattern and commuted variants.
  The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR.
  Differential Revision: https://reviews.llvm.org/D26091
  llvm-svn: 285499
* [x86] add tests for smin/smax matchSelPattern (D26091) | Sanjay Patel | 2016-10-29 | 2 | -59/+127
  llvm-svn: 285498
* [InstCombine] re-use bitcasted compare operands in selects (PR28001) | Sanjay Patel | 2016-10-29 | 1 | -9/+6
  These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6
  Differential Revision: https://reviews.llvm.org/D25943
  llvm-svn: 285495
* [DAGCombiner] (REAPPLIED) Add vector demanded elements support to computeKnownBits | Simon Pilgrim | 2016-10-29 | 2 | -43/+13
  Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.
  This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.
  The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used.
  I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course.
  DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.
  This looked like it had caused compile time regressions on some buildbots (and was reverted in rL285381), but appears to have just been a harmless bystander!
  Differential Revision: https://reviews.llvm.org/D25691
  llvm-svn: 285494
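To illustrate why a demanded-elements mask can sharpen the result, here is a minimal sketch for the simplest case, a vector of 32-bit constants. `KnownBits32` and `knownBitsOfConstantVector` are hypothetical names; the real computeKnownBits works on SelectionDAG nodes and APInt values, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bits known to be zero / one in every demanded lane.
struct KnownBits32 {
  uint32_t zero;
  uint32_t one;
};

// Common known bits of the constant vector `elts`, restricted to the lanes
// whose bit is set in `demandedElts` (bit i corresponds to lane i).
KnownBits32 knownBitsOfConstantVector(const std::vector<uint32_t> &elts,
                                      uint64_t demandedElts) {
  assert(elts.size() <= 64 && "sketch supports at most 64 lanes");
  assert(demandedElts != 0 && "at least one lane must be demanded");
  KnownBits32 known{~0u, ~0u};
  for (std::size_t i = 0; i < elts.size(); ++i) {
    if (((demandedElts >> i) & 1) == 0)
      continue; // Lane not demanded: it cannot weaken the result.
    known.zero &= ~elts[i];
    known.one &= elts[i];
  }
  return known;
}
```

For the vector {0x0F, 0xFF, 0x0F, 0x0F}, demanding all four lanes leaves bits 4 to 7 unknown, while demanding only lanes 0, 2 and 3 makes them known zero.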
* Fixed FMA + FNEG combine. | Elena Demikhovsky | 2016-10-29 | 1 | -0/+106
  Masked form of FMA should be omitted in this optimization.
  Differential Revision: https://reviews.llvm.org/D25984
  llvm-svn: 285492
* AMDGPU: Use 1/2pi inline imm on VI | Matt Arsenault | 2016-10-29 | 2 | -10/+70
  I'm guessing at how it is supposed to be printed
  llvm-svn: 285490
* Do not print out Flags field twice. | Rui Ueyama | 2016-10-28 | 1 | -10/+26
  llvm-svn: 285481
* [DAGCombiner] Fix a crash visiting `AND` nodes. | Davide Italiano | 2016-10-28 | 1 | -0/+20
  Instead of asserting that the shift count is != 0 we just bail out as it's not profitable trying to optimize a node which will be removed anyway.
  Differential Revision: https://reviews.llvm.org/D26098
  llvm-svn: 285480
* AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions | Tom Stellard | 2016-10-28 | 1 | -0/+59
  Summary: Flat instructions can return out of order, so we always need to wait for all the outstanding flat operations.
  Reviewers: tony-tye, arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl
  Differential Revision: https://reviews.llvm.org/D25998
  llvm-svn: 285479
* Add missing lit.local.cfg to llvm/test/Transforms/CodeGenPrepare/NVPTX. | Justin Lebar | 2016-10-28 | 1 | -0/+2
  llvm-svn: 285464
* AMDGPU: Add definitions for scalar store instructions | Matt Arsenault | 2016-10-28 | 4 | -12/+36
  Also add glc bit to the scalar loads since they exist on VI and change the caching behavior.
  This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it.
  llvm-svn: 285463
* [NVPTX] Compute 'rem' using the result of 'div', if possible. | Justin Lebar | 2016-10-28 | 1 | -0/+112
  Summary: In isel, transform
    Num % Den
  into
    Num - (Num / Den) * Den
  if the result of Num / Den is already available.
  Reviewers: tra
  Subscribers: hfinkel, llvm-commits, jholewinski
  Differential Revision: https://reviews.llvm.org/D26090
  llvm-svn: 285461
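The identity used by this transformation can be checked in ordinary integer arithmetic. The sketch below reuses an already computed quotient to derive the remainder with one multiply and one subtract; `DivRem` and `divThenRem` are hypothetical names, and this is plain C++ rather than the NVPTX isel pattern itself.

```cpp
#include <cassert>
#include <cstdint>

struct DivRem {
  int32_t quot;
  int32_t rem;
};

// Computes the quotient once, then derives the remainder from it as
// num - (num / den) * den instead of issuing a second division.
DivRem divThenRem(int32_t num, int32_t den) {
  assert(den != 0 && "division by zero");
  assert(!(num == INT32_MIN && den == -1) && "quotient would overflow");
  int32_t quot = num / den;
  int32_t rem = num - quot * den; // |quot * den| <= |num|, so no overflow.
  return {quot, rem};
}
```

Because C++ integer division truncates toward zero, the derived remainder matches `num % den` for negative operands as well.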
* Don't leave unused divs/rems sitting around in BypassSlowDivision. | Justin Lebar | 2016-10-28 | 1 | -0/+29
  Summary: This "pass" eagerly creates div and rem instructions even when only one is needed -- it relies on a later pass (machine DCE?) to clean them up.
  This is problematic not just from a cleanliness perspective (this pass is running during CodeGenPrepare, so should leave the IR in a better state), but it also creates a problem for instruction selection. If we always have a div+rem, isel will always select a divrem instruction (if possible), even when a single div or rem would do.
  Specifically, in NVPTX, we want to compute rem from the output of div, if available. But if a div is not available, we want to leave the rem alone. This transformation is overeager if div is always available.
  Because this code runs as part of CodeGenPrepare, it's nontrivial to write a test for this change. But this will effectively be tested by a later patch which adds the aforementioned change to NVPTX isel.
  Reviewers: tra
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26088
  llvm-svn: 285460
* Don't claim the udiv created in BypassSlowDivision is exact. | Justin Lebar | 2016-10-28 | 1 | -0/+16
  Summary: In BypassSlowDivision's short-dividend path, we would create e.g.
    udiv exact i32 %a, %b
  "exact" here means that we are asserting that %a is a multiple of %b. But we have no reason to believe this must be true -- this is just a bug, as far as I can tell.
  Reviewers: tra
  Subscribers: jholewinski, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26097
  llvm-svn: 285459
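As a small illustration of the condition that the `exact` flag asserts, the sketch below checks whether an unsigned division has no remainder. `udivExactIsValid` is a hypothetical helper written for this note; it is not an LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when tagging `a / b` as exact would be a correct claim,
// i.e. when a is a multiple of b and the division loses no bits.
bool udivExactIsValid(uint32_t a, uint32_t b) {
  assert(b != 0 && "division by zero");
  return a % b == 0;
}
```

For arbitrary operands, such as 13 and 4, the check fails, which is why the flag cannot be attached unconditionally.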
* AMDGPU: Change check prefix in test | Matt Arsenault | 2016-10-28 | 1 | -219/+219
  llvm-svn: 285449
* AMDGPU: Diagnose using too many SGPRs | Matt Arsenault | 2016-10-28 | 1 | -0/+102
  This is possible when using inline asm.
  llvm-svn: 285447
* Handle non-~0 lane masks on live-in registers in LivePhysRegs | Krzysztof Parzyszek | 2016-10-28 | 1 | -0/+55
  When LivePhysRegs adds live-in registers, it recognizes ~0 as a special lane mask indicating the entire register. If the lane mask is not ~0, it will only add the subregisters that overlap the specified lane mask.
  The problem is that if a live-in register does not have subregisters, and the lane mask is not ~0, it will not be added to the live set. (The given lane mask may simply be the lane mask of its register class.)
  If a register does not have subregisters, add it to the live set if the lane mask is non-zero.
  Differential Revision: https://reviews.llvm.org/D26094
  llvm-svn: 285440
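The rule described in this commit can be sketched with plain containers. All types and names below (`LaneMask`, `SubReg`, `LiveIn`, `addLiveIn`) are hypothetical simplifications; the real code walks subregister indices through TargetRegisterInfo.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

using LaneMask = uint32_t;

struct SubReg {
  std::string name;
  LaneMask lanes; // Lanes of the parent register covered by this subregister.
};

struct LiveIn {
  std::string reg;
  LaneMask mask;
  std::vector<SubReg> subRegs; // Empty when the register has no subregisters.
};

// Adds a live-in to the live set following the rule from the commit message.
void addLiveIn(std::set<std::string> &live, const LiveIn &in) {
  if (in.mask == ~LaneMask{0}) { // ~0 means the entire register is live.
    live.insert(in.reg);
    return;
  }
  if (in.subRegs.empty()) { // The fix: no subregisters, any non-zero mask.
    if (in.mask != 0)
      live.insert(in.reg);
    return;
  }
  for (const SubReg &s : in.subRegs) // Otherwise add overlapping subregisters.
    if ((s.lanes & in.mask) != 0)
      live.insert(s.name);
}
```

Before the fix, the middle branch did not exist, so a register without subregisters and a partial lane mask was silently dropped.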
* SpeculativeExecution: Allow speculating more inst types | Matt Arsenault | 2016-10-28 | 4 | -0/+318
  Partial step towards removing the whitelist and only using TTI's cost.
  llvm-svn: 285438
* AMDGPU: Fix using incorrect private resource with no allocation | Matt Arsenault | 2016-10-28 | 6 | -9/+139
  It's possible to have a use of the private resource descriptor or scratch wave offset registers even though there are no allocated stack objects. This would result in continuing to use the maximum number of reserved registers. This could go over the number of SGPRs available on VI, or violate the SGPR limit requested by the function attributes.
  llvm-svn: 285435
* Implement vector count leading/trailing bytes with zero lsb and vector parity builtins - llvm portion | Nemanja Ivanovic | 2016-10-28 | 1 | -0/+55
  This patch corresponds to review https://reviews.llvm.org/D26003.
  Committing on behalf of Zaara Syeda.
  llvm-svn: 285434
* Make swift calling convention test specific to armv7 | Arnold Schwaighofer | 2016-10-28 | 1 | -67/+65
  llvm-svn: 285431
* [x86] add tests for missed umin/umax | Sanjay Patel | 2016-10-28 | 1 | -0/+59
  This is actually a deficiency in ValueTracking's matchSelectPattern(), but a codegen test is the simplest way to expose the bug.
  llvm-svn: 285429
* More swift calling convention tests | Arnold Schwaighofer | 2016-10-28 | 6 | -5/+1099
  llvm-svn: 285417
* [InstCombine] move/add tests for smin/smax folds | Sanjay Patel | 2016-10-28 | 2 | -25/+79
  llvm-svn: 285414
* [Hexagon] Maintain kill flags through splitting in expand-condsets | Krzysztof Parzyszek | 2016-10-28 | 1 | -0/+78
  Do not use LiveIntervals to recalculate kills, because that cannot be done accurately without implicit uses on predicated instructions.
  llvm-svn: 285409
* [Loads] Fix crash in isDereferenceableAndAlignedPointer() | Tom Stellard | 2016-10-28 | 1 | -0/+21
  Summary: We were trying to add APInt values with different bit sizes after visiting an addrspacecast instruction which changed the bit width of the pointer.
  Reviewers: majnemer, hfinkel
  Subscribers: hfinkel, wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D24774
  llvm-svn: 285407
* [LV] Correct misleading comments in test (NFC) | Matthew Simpson | 2016-10-28 | 1 | -9/+5
  llvm-svn: 285402
* Revert "[DAGCombiner] Add vector demanded elements support to computeKnownBits" | Juergen Ributzka | 2016-10-28 | 2 | -13/+43
  This seems to have increased LTO compile time beyond 2x of previous builds. See http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/
  llvm-svn: 285381
* [Reassociate] Removing instructions mutates the IR. | Davide Italiano | 2016-10-28 | 1 | -0/+16
  Fixes PR 30784. Discussed with Justin, who pointed out that in the new PassManager infrastructure we can have more fine-grained control on which analyses we want to preserve, but this is the best we can do with the current infrastructure.
  llvm-svn: 285380
* [ConstantFold] Get the correct vector type when folding a getelementptr. | Davide Italiano | 2016-10-28 | 1 | -0/+15
  Differential Revision: https://reviews.llvm.org/D26014
  llvm-svn: 285371
* AMDGPU/SI: Handle hazard with s_rfe_b64 | Tom Stellard | 2016-10-27 | 1 | -0/+31
  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
  Differential Revision: https://reviews.llvm.org/D25638
  llvm-svn: 285368
* AMDGPU/SI: Handle hazard with sgpr lane selects for v_{read,write}lane | Tom Stellard | 2016-10-27 | 1 | -0/+66
  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D25637
  llvm-svn: 285367
* Remove accidentally committed test. | Davide Italiano | 2016-10-27 | 1 | -15/+0
  llvm-svn: 285366
* [IR] Reintroduce getGEPReturnType(), it will be used in a later patch. | Davide Italiano | 2016-10-27 | 1 | -0/+15
  llvm-svn: 285365
* Reverting back r285355: "Update .debug_line section version information to match DWARF version", while I'm investigating a test failure. | Ekaterina Romanova | 2016-10-27 | 13 | -70/+31
  llvm-svn: 285362
* [Coverage] Darwin: Move __llvm_covmap from __DATA to __LLVM_COV | Vedant Kumar | 2016-10-27 | 1 | -2/+4
  Programs with very large __llvm_covmap sections may fail to link on Darwin because of out-of-range 32-bit RIP relative references. It isn't possible to work around this by using the large code model because it isn't supported on Darwin. One solution is to move the __llvm_covmap section past the end of the __DATA segment.
  === Testing ===
  In addition to check-{llvm,clang,profile}, I performed a link test on a simple object after injecting ~4GB of padding into __llvm_covmap:
    @__llvm_coverage_padding = internal constant [4000000000 x i8] zeroinitializer, section "__LLVM_COV,__llvm_covmap", align 8
  (This test is too expensive to check-in.)
  === Backwards Compatibility ===
  This patch should not pose any backwards-compatibility concerns. LLVM is expected to scan all of the sections in a binary for __llvm_covmap, so changing its segment shouldn't affect anything. I double-checked this by loading coverage produced by an unpatched compiler with a patched llvm-cov.
  Suggested by Nick Kledzik.
  llvm-svn: 285360