path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [Thumb1] Move padding earlier when synthesizing TBBs off of the PC | James Molloy | 2016-11-07 | 1 | -0/+8
  When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding. If there is intervening padding, the calculated offsets will not be correct.
  One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset.
  In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4.
  llvm-svn: 286107
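The arithmetic behind moving the padding can be sketched as follows. This is only an illustration, not the code from the patch: it assumes every instruction in the 4-instruction jump sequence is 2 bytes wide and that the jump table wants 4-byte alignment, and the names `padding_before_jump_sequence` and `layout` are hypothetical.

```python
# Assumption: each instruction in the jump sequence is a 16-bit Thumb1 encoding.
THUMB1_INSN_BYTES = 2
# From the commit message: the jump sequence is always 4 instructions long.
JUMP_SEQ_INSNS = 4
# Assumption: the jump table should start on a 4-byte boundary.
TABLE_ALIGN = 4

JUMP_SEQ_BYTES = THUMB1_INSN_BYTES * JUMP_SEQ_INSNS


def padding_before_jump_sequence(addr):
    """Bytes of padding to emit at `addr`, before the jump sequence,
    so that the table placed right after the sequence is aligned."""
    table_addr = addr + JUMP_SEQ_BYTES
    return (-table_addr) % TABLE_ALIGN


def layout(addr):
    """Return (padding, sequence start, table start) for a sequence at `addr`."""
    pad = padding_before_jump_sequence(addr)
    seq_start = addr + pad
    table_start = seq_start + JUMP_SEQ_BYTES
    return pad, seq_start, table_start
```

Because the sequence has a fixed size, aligning its start is enough to guarantee that no padding is needed between the sequence and the table, which is what keeps the PC-relative offsets valid.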
* [AVR] Enable the ISel, frame analyzer, and alloca passes | Dylan McKay | 2016-11-07 | 1 | -2/+8
  llvm-svn: 286095
* [AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to selects and native zext/sext. | Craig Topper | 2016-11-07 | 1 | -72/+0
  This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select.
  llvm-svn: 286092
* [AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to legacy intrinsics and a select. | Craig Topper | 2016-11-07 | 1 | -4/+0
  llvm-svn: 286089
* Reapply r286080 with a phony change in Hexagon's CMakeLists.txt | Krzysztof Parzyszek | 2016-11-06 | 3 | -208/+107
  CMake did not recognize that Hexagon.td has a new dependency in HexagonPatterns.td, so none of the changes to that file were visible to the build bots.
  llvm-svn: 286084
* ARM: lower fpowi appropriately for Windows ARM | Saleem Abdulrasool | 2016-11-06 | 1 | -0/+57
  This handles the last of the builtin function calls for which we generated code that differed from Microsoft's ABI. Rather than generating a call to `__pow{d,s}i2` we now promote the parameter to a float or double and invoke `powf` or `pow` instead.
  Addresses PR30825!
  llvm-svn: 286082
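As a rough model of the lowering described above, the integer exponent is converted to floating point and the general power routine is called. The helper name `powi_via_pow` is hypothetical, and since Python floats are doubles this only mirrors the `pow` path, not `powf`.

```python
import math


def powi_via_pow(base: float, exponent: int) -> float:
    """Model of an integer-power call lowered to a floating-point pow call:
    promote the integer exponent to a float, then call pow."""
    promoted = float(exponent)  # the promotion step from the commit message
    return math.pow(base, promoted)
```

For example, `powi_via_pow(2.0, 10)` goes through the same routine that a call to `pow(2.0, 10.0)` would use.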
* Revert r286080: it breaks build bots | Krzysztof Parzyszek | 2016-11-06 | 2 | -97/+208
  llvm-svn: 286081
* [Hexagon] Remove redundant custom selection code | Krzysztof Parzyszek | 2016-11-06 | 2 | -208/+97
  The clr/set/toggle-bit instructions (with the bit index given as an immediate operand) had both custom selection code that generated them and selection patterns. The selection patterns were not used, because the custom selection code was executed first. This patch removes the custom code in favor of the selection patterns. The custom code also handled 64-bit registers with an immediate bit index, so new patterns were added to implement that.
  The same was true of the instruction "Rd += asr(Rs, Rt)", except that the custom code did not offer any additional functionality, and was simply removed.
  llvm-svn: 286080
* [Hexagon] Round 5 of selection pattern simplifications | Krzysztof Parzyszek | 2016-11-06 | 1 | -85/+53
  Remove unnecessary type casts in patterns.
  llvm-svn: 286079
* [Hexagon] Round 4 of selection pattern simplifications | Krzysztof Parzyszek | 2016-11-06 | 2 | -101/+81
  Give simpler or more meaningful names to pat frags and xforms.
  llvm-svn: 286078
* [Hexagon] Round 3 of selection pattern simplifications | Krzysztof Parzyszek | 2016-11-06 | 4 | -171/+76
  Remove unnecessary C++ functions for SDNode transforms. Move more pat frags to files where they are used.
  llvm-svn: 286077
* [Hexagon] Round 2 of selection pattern simplifications | Krzysztof Parzyszek | 2016-11-06 | 1 | -27/+29
  Add pat frags for any-, sign-, and zero-extensions.
  llvm-svn: 286076
* [AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead upgrade them to a select and the older AVX2 intrinsic. | Craig Topper | 2016-11-06 | 1 | -10/+0
  llvm-svn: 286073
* [AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. | Craig Topper | 2016-11-06 | 1 | -16/+0
  llvm-svn: 286072
* [AVX-512] Remove intrinsics for 128/256-bit masked shift by single element in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. | Craig Topper | 2016-11-06 | 1 | -16/+0
  llvm-svn: 286070
* [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsElementInsertion. NFCI | Simon Pilgrim | 2016-11-06 | 1 | -16/+16
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286067
* [AVX-512] Add missing EVEX version of pattern for (v2f64 (extloadv2f32 addr:)) -> VCVTPS2PDZ128rm | Craig Topper | 2016-11-06 | 2 | -1/+3
  llvm-svn: 286059
* [AVX-512] Lower AVX cvtpd2ps intrinsic to ISD::FP_ROUND so it can use EVEX instruction when available. | Craig Topper | 2016-11-06 | 3 | -12/+15
  llvm-svn: 286057
* [AVX-512] Lower SSE/AVX cvtdq2ps intrinsics directly to ISD::SINT_TO_FP so they can use EVEX instructions when available. | Craig Topper | 2016-11-06 | 2 | -18/+2
  llvm-svn: 286056
* [Hexagon] Relocate pattern-related bits to proper places | Krzysztof Parzyszek | 2016-11-05 | 6 | -57/+51
  llvm-svn: 286049
* [Hexagon] Round 1 of selection pattern simplifications | Krzysztof Parzyszek | 2016-11-05 | 1 | -267/+267
  Consistently use register class pat frags instead of spelling out the type and class each time.
  llvm-svn: 286048
* [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBlend. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -20/+24
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286045
* [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsZeroOrAnyExtend. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -21/+18
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286044
* [X86][SSE] Reuse zeroable element mask in SSE4A EXTRQ/INSERTQ vector shuffle lowering. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -5/+6
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286043
* [X86][SSE] Reuse zeroable element mask in PSHUFB vector shuffle lowering. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -14/+13
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286042
* [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsInsertPS. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -3/+5
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286040
* [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBitMask. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -9/+11
  Don't regenerate a zeroable element mask with computeZeroableShuffleElements when it's already available.
  llvm-svn: 286039
* [X86][SSE] Reuse zeroable element mask instead of regenerating it. NFCI | Simon Pilgrim | 2016-11-05 | 1 | -30/+47
  We are repeatedly calling computeZeroableShuffleElements in many shuffle lowering calls for the same shuffle mask/inputs.
  This is a first step towards reusing the zeroable result, initially just for lowerVectorShuffleAsShift calls.
  llvm-svn: 286037
* [Hexagon] Split all selection patterns into a separate file | Krzysztof Parzyszek | 2016-11-05 | 8 | -3297/+3359
  This is just the basic separation, without any cleanup. Further changes will follow.
  llvm-svn: 286036
* Strip trailing whitespace. NFCI. | Simon Pilgrim | 2016-11-05 | 1 | -4/+4
  llvm-svn: 286034
* [Hexagon] Account for <def,read-undef> when validating moves for predication | Krzysztof Parzyszek | 2016-11-04 | 1 | -0/+7
  llvm-svn: 286009
* [X86] Broadcast from memory instructions aren't unfoldable | Zvi Rackover | 2016-11-04 | 1 | -8/+8
  Broadcast from memory instructions should be treated as moves. They can't be unfolded.
  Fixes PR30693.
  llvm-svn: 285998
* Revert "AMDGPU: Add VI i16 support" | Tom Stellard | 2016-11-04 | 15 | -409/+78
  This reverts commit r285939 and r285948. These broke some conformance tests.
  llvm-svn: 285995
* X86: Move a non-null assert to before the pointer is dereferenced | Justin Bogner | 2016-11-03 | 1 | -1/+2
  llvm-svn: 285975
* Sink all of the code relying on the MachO MachineModuleInfo to live behind the test that the MachineModuleInfo analysis was actually available and can be used. | Chandler Carruth | 2016-11-03 | 1 | -47/+51
  While the MachO bits may well be reasonable to assume in the darwin assembly printer, the analysis isn't constructively guaranteed anywhere I could find so it seems safest to avoid crashing here.
  This issue was found with PVS-Studio. Pretty sure the Clang Static Analyzer flags similar issues but we've probably never pointed it at this code effectively.
  llvm-svn: 285972
* [Cortex-M0] Atomic lowering | Weiming Zhao | 2016-11-03 | 2 | -7/+14
  Summary: ARMv6m supports fence instructions such as dmb, but not ldrex/strex and similar instructions. So for some atomic loads/stores, LLVM should emit instructions inline instead of lowering to __sync_ calls.
  Reviewers: rengolin, efriedma, t.p.northover, jmolloy
  Subscribers: efriedma, aemerson, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26120
  llvm-svn: 285969
* NFC - Test commit. | Tony Jiang | 2016-11-03 | 1 | -1/+0
  Delete an empty line at the end of README.txt file.
  llvm-svn: 285964
* AMDGPU/SI: Re-add VIInstructions.td to unbreak bots | Tom Stellard | 2016-11-03 | 1 | -0/+14
  This file is unused as of r285939, but we need to keep it around for bots that don't do full rebuilds. We should be able to delete this again in a few days.
  llvm-svn: 285948
* Remove a redundant condition found by PVS-Studio. | Chandler Carruth | 2016-11-03 | 1 | -2/+2
  Filed http://llvm.org/PR30897 to teach Clang to warn on this kind of stuff.
  llvm-svn: 285945
* AMDGPU: Add VI i16 support | Tom Stellard | 2016-11-03 | 15 | -88/+405
  Patch By: Wei Ding
  Differential Revision: https://reviews.llvm.org/D18049
  llvm-svn: 285939
* Delete a dead store found by PVS-Studio. | Chandler Carruth | 2016-11-03 | 1 | -1/+0
  Quite sad we still aren't really using aggressive dead code warnings from Clang that we could potentially use to catch this and so many other things.
  llvm-svn: 285936
* [AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads. | Alexander Timofeev | 2016-11-03 | 1 | -5/+19
  This change exploits the fact that LDS reads may be reordered even if they access the same location.
  Prior to the change, the algorithm stopped as soon as it encountered any memory access between the loads that are expected to be merged together, even though a Read-After-Read conflict cannot affect execution correctness.
  Improves the performance of hcBLAS CGEMM manually loop-unrolled kernels by 44%. An improvement is also expected on any massive sequence of reads from LDS.
  Differential Revision: https://reviews.llvm.org/D25944
  llvm-svn: 285919
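The difference between the old and new behaviour can be illustrated with a toy model. This is a simplified sketch and not the pass itself: instructions are reduced to the strings "read", "write" and "other", any intervening write is conservatively treated as blocking, and the function name `can_merge_reads` and its `allow_reads_between` flag are hypothetical.

```python
def can_merge_reads(instrs, first, second, allow_reads_between=True):
    """Decide whether the reads at indices `first` and `second` may be merged.

    instrs: list of "read", "write" or "other" (a toy instruction stream).
    allow_reads_between=False models the old behaviour, where any memory
    access between the two reads stopped the search.
    """
    if instrs[first] != "read" or instrs[second] != "read":
        raise ValueError("both endpoints must be reads")
    for kind in instrs[first + 1:second]:
        if kind == "write":
            # A write in between could change what the second read sees.
            return False
        if kind == "read" and not allow_reads_between:
            # Old behaviour: give up on any intervening memory access.
            return False
    return True
```

With the stream `["read", "other", "read", "read"]`, merging the first and last reads is rejected under the old rule and accepted under the new one, because the only intervening memory access is another read.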
* Refactor creation of X86ISD::SETCC nodes to a helper function. NFC. | Zvi Rackover | 2016-11-03 | 1 | -70/+41
  llvm-svn: 285917
* Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently" | James Molloy | 2016-11-03 | 2 | -138/+5
  This reverts commit r285893. It caused (probably) http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/83 .
  llvm-svn: 285912
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently | James Molloy | 2016-11-03 | 2 | -5/+138
  This recommits r281323, which was backed out for two reasons. One, a selfhost failure, and two, it apparently caused Chromium failures. Actually, the latter was a red herring. The log has expired from the former, but I suspect that was a red herring too (actually caused by another problematic patch of mine). Therefore reapplying, and will watch the bots like a hawk.
  For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).
  1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
  2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
  3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
  4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.
  1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
  llvm-svn: 285893
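The four cases above can be modelled on plain 32-bit integers. The following is a sketch of the idea only, not the ISel code: the helper names (`plan_shifts`, `masked_bits_are_zero`) are hypothetical, the cases are checked in the order the commit message lists them, and the shift amounts are derived from the clz/ctz reasoning in the message.

```python
MASK32 = 0xFFFFFFFF


def clz32(v):
    """Count leading zeros of a 32-bit value."""
    return 32 - v.bit_length()


def ctz32(v):
    """Count trailing zeros of a 32-bit value (32 for zero)."""
    return (v & -v).bit_length() - 1 if v else 32


def popcount32(v):
    return bin(v).count("1")


def is_contiguous_mask(bm):
    """True if bm is one consecutive run of set bits."""
    return bm != 0 and 32 - clz32(bm) - ctz32(bm) == popcount32(bm)


def plan_shifts(bm):
    """Return (case number, list of (op, amount)) or None if not applicable."""
    if not is_contiguous_mask(bm):
        return None
    lead, trail = clz32(bm), ctz32(bm)
    if trail == 0:                       # 1) mask touches the LSB
        return 1, [("LSLS", lead)]
    if lead == 0:                        # 2) mask touches the MSB
        return 2, [("LSRS", trail)]
    if popcount32(bm) == 1:              # 3) single bit: move it to the sign bit
        return 3, [("LSLS", lead)]
    return 4, [("LSLS", lead), ("LSRS", lead + trail)]  # 4) strip both sides


def masked_bits_are_zero(x, bm):
    """Emulate the planned shifts and answer whether (x & bm) == 0."""
    case, ops = plan_shifts(bm)
    v = x & MASK32
    for op, amount in ops:
        v = (v << amount) & MASK32 if op == "LSLS" else v >> amount
    if case == 3:
        return (v >> 31) == 0            # PL: sign bit clear
    return v == 0                        # EQ: zero flag set
```

For instance, `plan_shifts(0x00000FF0)` yields case 4 with an LSLS by 20 followed by an LSRS by 24, and `plan_shifts(0x5)` returns `None` because the set bits are not consecutive.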
* [AVX-512] Use 'vnot' instead of 'not' in patterns involving vXi1 vectors. | Craig Topper | 2016-11-03 | 1 | -62/+28
  This fixes selection of KANDN instructions and allows us to remove an extra set of patterns for KNOT and KXNOR.
  Reviewers: delena, igorb
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26134
  llvm-svn: 285878
* Expandload and Compressstore intrinsics | Elena Demikhovsky | 2016-11-03 | 2 | -22/+69
  2 new intrinsics covering AVX-512 compress/expand functionality. This implementation includes syntax, DAG builder, operation lowering and tests.
  Does not include: handling of illegal data types, codegen prepare pass and the cost model.
  llvm-svn: 285876
* [Hexagon] Remove registers coalesced in expand-condsets from live intervals | Krzysztof Parzyszek | 2016-11-02 | 1 | -0/+3
  llvm-svn: 285846
* AMDGPU: Allow additional implicit operands on MOVRELS instructions | Nicolai Haehnle | 2016-11-02 | 1 | -1/+4
  Summary: The post-RA scheduler occasionally uses additional implicit operands when the vector implicit operand as a whole is killed, but some subregisters are still live because they are directly referenced later. Unfortunately, this seems incredibly subtle to reproduce.
  Fixes piglit spec/glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-wr.shader_test and others.
  Reviewers: arsenm, tstellarAMD
  Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D25656
  llvm-svn: 285835
* Fix Clang-tidy readability-redundant-string-cstr warnings | Malcolm Parsons | 2016-11-02 | 3 | -4/+3
  Reviewers: beanz, lattner, jlebar
  Subscribers: jholewinski, llvm-commits, mehdi_amini
  Differential Revision: https://reviews.llvm.org/D26235
  llvm-svn: 285832