path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [X86][SSE] Cleaned up shuffle decode assertion messagesSimon Pilgrim2016-10-011-7/+11
| | | | llvm-svn: 283050
* Fix signed/unsigned warningSimon Pilgrim2016-10-011-2/+2
| | | | llvm-svn: 283041
* [X86][SSE] Add support for combining target shuffles to binary BLENDSimon Pilgrim2016-10-011-4/+30
| | | | | | We already had support for 1-input BLEND with zero - this adds support for 2-input BLEND as well. llvm-svn: 283040
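As a rough illustration of the idea behind a 2-input BLEND match (a sketch, not the code from this commit): a shuffle mask can be expressed as a blend when every element stays in its own lane and only chooses between the two inputs. The name `matchBlendMask` and the use of -1 for an undef element are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of LLVM's API.
// Mask follows the usual two-input shuffle convention:
//   0 .. N-1   selects element M of the first input,
//   N .. 2N-1  selects element M-N of the second input,
//   -1         marks an undef element (assumption for this sketch).
// Returns a blend immediate (bit i set => lane i comes from the second
// input), or std::nullopt if some element would have to change lanes.
std::optional<uint32_t> matchBlendMask(const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  assert(NumElts > 0 && NumElts <= 32 && "immediate must fit in 32 bits");

  uint32_t Imm = 0;
  for (int i = 0; i < NumElts; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // Undef: either input is acceptable.
    if (M == i)
      continue; // Lane i taken from the first input.
    if (M == i + NumElts) {
      Imm |= (1u << i); // Lane i taken from the second input.
      continue;
    }
    return std::nullopt; // Element crosses lanes: not a blend.
  }
  return Imm;
}
```

For a 4-element mask `{0, 5, 2, 7}`, lanes 1 and 3 come from the second input, so the sketch yields the immediate `0xA`.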
* [X86][SSE] Always combine target shuffles to MOVSD/MOVSSSimon Pilgrim2016-10-013-10/+19
| | | | | | | | Now we can commute to BLENDPD/BLENDPS on SSE41+ targets if necessary, so simplify the combine matching where we can. This required me to add a couple of scalar math movsd/movss fold patterns that hadn't been needed in the past. llvm-svn: 283038
* [X86][SSE] Enable commutation from MOVSD/MOVSS to BLENDPD/BLENDPS on SSE41+ ↵Simon Pilgrim2016-10-012-0/+31
| | | | | | | | | | | | targets Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget, this will help us select the instruction based on actual commutation requirements. We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful. I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets. llvm-svn: 283037
* [X86] Cleanup patterns for using VMOVDDUP for broadcasts.Craig Topper2016-10-011-6/+6
| | | | | | | | -Remove OptForSize. Not all of the backend follows the same rules for creating broadcasts and there is no conflicting pattern. -Don't stop selecting VEX VMOVDDUP when AVX512 is supported. We need VLX for EVEX VMOVDDUP. -Only use VMOVDDUP for v2i64 broadcasts if AVX2 is not supported. llvm-svn: 283020
* Revert "Use StringRef instead of raw pointer in TargetRegistry API (NFC)"Mehdi Amini2016-10-011-2/+2
| | | | | | This reverts commit r283017. Creates an infinite loop somehow. llvm-svn: 283019
* Use StringRef instead of raw pointers in MCAsmInfo/MCInstrInfo APIs (NFC)Mehdi Amini2016-10-017-14/+14
| | | | llvm-svn: 283018
* Use StringRef instead of raw pointer in TargetRegistry API (NFC)Mehdi Amini2016-10-011-2/+2
| | | | llvm-svn: 283017
* [AVX-512] Add EVEX versions of VPBROADCASTW patterns with truncated i32 loads.Craig Topper2016-10-012-1/+18
| | | | llvm-svn: 283015
* Use StringRef in Datalayout API (NFC)Mehdi Amini2016-10-012-2/+2
| | | | llvm-svn: 283013
* Revert "Use StringRef in Datalayout API (NFC)"Mehdi Amini2016-10-011-1/+1
| | | | | | This reverts commit r283009. Bots are broken. llvm-svn: 283011
* Use StringRef in Datalayout API (NFC)Mehdi Amini2016-10-011-1/+1
| | | | llvm-svn: 283009
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-01158-270/+174
| | | | llvm-svn: 283004
* Revert "AMDGPU: Don't use offen if it is 0"Mehdi Amini2016-10-012-100/+14
| | | | | | | This reverts commit r282999. Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038 llvm-svn: 283003
* Remove TargetTriple from AArch64MCInstLower as it's used in few placesEric Christopher2016-10-011-3/+4
| | | | | | and can be pulled from the TargetMachine. NFC. llvm-svn: 283000
* AMDGPU: Don't use offen if it is 0Matt Arsenault2016-10-012-14/+100
| | | | | | This removes many re-initializations of a base register to 0. llvm-svn: 282999
* [AArch64][RegisterBankInfo] Use the helper functions for the checksQuentin Colombet2016-09-301-29/+26
| | | | | | | | This makes sure the helper functions work as expected. NFC. llvm-svn: 282961
* [AArch64][RegisterBankInfo] Rename getValueMappingIdx to getValueMappingQuentin Colombet2016-09-302-9/+11
| | | | | | | | We don't return an index, we return the actual ValueMapping. NFC. llvm-svn: 282960
* [AArch64][RegisterBankInfo] Compress the ValueMapping table a bit.Quentin Colombet2016-09-302-46/+38
| | | | | | | | | | We don't need to have singleton ValueMapping on their own, we can just reuse one of the elements of the 3-ops mapping. This allows even more code sharing. NFC. llvm-svn: 282959
* [AArch64][RegisterBankInfo] Refactor the code to access AArch64::ValMappingQuentin Colombet2016-09-302-24/+23
| | | | | | | | | Use a helper function to access ValMapping. This should make the code easier to understand and maintain. NFC. llvm-svn: 282958
* [AArch64][RegisterBankInfo] Rename getRegBankIdx to getRegBankIdxOffsetQuentin Colombet2016-09-302-17/+22
| | | | | | | | | The function name did not make it clear that the returned value was an offset to apply to a register bank index. NFC. llvm-svn: 282957
* [AArch64][RegisterBankInfo] Use the static opds mapping for alt mappingsQuentin Colombet2016-09-301-14/+7
| | | | | | | | Avoid relying on the dynamically allocated operands mapping for the alternative mapping. NFC. llvm-svn: 282956
* X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)Hans Wennborg2016-09-302-6/+6
| | | | | | | | | | | We can't use Jcc to leave a Win64 function in general, because that confuses the unwinder. However, for "leaf" functions, that is, functions where the return address is always on top of the stack and which don't have unwind info, it's OK. Differential Revision: https://reviews.llvm.org/D24836 llvm-svn: 282920
* [WebAssembly] Make register stackification more conservativeDerek Schuff2016-09-301-19/+15
| | | | | | | | | | | | Register stackification currently checks VNInfo for changes. Make that more accurate by testing each intervening instruction for any other defs to the same virtual register. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D24942 llvm-svn: 282886
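The per-instruction check described in the stackification commit can be pictured with a small sketch, assuming a heavily simplified instruction model. `Instr`, `NoReg`, and `hasNoInterveningDef` are hypothetical names for illustration only and do not correspond to the WebAssembly backend's real data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: it defines at most one virtual
// register. NoReg means the instruction defines nothing.
constexpr int NoReg = -1;
struct Instr {
  int DefReg;
};

// Returns true if no instruction strictly between DefIdx and UseIdx
// redefines Reg. A conservative pass would only move the def next to
// the use when this holds.
bool hasNoInterveningDef(const std::vector<Instr> &Block, std::size_t DefIdx,
                         std::size_t UseIdx, int Reg) {
  assert(DefIdx < UseIdx && UseIdx < Block.size() && "indices out of order");
  for (std::size_t i = DefIdx + 1; i < UseIdx; ++i)
    if (Block[i].DefReg == Reg)
      return false; // Another def of the same register sits in between.
  return true;
}
```

Scanning every intervening instruction is linear in the distance between def and use, which is the price of being more conservative than a single value-number comparison.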
* [AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa versionKonstantin Zhuravlyov2016-09-305-16/+69
| | | | | | Differential Revision: https://reviews.llvm.org/D24973 llvm-svn: 282877
* [AMDGPU] Ask subtarget if waitcnt instruction is needed before barrier ↵Konstantin Zhuravlyov2016-09-302-2/+9
| | | | | | | | instruction Differential Revision: https://reviews.llvm.org/D24985 llvm-svn: 282875
* [AMDGPU] Do not run scalar optimization passes at "-O0"Konstantin Zhuravlyov2016-09-301-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D25055 llvm-svn: 282873
* [AVR] Add the ELF object file writerDylan McKay2016-09-302-0/+128
| | | | | | | | | | | | Summary: This adds the ELF32 writer for AVR. Reviewers: kparzysz Subscribers: beanz, mgorny Differential Revision: https://reviews.llvm.org/D25031 llvm-svn: 282856
* [AVR] Add the assembly instruction printerDylan McKay2016-09-306-2/+260
| | | | | | | | | | | | | | | | | Summary: This change adds the AVR assembly instruction printer. No tests are included in this patch. I have left them downstream so we can add them once `llc` successfully runs (there's very few components left to upstream until this). Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25028 llvm-svn: 282854
* [AVX-512] Store address operand should be an input operand for the special ↵Craig Topper2016-09-301-4/+4
| | | | | | stack spilling pseudos for XMM16-31 and YMM16-31 without VLX. llvm-svn: 282843
* [AVX-512] Add the special stack spilling pseudos for XMM16-31 and YMM16-31 ↵Craig Topper2016-09-301-0/+8
| | | | | | without VLX to the isFrameLoadOpcode and isFrameStoreOpcode. llvm-svn: 282842
* Revert r282835 "[AVX-512] Always use the full 32 register vector classes for ↵Craig Topper2016-09-301-15/+30
| | | | | | | | addRegisterClass regardless of whether AVX512/VLX is enabled or not." Turns out this doesn't pass verify-machineinstrs. llvm-svn: 282841
* [X86] Add AVX-512 VTs to findRepresentativeClass as well as v16i16 which was ↵Craig Topper2016-09-301-3/+5
| | | | | | | | also missing. Change register class to include the extra 16 AVX512 registers. I'm not completely sure what this method does or why all the 256-bit VTs returned VR128RegClass when the comments on the method definition say it should return the largest super register class. I just figured AVX-512 should be similar. llvm-svn: 282836
* [AVX-512] Always use the full 32 register vector classes for ↵Craig Topper2016-09-301-30/+15
| | | | | | | | | | addRegisterClass regardless of whether AVX512/VLX is enabled or not. If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of. I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change. llvm-svn: 282835
* AMDGPU: Use unsigned compare for eq/neMatt Arsenault2016-09-306-19/+19
| | | | | | | | For some reason there are both of these available, except for scalar 64-bit compares, which only have u64. I'm not sure why there are both (I'm guessing it's for the one-bit inputs we don't use), but for consistency always use the unsigned one. llvm-svn: 282832
* [X86] Don't preserve Win64 SSE CSRs when SSE is disabledReid Kleckner2016-09-302-2/+9
| | | | | | | | | Code that doesn't use floating point and doesn't use SSE (kernel code) shouldn't save and restore SSE registers. Fixes PR30503 llvm-svn: 282819
* [AArch64][RegisterBankInfo] Use static mapping for 3-operands instrs.Quentin Colombet2016-09-301-0/+50
| | | | | | | | This uses a TableGen'ed-like structure for all 3-operand instrs. The output of the RegBankSelect pass should be identical, but the RegisterBankInfo will do fewer dynamic allocations. llvm-svn: 282817
* [AArch64][RegisterBankInfo] Add static value mapping for 3-op instrs.Quentin Colombet2016-09-302-12/+58
| | | | | | | This is the kind of input TableGen should generate at some point. NFC. llvm-svn: 282816
* [AArch64][RegisterBankInfo] Check the statically created ValueMapping.Quentin Colombet2016-09-301-0/+18
| | | | | | | | | Make sure that the ValueMappings contain the value we expect at the indices we expect. NFC. llvm-svn: 282815
* [X86] Avoid "unused" warnings if no assertsDouglas Katzman2016-09-291-2/+4
| | | | llvm-svn: 282732
* [X86][SSE] Added common helper for shuffle mask constant pool decodes.Simon Pilgrim2016-09-291-164/+136
| | | | | | | | | | The shuffle mask decodes have a large amount of repeated code extracting/splitting mask values from Constant data. This patch pulls all of this duplicated code into a single helper function to identify undef elements and combine/split constant integer data into the requested shuffle mask elements. Updated PSHUFB/VPERMIL/VPERMIL2/VPPERM decoders to use it (VPERMV/VPERMV3 could be converted as well in the future). llvm-svn: 282720
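To make the "split constant integer data into the requested shuffle mask elements" step concrete, here is a minimal sketch that covers only the splitting direction. It assumes little-endian element order (the low bits of a wide element become the first narrow element) and uses `std::nullopt` for undef. `MaskElt` and `splitMaskElements` are hypothetical names, not the helper added by this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// nullopt stands for an undef element (assumption for this sketch).
using MaskElt = std::optional<uint64_t>;

// Splits each SrcBits-wide element into SrcBits/DstBits elements of
// DstBits each, lowest bits first. An undef source element produces
// undef for every piece it would have been split into.
std::vector<MaskElt> splitMaskElements(const std::vector<MaskElt> &Src,
                                       unsigned SrcBits, unsigned DstBits) {
  assert(DstBits > 0 && DstBits < 64 && SrcBits <= 64 &&
         SrcBits % DstBits == 0 && "unsupported element widths");
  const unsigned Scale = SrcBits / DstBits;
  const uint64_t LowMask = (uint64_t{1} << DstBits) - 1;

  std::vector<MaskElt> Out;
  Out.reserve(Src.size() * Scale);
  for (const MaskElt &E : Src) {
    for (unsigned i = 0; i < Scale; ++i) {
      if (!E)
        Out.push_back(std::nullopt); // Propagate undef to every piece.
      else
        Out.push_back((*E >> (i * DstBits)) & LowMask);
    }
  }
  return Out;
}
```

Keeping undef as a distinct state, rather than folding it to zero, lets each decoder decide for itself how an undef mask element should be treated.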
* Revert "[AVR] Add instruction selection lowering code"Dylan McKay2016-09-293-1954/+2
| | | | | | I accidentally committed it. llvm-svn: 282712
* [AVR] Add instruction selection lowering codeDylan McKay2016-09-293-2/+1954
| | | | | | | | | | | | Summary: This adds AVRISelLowering.cpp Reviewers: kparzysz, arsenm Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25034 llvm-svn: 282711
* [AVX-512] Support spills of XMM16-31 and YMM16-31 when VLX isn't available.Craig Topper2016-09-292-8/+137
| | | | | | | | This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used. Fixes one of the cases from PR29112. llvm-svn: 282690
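The expansion rule stated in the spill commit can be summarised in a few lines. This is only a sketch of the decision described in the message, using made-up enum and function names (`SpillKind`, `Expansion`, `expandSpillPseudo`) and a plain register index in place of LLVM's register classes.

```cpp
#include <cassert>

enum class SpillKind { Load, Store };
enum class Expansion { VexMove, EvexBroadcast, EvexExtract };

// Hypothetical model: RegNum is the XMM/YMM register index (0..31) that
// register allocation assigned to the pseudo instruction.
Expansion expandSpillPseudo(unsigned RegNum, SpillKind Kind) {
  assert(RegNum < 32 && "only XMM0-31 / YMM0-31 exist");
  // Registers 0-15 are reachable with VEX encodings, so a plain move works.
  if (RegNum < 16)
    return Expansion::VexMove;
  // Registers 16-31 need an EVEX instruction available without VLX:
  // an extract for stores, a broadcast for loads.
  return Kind == SpillKind::Store ? Expansion::EvexExtract
                                  : Expansion::EvexBroadcast;
}
```

Deferring the choice until the physical register is known is what lets the same pseudo serve both the low and the high half of the register file.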
* [AVX-512] Replicate pattern from AVX to select VMOVDDUP for (v2f64 ↵Craig Topper2016-09-291-2/+6
| | | | | | (X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition. llvm-svn: 282688
* [X86] Add EVEX encoded VBROADCASTSS/SD and VPBROADCASTD/Q to execution ↵Craig Topper2016-09-291-0/+10
| | | | | | domain fixing table. llvm-svn: 282687
* [X86] Remove AddedComplexity adjustments that don't seem to be needed.Craig Topper2016-09-291-6/+4
| | | | llvm-svn: 282686
* [X86] Add VBROADCASTF128/VBROADCASTI128 to execution domain fixing tables.Craig Topper2016-09-291-1/+2
| | | | llvm-svn: 282684
* Remove an unnecessary duplicate initialization of TLOF from the MipsEric Christopher2016-09-291-4/+0
| | | | | | | | | | | AsmPrinter. This was reinitializing the Mangler after we moved the Mangler down to TLOF and causing us to have two different unnamed global values accessed with the same name. This should fix the problems on the ubsan tests here: http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/15307 llvm-svn: 282675