path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+24
  llvm-svn: 275769
* [X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+36
  llvm-svn: 275768
* [X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative. | Craig Topper | 2016-07-18 | 1 | -0/+50
  llvm-svn: 275767
* [X86] Add more AVX512 instructions to X86InstrInfo::isHighLatencyDef. Also add all packed fp division instructions. | Craig Topper | 2016-07-18 | 1 | -14/+247
  llvm-svn: 275766
* [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr. | Craig Topper | 2016-07-18 | 1 | -0/+80
  llvm-svn: 275765
* [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related. | Craig Topper | 2016-07-18 | 1 | -0/+80
  llvm-svn: 275764
* [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. | Craig Topper | 2016-07-18 | 1 | -6/+15
  Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this is a good first step.
  llvm-svn: 275763
* [X86] Fix 80-column violations. NFC | Craig Topper | 2016-07-18 | 1 | -8/+16
  llvm-svn: 275762
* Strip trailing whitespace | Simon Pilgrim | 2016-07-17 | 1 | -6/+6
  llvm-svn: 275726
* [X86][SSE] lowerVectorShuffleAsPermuteAndUnpack tidyup. NFCI. | Simon Pilgrim | 2016-07-17 | 1 | -10/+7
  Moved unpack type determination into TryUnpack lambda.
  Added missing comment describing lowerVectorShuffleAsPermuteAndUnpack call.
  llvm-svn: 275708
* test commit | Guy Blank | 2016-07-17 | 1 | -1/+1
  llvm-svn: 275703
* Disable this-return argument forwarding on ARM/AArch64 | Hal Finkel | 2016-07-16 | 2 | -2/+16
  r275042 reverted function-attribute inference for the 'returned' attribute because the feature triggered self-hosting failures on ARM and AArch64. James Molloy determined that the this-return argument forwarding feature, which directly ties the returned input argument to the returned value, was the cause.
  It seems likely that this forwarding code contains, or triggers, a subtle bug. Disabling for now until we can track that down.
  llvm-svn: 275677
* Re-commit [AMDGPU] Add metadata for runtime | Yaxun Liu | 2016-07-16 | 3 | -0/+371
  Attempting to fix lit test failure on ppc.
  llvm-svn: 275676
* [AVX512] Remove CodeGenOnly VBROADCAST m_Int instructions. They can be implemented with patterns selecting existing instructions. NFC | Craig Topper | 2016-07-16 | 1 | -28/+47
  llvm-svn: 275671
* ARM: Initialize LoadStore passes in TargetMachine | Matthias Braun | 2016-07-16 | 3 | -16/+13
  Initializing them in LLVMInitializeARMTarget() makes them visible early enough for "llc -run-pass" usage.
  This required the pass to be renamed from "arm-load-store-opt" to "arm-ldst-opt", because there already exists an arm-load-store-opt cl::opt switch which would now clash with the pass name getting added as a switch in opt. On the bright side the pass name now matches the DEBUG_TYPE name.
  Renamed "arm-prera-load-store-opt" to "arm-prera-ldst-opt" as well for consistency.
  llvm-svn: 275661
* Reapply "Mips: Avoid implicit iterator conversions, NFC" | Duncan P. N. Exon Smith | 2016-07-15 | 6 | -57/+51
  This reverts commit r275562, effectively reapplying r275141.
  Doug Gilmore reported that there was an error when bisecting the Mips buildbot failure, and that r275141 was not to blame after all. Here is the green build:
  https://dmz-portal.mips.com/bb/builders/LLVM%20with%20integrated%20assembler%20and%20fPIC%20and%20-O0/builds/803
  llvm-svn: 275643
* Minor code cleanups. NFC. | Junmo Park | 2016-07-15 | 1 | -2/+2
  llvm-svn: 275637
* [lanai] Small cleanup: remove/comment out unused args | Jacques Pienaar | 2016-07-15 | 24 | -94/+97
  llvm-svn: 275636
* AMDGPU: Fix verifier error from partially undef copy | Matt Arsenault | 2016-07-15 | 1 | -5/+3
  In this situation:
      %VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11,
      %VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use>
      %VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3
      %VGPR4<def> = COPY %VGPR2
  The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1, but VGPR4 is defined immediately after this copy.
  llvm-svn: 275635
* BPF: Use official ELF e_machine value | Alexei Starovoitov | 2016-07-15 | 1 | -1/+1
  The same value for EM_BPF is being propagated to glibc, elfutils, and binutils.
  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  llvm-svn: 275633
* [lanai] Fix build by updating calls to getLoad & getStore. | Jacques Pienaar | 2016-07-15 | 1 | -9/+7
  rL275592 removed the boolean parameters of SelectionDAG::getLoad and getStore; this updates the Lanai backend's calls to these functions.
  llvm-svn: 275631
* [Hexagon] Handle instruction latency for 0 or 2 cycles | Krzysztof Parzyszek | 2016-07-15 | 5 | -0/+227
  The Hexagon schedulers need to handle instructions with a latency of 0 or 2 more accurately. The problem, in v60, is that a dependence between two instructions with a 2 cycle latency can use a .cur version of the source to achieve a 0 cycle latency when the use is in the same packet. Any other use must be at least 2 packets later, or a stall occurs. In other words, the compiler does not want to schedule the dependent instructions 1 cycle later.
  To achieve this, the latency adjustment code allows only a single dependence to have a zero latency. All other instructions have the other value, which is typically 2 cycles. We use a heuristic to determine which instruction gets the 0 latency.
  The Hexagon machine scheduler was also changed to increase the cost associated with 0 latency dependences that can be scheduled in the same packet.
  Patch by Brendon Cahoon.
  llvm-svn: 275625
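The rule described in r275625 (one dependence gets latency 0, the rest get the full latency) can be sketched as below. This is only an illustration under stated assumptions: `assignUseLatencies` is a hypothetical helper, not code from the Hexagon backend, and the index of the zero-latency use is passed in because the commit message does not describe the heuristic that picks it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Hexagon backend): for one producer with
// NumUses dependent uses, give exactly one use a latency of 0 and every other
// use the full latency, which the commit message says is typically 2 cycles.
// The caller chooses ZeroLatencyUse; the real selection heuristic is not
// described in the commit message, so it is left out here.
std::vector<unsigned> assignUseLatencies(std::size_t NumUses,
                                         std::size_t ZeroLatencyUse,
                                         unsigned FullLatency = 2) {
  assert(ZeroLatencyUse < NumUses && "chosen use must exist");
  std::vector<unsigned> Latency(NumUses, FullLatency);
  Latency[ZeroLatencyUse] = 0; // the single same-packet (.cur) consumer
  return Latency;
}
```

With three uses and the second one chosen, the sketch yields latencies 2, 0, 2, so no use is ever assigned the 1-cycle distance that would stall.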
* AMDGPU: Remove brev intrinsic | Matt Arsenault | 2016-07-15 | 2 | -6/+0
  llvm-svn: 275620
* AMDGPU: Fix TargetPrefix for remaining r600 intrinsics | Matt Arsenault | 2016-07-15 | 3 | -51/+53
  llvm-svn: 275619
* AMDGPU: Remove AMDGPU.ldexp | Matt Arsenault | 2016-07-15 | 1 | -4/+0
  llvm-svn: 275618
* AMDGPU: Remove legacy rsq.clamped intrinsic | Matt Arsenault | 2016-07-15 | 4 | -15/+7
  Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining.
  Also fix mismatch with non-IEEE rsq selecting to IEEE rsq.
  llvm-svn: 275617
* AMDGPU/R600: Delete dead code. | Matt Arsenault | 2016-07-15 | 2 | -58/+1
  Dead or the same as the base implementation.
  llvm-svn: 275616
* Teach fast isel about the win64 calling convention. | Nico Weber | 2016-07-15 | 1 | -4/+2
  This mostly just works. Vectorcall rets are still not supported.
  The win64_eh test change is because fast isel doesn't use rsi for temporary computations, so it doesn't need to be pushed. The test case I'm changing was originally added to test pushes, but by now there are other test cases in that file exercising that code path.
  https://reviews.llvm.org/D22422
  llvm-svn: 275607
* [Hexagon] Make MI scheduler check for stalls in previous packet on v60 | Krzysztof Parzyszek | 2016-07-15 | 2 | -3/+41
  Patch by Ikhlas Ajbar.
  llvm-svn: 275606
* [PowerPC] Set kill flag for scratch register when spilling the link register | Nemanja Ivanovic | 2016-07-15 | 1 | -1/+1
  This fixes PR 28526.
  llvm-svn: 275603
* Fix calls to SelectionDAG::getStore | Derek Schuff | 2016-07-15 | 1 | -2/+2
  It was refactored in r275592. NFC
  llvm-svn: 275601
* Revert "[AMDGPU] Add metadata for runtime" | Vitaly Buka | 2016-07-15 | 3 | -371/+0
  This reverts commit r275566.
  llvm-svn: 275599
* [Hexagon] Replace postprocessDAG with a more elaborate DAG mutation | Krzysztof Parzyszek | 2016-07-15 | 1 | -10/+76
  llvm-svn: 275598
* [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends. | Justin Lebar | 2016-07-15 | 20 | -1035/+730
  Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags.
  This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad.
  Reviewers: chandlerc, tstellarAMD
  Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits
  Differential Revision: http://reviews.llvm.org/D22249
  llvm-svn: 275592
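The API change in r275592 (several bool parameters replaced by one flags bitset, with a defaulted alignment placed before the flags) follows a general pattern that can be sketched as below. All names here (`MemFlags`, `LoadDesc`, `makeLoad`, `hasFlag`) are hypothetical stand-ins for illustration; they are not the real SelectionDAG or MachineMemOperand declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag set; the names only illustrate the idea and are not
// LLVM's actual enumerators.
enum MemFlags : std::uint8_t {
  MemNone = 0,
  MemVolatile = 1u << 0,
  MemNonTemporal = 1u << 1,
  MemInvariant = 1u << 2,
};

// Combine two flags into one bitset value.
constexpr MemFlags operator|(MemFlags A, MemFlags B) {
  return static_cast<MemFlags>(static_cast<std::uint8_t>(A) |
                               static_cast<std::uint8_t>(B));
}

// True if flag F is present in Set.
constexpr bool hasFlag(MemFlags Set, MemFlags F) { return (Set & F) != 0; }

// Stand-in for whatever a load-building call would record.
struct LoadDesc {
  unsigned Alignment;
  MemFlags Flags;
};

// Alignment comes before the flags and both are defaulted, so a plain load
// needs no trailing arguments, and swapped bool arguments cannot occur.
inline LoadDesc makeLoad(unsigned Alignment = 0, MemFlags Flags = MemNone) {
  assert((Alignment & (Alignment - 1)) == 0 && "alignment must be 0 or 2^n");
  return {Alignment, Flags};
}
```

A call such as `makeLoad(4, MemVolatile | MemInvariant)` names each property at the call site, whereas a positional form like `(true, false, true, 4)` gives the reader and the compiler nothing to check the order against.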
* [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand. | Justin Lebar | 2016-07-15 | 11 | -20/+23
  Summary: Previously we took an unsigned. Hooray for type-safety.
  Reviewers: chandlerc
  Subscribers: dsanders, llvm-commits
  Differential Revision: http://reviews.llvm.org/D22282
  llvm-svn: 275591
* [Hexagon] Add a scheduling DAG mutation | Krzysztof Parzyszek | 2016-07-15 | 4 | -1/+68
  - Remove output dependencies on USR_OVF register.
  - Update chain edge latencies between v60 vector loads/stores.
  llvm-svn: 275586
* [Hexagon] Update instruction itineraries | Krzysztof Parzyszek | 2016-07-15 | 7 | -120/+128
  llvm-svn: 275578
* [Hexagon] Fixes/changes to instruction selection | Krzysztof Parzyszek | 2016-07-15 | 2 | -27/+72
  - Add patterns for rr/abs addressing modes.
  - Set addrMode to PostInc where necessary.
  - Misc fixes.
  llvm-svn: 275574
* [Hexagon] Improve patterns with stack-based addressing | Krzysztof Parzyszek | 2016-07-15 | 5 | -352/+394
  - Treat bitwise OR with a frame index as an ADD wherever possible, fold it into addressing mode.
  - Extend patterns for memops to allow memops with frame indexes as address operands.
  llvm-svn: 275569
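The "OR as ADD" item in r275569 rests on a general bit-level fact: when two values share no set bits, OR and ADD produce the same result, because no carries occur. The sketch below shows only that arithmetic fact; `orIsAdd` and `foldOrToAdd` are hypothetical names, and how the Hexagon patterns actually establish the condition for a frame index is not stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// True when Base | Offset equals Base + Offset: no bit is set in both
// operands, so the addition generates no carries.
constexpr bool orIsAdd(std::uint32_t Base, std::uint32_t Offset) {
  return (Base & Offset) == 0;
}

// Hypothetical fold: rewrite (Base | Offset) as (Base + Offset) once the
// disjoint-bits condition is known to hold.
inline std::uint32_t foldOrToAdd(std::uint32_t Base, std::uint32_t Offset) {
  assert(orIsAdd(Base, Offset) && "OR and ADD differ when bits overlap");
  return Base + Offset;
}
```

For example, an 8-byte-aligned address has its low three bits clear, so OR-ing in an offset below 8 is the same as adding it, and an add can be folded into a base-plus-offset addressing mode.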
* [AMDGPU] Add metadata for runtime | Yaxun Liu | 2016-07-15 | 3 | -0/+371
  Added emitting metadata to elf for runtime.
  Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream.
  Differential Revision: https://reviews.llvm.org/D21849
  llvm-svn: 275566
* Rename AnalyzeBranch* to analyzeBranch*. | Jacques Pienaar | 2016-07-15 | 43 | -93/+94
  Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.
  Reviewers: tstellarAMD, mcrosier
  Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai
  Differential Revision: https://reviews.llvm.org/D22409
  llvm-svn: 275564
* Revert r275141 - Mips: Avoid implicit iterator conversions, NFC | Daniel Sanders | 2016-07-15 | 6 | -51/+57
  It appears to have caused some failures in our buildbots.
  llvm-svn: 275562
* [X86][AVX2] Improve lowerShuffleAsRepeatedMaskAndLanePermute permutation of 64-bit sub-lanes | Simon Pilgrim | 2016-07-15 | 1 | -34/+75
  As discussed on PR28136, lowerShuffleAsRepeatedMaskAndLanePermute was attempting to match repeated masks at the 128-bit level and then permute the resultant lanes at the 128-bit (AVX1) or 64-bit (AVX2) sub-lane level.
  This change allows us to create the repeated masks at the sub-lane level (and then concat them together to create a 128-bit repeated mask) and then select which sub-lane to permute. This has no effect on the AVX1 codegen.
  Fixes PR28136.
  llvm-svn: 275543
* [ARM] Fix build after r275540 | James Molloy | 2016-07-15 | 1 | -6/+6
  A rebase seemed so innocent before committing. Turns out someone changed a pointer to a reference in the meantime :(
  llvm-svn: 275541
* [Thumb-1] Select post-increment load and store where possible | James Molloy | 2016-07-15 | 3 | -4/+83
  Thumb-1 doesn't have post-inc or pre-inc load or store instructions. However the LDM/STM instructions with writeback can function as post-inc load/store:
      ldm r0!, {r1} @ load from r0 into r1 and increment r0 by 4
  Obviously, this only works if the post increment is 4.
  llvm-svn: 275540
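The effect of `ldm r0!, {r1}` described in r275540 can be modelled in a few lines. This is a toy simulation for illustration only: `Machine`, `ldmWriteback`, and `canUseLdmAsPostInc` are hypothetical names, memory is modelled as a vector of 32-bit words, and nothing here reflects how the ARM backend implements the selection.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy machine state: word-granular memory plus the two registers used by
// the example instruction. R0 holds a byte address.
struct Machine {
  std::vector<std::uint32_t> Mem;
  std::uint32_t R0;
  std::uint32_t R1;
};

// Models `ldm r0!, {r1}`: load the word at [r0] into r1, then write r0 + 4
// back to r0. That is exactly a post-increment load with an increment of 4.
inline void ldmWriteback(Machine &M) {
  assert(M.R0 % 4 == 0 && M.R0 / 4 < M.Mem.size() && "bad word address");
  M.R1 = M.Mem[M.R0 / 4];
  M.R0 += 4;
}

// The restriction from the commit message: the trick only applies when the
// requested post-increment is 4.
inline bool canUseLdmAsPostInc(int Increment) { return Increment == 4; }
```

Starting with `R0 = 4` over memory words `{10, 20, 30}`, one call leaves `R1 = 20` and `R0 = 8`, which matches the comment on the assembly line in the commit message.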
* [ARM] Followup to r275537 addressing review comments | James Molloy | 2016-07-15 | 1 | -2/+2
  Address Chad's comment in D22216 which I missed due to tunnel vision on the "LGTM" comment.
  llvm-svn: 275538
* [ARM] Prefer indirect calls in minsize mode | James Molloy | 2016-07-15 | 1 | -28/+42
  ... When we emit several calls to the same function in the same basic block.
  An indirect call uses a "BLX r0" instruction which has a 16-bit encoding. If many calls are made to the same target, this can enable significant code size reductions.
  llvm-svn: 275537
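The size trade-off behind r275537 can be written out as simple arithmetic. The sketch below is an assumption-laden estimate, not the backend's cost model: the 2-byte `BLX r0` comes from the commit message, the 4-byte direct call is an assumption about Thumb `BL`, and `SetupBytes` (the cost of getting the callee address into a register) is a hypothetical parameter the commit message does not quantify.

```cpp
#include <cassert>

// Assumed encoding sizes in bytes. IndirectCallBytes (16-bit BLX r0) is
// stated in the commit message; DirectCallBytes is an assumption.
constexpr unsigned DirectCallBytes = 4;
constexpr unsigned IndirectCallBytes = 2;

// Estimated bytes saved by calling through a register NumCalls times.
// SetupBytes is a hypothetical one-time cost for materializing the address.
inline int bytesSaved(unsigned NumCalls, unsigned SetupBytes) {
  assert(NumCalls > 0 && "need at least one call to compare");
  const int Direct = static_cast<int>(NumCalls * DirectCallBytes);
  const int Indirect =
      static_cast<int>(SetupBytes + NumCalls * IndirectCallBytes);
  return Direct - Indirect;
}

// Indirect calls pay off only when the estimate is strictly positive.
inline bool preferIndirect(unsigned NumCalls, unsigned SetupBytes) {
  return bytesSaved(NumCalls, SetupBytes) > 0;
}
```

Under these assumptions and a 6-byte setup, a single call loses 4 bytes, three calls break even, and four calls save 2 bytes, which is why the change only applies when several calls to the same function appear in one basic block.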
* AMDGPU: Fix not expanding control flow after some kill blocks | Matt Arsenault | 2016-07-15 | 1 | -7/+2
  Also stop trying to insert skip blocks at end_cf. This was inserting them at the end of the block which doesn't make sense. The skip should be inserted at the beginning of the block right after the end cf. Just remove this for now since no tests seem to stress this and I think this can be handled more generally later.
  Fixes bug 28550
  llvm-svn: 275510
* AMDGPU: Fix trying to skip from a block with no successors | Matt Arsenault | 2016-07-15 | 1 | -2/+3
  Found while reducing bug 28550
  llvm-svn: 275509
* AMDGPU: Fix splitting kill blocks with defs before kill | Matt Arsenault | 2016-07-15 | 1 | -13/+3
  llvm-svn: 275508