summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Fix assert when legalizing atomic operandsMatt Arsenault2015-11-053-15/+59
| | | | | | | | | | The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. llvm-svn: 252140
* AMDGPU: Make addr64 atomic operand order consistentMatt Arsenault2015-11-051-2/+2
| | | | | | | vaddr comes before srsrc in every other MUBUF instruction, and is the order it is printed. llvm-svn: 252139
* [WinEH] Fix establisher param reg in CLR funcletsJoseph Tremoulet2015-11-051-9/+20
| | | | | | | | | | | | | | Summary: The CLR's personality routine passes the pointer to the establisher frame in RCX, not RDX. Reviewers: pgavlin, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14343 llvm-svn: 252135
* Go back to producing relocations for out of range symbols.Rafael Espindola2015-11-051-6/+4
| | | | | | | | This brings back the behavior from before r252090 for out of range symbols. Should bring some arm bots back. llvm-svn: 252119
* AMDGPU: Fix typoMatt Arsenault2015-11-051-2/+2
| | | | llvm-svn: 252116
* Slightly saner handling of thumb branches.Rafael Espindola2015-11-041-9/+15
| | | | | | | | The generic infrastructure already did a lot of work to decide if the fixup value is know or not. It doesn't make sense to reimplement a very basic case: same fragment. llvm-svn: 252090
* [x86] Teach the shrink-wrapping hooks to do the proper thing with Win64.Quentin Colombet2015-11-041-0/+8
| | | | | | | | | | Win64 has some strict requirements for the epilogue. As a result, we disable shrink-wrapping for Win64 unless the block that gets the epilogue is already an exit block. Fixes PR24193. llvm-svn: 252088
* Warning fix.Simon Pilgrim2015-11-041-2/+2
| | | | llvm-svn: 252078
* [X86][SSE] Add general memory folding for (V)INSERTPS instructionSimon Pilgrim2015-11-043-58/+79
| | | | | | | | | | | | | | | | This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction. The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening. This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done. It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted. Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll Differential Revision: http://reviews.llvm.org/D13988 llvm-svn: 252074
* [IR] Add bounds checking to paramHasAttrSanjoy Das2015-11-041-4/+6
| | | | | | | | | | | | | | | | | Summary: This is intended to make a later change simpler. Note: adding this bounds checking required fixing `X86FastISel`. As far I can tell I've preserved original behavior but a careful review will be appreciated. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14304 llvm-svn: 252073
* Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of ↵Andrew Kaylor2015-11-042-38/+118
| | | | | | | | | | | | | | scalar FMA intrinsics. Patch by Slava Klochkov The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result. Reviewers: Quentin Colombet, Elena Demikhovsky Differential Revision: http://reviews.llvm.org/D13710 llvm-svn: 252060
* [ARM] Combine CMOV into BFI where possibleJames Molloy2015-11-042-0/+106
| | | | | | | | | | | | | | If we have a CMOV, OR and AND combination such as: if (x & CN) y |= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252057
* [ELF] elfiamcu triple should imply e_machine == EM_IAMCUMichael Kuperstein2015-11-042-4/+22
| | | | | | Differential Revision: http://reviews.llvm.org/D14109 llvm-svn: 252043
* [X86] DAGCombine should not introduce FILD in soft-float modeMichael Kuperstein2015-11-041-2/+2
| | | | | | | The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode. llvm-svn: 252042
* CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.Peter Collingbourne2015-11-032-20/+4
| | | | | | | | | | | | | | | | | A profile of an LTO link of Chrome revealed that we were spending some ~30-50% of execution time in the function Constant::getRelocationInfo(), which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn from TargetMachine::getNameWithPrefix(). It turns out that we only need the result of getKindForGlobal() when targeting Mach-O, so this change moves the relevant part of the logic to TargetLoweringObjectFileMachO. NFCI. Differential Revision: http://reviews.llvm.org/D14168 llvm-svn: 252014
* AMDGPU: Make flat_scratch name consistentMatt Arsenault2015-11-031-3/+3
| | | | | | | The printed name and the parsed assembler names weren't the same. I'm not sure which name SC prints these as, but I think it's this one. llvm-svn: 252010
* AMDGPU: Fix asserts on invalid register rangesMatt Arsenault2015-11-031-5/+13
| | | | | | | | | If the requested SGPR was not actually aligned, it was accepted and rounded down instead of rejected. Also fix an assert if the range is an invalid size. llvm-svn: 252009
* AMDGPU: Fix off by one error in register parsingMatt Arsenault2015-11-031-4/+5
| | | | | | If trying to use one past the end, this would assert. llvm-svn: 252008
* Align whitespaceDerek Schuff2015-11-032-4/+4
| | | | llvm-svn: 252003
* [WebAssembly] Support wasm select operatorDerek Schuff2015-11-032-0/+10
| | | | | | | | | | | | | | Summary: Add support for wasm's select operator, and lower LLVM's select DAG node to it. Reviewers: sunfish Subscribers: dschuff, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D14295 llvm-svn: 252002
* AMDGPU: s[102:103] is unavailable on VIMatt Arsenault2015-11-031-1/+10
| | | | llvm-svn: 252000
* AMDGPU: Define correct number of SGPRsMatt Arsenault2015-11-032-6/+10
| | | | | | | | | There are actually 104 so 2 were missing. More assembler tests with high register number tuples will be included in later patches. llvm-svn: 251999
* AMDGPU: Make findUsedSGPR more readableMatt Arsenault2015-11-031-7/+18
| | | | | | Add more comments etc. llvm-svn: 251996
* AMDGPU: Initialize SIFixSGPRCopies so -print-after worksMatt Arsenault2015-11-033-8/+15
| | | | llvm-svn: 251995
* AMDGPU: Alphabetize includesMatt Arsenault2015-11-031-1/+1
| | | | llvm-svn: 251994
* [X86][XOP] Add support for the matching of the VPCMOV bit select instructionSimon Pilgrim2015-11-031-0/+10
| | | | | | | | | | XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ) This patch adds tablegen pattern matching for this instruction. Differential Revision: http://reviews.llvm.org/D8841 llvm-svn: 251975
* [X86] Generate .cfi_adjust_cfa_offset correctly when pushing argumentsMichael Kuperstein2015-11-033-27/+64
| | | | | | | | | | | | | | | When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904
* AVX512: add encoding tests for vmovq/d instructions.Igor Breger2015-11-031-1/+1
| | | | llvm-svn: 251903
* Fix build problme introduced in r251883Matthias Braun2015-11-031-1/+1
| | | | llvm-svn: 251888
* ScheduleDAGInstrs: Remove IsPostRA flag; NFCMatthias Braun2015-11-031-2/+1
| | | | | | | | | | | | | | | | | ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function. The order of the LiveIntervals* and bool RemoveKillFlags paramters have been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags. Differential Revision: http://reviews.llvm.org/D14245 llvm-svn: 251883
* [Hexagon] Fixing mistaken case fallthrough.Colin LeMahieu2015-11-031-0/+1
| | | | llvm-svn: 251867
* AMDGPU: Stop assuming vreg for build_vectorMatt Arsenault2015-11-022-20/+40
| | | | | | | | | | | | | This was causing a variety of test failures when v2i64 is added as a legal type. SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence. llvm-svn: 251860
* [WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinterDerek Schuff2015-11-021-1/+1
| | | | llvm-svn: 251859
* AMDGPU: Error on graphics shaders with HSAMatt Arsenault2015-11-021-0/+8
| | | | | | | | I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858
* AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCEMatt Arsenault2015-11-021-23/+89
| | | | | | | Make the REG_SEQUENCE be a VGPR, and do the register class copy first. llvm-svn: 251855
* [PPC64LE] Properly initialize instr-info in PPCVSXSwapRemoval passBill Schmidt2015-11-021-1/+1
| | | | | | | | Replace some hacky code with the proper way to get at this data. No functional change. llvm-svn: 251848
* WatchOS: update default CPU for triple after t2dsp -> dsp renameTim Northover2015-11-021-2/+2
| | | | llvm-svn: 251814
* Fix for bootstrap bug introduced in r244921Nemanja Ivanovic2015-11-022-3/+2
| | | | | | | | | | This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. It turns out that the new code path taken due to legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a micro optimization to change a load followed by a scalar_to_vector into a load and splat instruction on PPC. llvm-svn: 251798
* AVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and ↵Igor Breger2015-11-023-0/+54
| | | | | | | | VBROADCASTF32x2 instructions. Differential Revision: http://reviews.llvm.org/D14216 llvm-svn: 251781
* [X86] Remove assertions that check for valid scale values on scatter/gather ↵Craig Topper2015-11-021-8/+0
| | | | | | intrinsics. Nothing upstream prevented illegal values from getting here. llvm-svn: 251780
* [X86] Fold 'if' followed by just an llvm_unreachable into an assert.Craig Topper2015-11-021-8/+7
| | | | llvm-svn: 251778
* [X86] Use isa instead of dyn_cast in a bool context. NFCCraig Topper2015-11-021-2/+2
| | | | llvm-svn: 251777
* [X86] Remove some llvm_unreachables after switches that already have an ↵Craig Topper2015-11-021-5/+3
| | | | | | unreachable in their default case. llvm-svn: 251776
* [X86] Remove a 'break' after an llvm_unreachable.Craig Topper2015-11-021-3/+1
| | | | llvm-svn: 251775
* [X86] Use cast instead of dyn_cast and a null check marked unreachable.Craig Topper2015-11-021-8/+3
| | | | llvm-svn: 251774
* [X86] Use MVT instead of EVT when the type is known to be simple. NFCCraig Topper2015-11-022-89/+80
| | | | llvm-svn: 251772
* Untabify.NAKAMURA Takumi2015-11-023-7/+7
| | | | llvm-svn: 251769
* AVX-512: Optimized SIMD truncate operations for AVX512F set.Elena Demikhovsky2015-11-012-3/+23
| | | | | | | | | | | | Optimized <8 x i32> to <8 x i16> <4 x i64> to < 4 x i32> <16 x i16> to <16 x i8> All these oprtrations use now AVX512F set (KNL). Before this change it was implemented with AVX2 set. Differential Revision: http://reviews.llvm.org/D14108 llvm-svn: 251764
* [X86] Replace getScalarType with getVectorElementType when the type is ↵Craig Topper2015-10-311-29/+29
| | | | | | already known to be a vector. This should result in slightly less code. NFC llvm-svn: 251751
* [X86] Convert to MVT instead of calling EVT functions since we already know ↵Craig Topper2015-10-311-2/+2
| | | | | | the type is simple. NFC llvm-svn: 251745
OpenPOWER on IntegriCloud