summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [mips][ias] Range check uimmz operands.Daniel Sanders2015-11-062-2/+34
| | | | | | | | | | Reviewers: vkalintiris Subscribers: dsanders, atanasyan, llvm-commits Differential Revision: http://reviews.llvm.org/D14013 llvm-svn: 252294
* [mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes.Vasileios Kalintiris2015-11-063-4/+26
| | | | | | | | | | | | | | | | Summary: Without these patterns we would generate a complete LL/SC sequence. This would be problematic for memory regions marked as WRITE-only or READ-only, as the instructions LL/SC would read/write to the protected memory regions correspondingly. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14397 llvm-svn: 252293
* AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNELTom Stellard2015-11-066-0/+60
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291
* [WinEH] Split EH_RESTORE out of CATCHRET for 32-bit EHReid Kleckner2015-11-066-52/+90
| | | | | | | | | | | | | | | | | | | | | | | This adds the EH_RESTORE x86 pseudo instr, which is responsible for restoring the stack pointers: EBP and ESP, and ESI if stack realignment is involved. We only need this on 32-bit x86, because on x64 the runtime restores CSRs for us. Previously we had to keep the CATCHRET instruction around during SEH so that we could convince X86FrameLowering to restore our frame pointers. Now we can split these instructions earlier. This was confusing, because we had a return instruction which wasn't really a return and was ultimately going to be removed by X86FrameLowering. This change also simplifies X86FrameLowering, which really shouldn't be building new MBBs. No observable functional change currently, but with the new register mask stuff in D14407, CATCHRET will become a register allocator barrier, and our existing tests rely on us having reasonable register allocation around SEH. llvm-svn: 252266
* Remove windows line endings introduced by r252177. NFC.Tim Northover2015-11-059-128/+128
| | | | llvm-svn: 252217
* [WinEH] Fix funclet prologues with stack realignmentReid Kleckner2015-11-053-34/+53
| | | | | | | | | | | | | | We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210
* [WebAssembly] Fix copypasta.Dan Gohman2015-11-052-3/+3
| | | | | | Noticed by dschff in http://reviews.llvm.org/rL252203 llvm-svn: 252208
* [WebAssembly] Rename Immediate instructions to Const.Dan Gohman2015-11-052-16/+16
| | | | | | This more closely reflects the naming convention in the spec. llvm-svn: 252204
* [WebAssembly] Add AsmString strings for most instructions.Dan Gohman2015-11-057-135/+212
| | | | | | | | | Mangling type information into MachineInstr opcode names was a temporary measure, and it's starting to get hairy. At the same time, the MC instruction printer wants to use AsmString strings for printing. This patch takes the first step, starting the process of adding AsmStrings for instructions. llvm-svn: 252203
* [WebAssembly] Update wasm builtin functions to match spec changes.Dan Gohman2015-11-051-15/+7
| | | | | | | The page_size operator has been removed from the spec, and the resize_memory operator has been changed to grow_memory. llvm-svn: 252202
* replace MachineCombinerPattern namespace and enum with enum class; NFCISanjay Patel2015-11-054-36/+36
| | | | | | | | Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196
* [WebAssembly] Add WebAssemblyMCInstLower.cpp.Dan Gohman2015-11-055-4/+168
| | | | | | | This isn't used yet; it's just a start towards eventually using MC to do instruction printing, and eventually binary encoding. llvm-svn: 252194
* [DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268.Oleg Ranevskyy2015-11-059-116/+131
| | | | | | | | | | | | | | | | | | | Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request - rebases D11268 onto the new trunk; - resolves the merge conflicts; - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct. Reviewers: echristo, rengolin, kubabrecka Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252177
* Add cfi instr for CFA calculation when movpc is expanded to call and popPetar Jovanovic2015-11-051-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the issue of wrong CFA calculation in the following case: 0x08048400 <+0>: push %ebx 0x08048401 <+1>: sub $0x8,%esp 0x08048404 <+4>: **call 0x8048409 <test+9>** 0x08048409 <+9>: **pop %eax** 0x0804840a <+10>: add $0x1bf7,%eax 0x08048410 <+16>: mov %eax,%ebx 0x08048412 <+18>: call 0x80483f0 <bar> 0x08048417 <+23>: add $0x8,%esp 0x0804841a <+26>: pop %ebx 0x0804841b <+27>: ret The highlighted instructions are a product of movpc instruction. The call instruction changes the stack pointer, and pop instruction restores its value. However, the rule for computing CFA is not updated and is wrong on the pop instruction. So, e.g. backtrace in gdb does not work when on the pop instruction. This adds cfi instructions for both call and pop instructions. cfi_adjust_cfa_offset** instruction is used with the appropriate offset for setting the rules to calculate CFA correctly. Patch by Violeta Vukobrat. Differential Revision: http://reviews.llvm.org/D14021 llvm-svn: 252176
* [WebAssembly] Rename ior operator to or to match the specDerek Schuff2015-11-051-1/+1
| | | | | | | | | | | | Summary: The spec uses "or" for inclusive-or and "xor" for exclusive-or Reviewers: sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D14362 llvm-svn: 252174
* [ARM] Compute known bits for ARMISD::CMOVJames Molloy2015-11-051-0/+10
| | | | | | | | | | We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities. I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :( This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case. llvm-svn: 252168
* revert rev. 252153 due to build failure on ubuntuAsaf Badouh2015-11-054-101/+1
| | | | | | [X86][AVX512] add comi with Sae llvm-svn: 252154
* [X86][AVX512] add comi with SaeAsaf Badouh2015-11-054-1/+101
| | | | | | | | add builtin_ia32_vcomisd and builtin_ia32_vcomisd Differential Revision: http://reviews.llvm.org/D14331 llvm-svn: 252153
* [X86][AVX512] small bugfix in VPBROADCASTMAsaf Badouh2015-11-051-2/+2
| | | | | | | | VPBROADCASTMW2D and VPBROADCASTMB2Q Differential Revision: http://reviews.llvm.org/D14335 llvm-svn: 252151
* AMDGPU: Also track whether SGPRs were spilledMatt Arsenault2015-11-053-2/+20
| | | | llvm-svn: 252145
* AMDGPU: Print number user SGPRsMatt Arsenault2015-11-051-0/+6
| | | | | | | This doesn't quite match how SC prints it, which doesn't put it in a comment. llvm-svn: 252144
* AMDGPU: Disallow s[102:103] on VI in assemblerMatt Arsenault2015-11-051-2/+28
| | | | llvm-svn: 252142
* AMDGPU: Fix assert when legalizing atomic operandsMatt Arsenault2015-11-053-15/+59
| | | | | | | | | | The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. llvm-svn: 252140
* AMDGPU: Make addr64 atomic operand order consistentMatt Arsenault2015-11-051-2/+2
| | | | | | | vaddr comes before srsrc in every other MUBUF instruction, and is the order it is printed. llvm-svn: 252139
* [WinEH] Fix establisher param reg in CLR funcletsJoseph Tremoulet2015-11-051-9/+20
| | | | | | | | | | | | | | Summary: The CLR's personality routine passes the pointer to the establisher frame in RCX, not RDX. Reviewers: pgavlin, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14343 llvm-svn: 252135
* Go back to producing relocations for out of range symbols.Rafael Espindola2015-11-051-6/+4
| | | | | | | | This brings back the behavior from before r252090 for out of range symbols. Should bring some arm bots back. llvm-svn: 252119
* AMDGPU: Fix typoMatt Arsenault2015-11-051-2/+2
| | | | llvm-svn: 252116
* Slightly saner handling of thumb branches.Rafael Espindola2015-11-041-9/+15
| | | | | | | | The generic infrastructure already did a lot of work to decide if the fixup value is know or not. It doesn't make sense to reimplement a very basic case: same fragment. llvm-svn: 252090
* [x86] Teach the shrink-wrapping hooks to do the proper thing with Win64.Quentin Colombet2015-11-041-0/+8
| | | | | | | | | | Win64 has some strict requirements for the epilogue. As a result, we disable shrink-wrapping for Win64 unless the block that gets the epilogue is already an exit block. Fixes PR24193. llvm-svn: 252088
* Warning fix.Simon Pilgrim2015-11-041-2/+2
| | | | llvm-svn: 252078
* [X86][SSE] Add general memory folding for (V)INSERTPS instructionSimon Pilgrim2015-11-043-58/+79
| | | | | | | | | | | | | | | | This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction. The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening. This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done. It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted. Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll Differential Revision: http://reviews.llvm.org/D13988 llvm-svn: 252074
* [IR] Add bounds checking to paramHasAttrSanjoy Das2015-11-041-4/+6
| | | | | | | | | | | | | | | | | Summary: This is intended to make a later change simpler. Note: adding this bounds checking required fixing `X86FastISel`. As far I can tell I've preserved original behavior but a careful review will be appreciated. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14304 llvm-svn: 252073
* Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of ↵Andrew Kaylor2015-11-042-38/+118
| | | | | | | | | | | | | | scalar FMA intrinsics. Patch by Slava Klochkov The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result. Reviewers: Quentin Colombet, Elena Demikhovsky Differential Revision: http://reviews.llvm.org/D13710 llvm-svn: 252060
* [ARM] Combine CMOV into BFI where possibleJames Molloy2015-11-042-0/+106
| | | | | | | | | | | | | | If we have a CMOV, OR and AND combination such as: if (x & CN) y |= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252057
* [ELF] elfiamcu triple should imply e_machine == EM_IAMCUMichael Kuperstein2015-11-042-4/+22
| | | | | | Differential Revision: http://reviews.llvm.org/D14109 llvm-svn: 252043
* [X86] DAGCombine should not introduce FILD in soft-float modeMichael Kuperstein2015-11-041-2/+2
| | | | | | | The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode. llvm-svn: 252042
* CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.Peter Collingbourne2015-11-032-20/+4
| | | | | | | | | | | | | | | | | A profile of an LTO link of Chrome revealed that we were spending some ~30-50% of execution time in the function Constant::getRelocationInfo(), which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn from TargetMachine::getNameWithPrefix(). It turns out that we only need the result of getKindForGlobal() when targeting Mach-O, so this change moves the relevant part of the logic to TargetLoweringObjectFileMachO. NFCI. Differential Revision: http://reviews.llvm.org/D14168 llvm-svn: 252014
* AMDGPU: Make flat_scratch name consistentMatt Arsenault2015-11-031-3/+3
| | | | | | | The printed name and the parsed assembler names weren't the same. I'm not sure which name SC prints these as, but I think it's this one. llvm-svn: 252010
* AMDGPU: Fix asserts on invalid register rangesMatt Arsenault2015-11-031-5/+13
| | | | | | | | | If the requested SGPR was not actually aligned, it was accepted and rounded down instead of rejected. Also fix an assert if the range is an invalid size. llvm-svn: 252009
* AMDGPU: Fix off by one error in register parsingMatt Arsenault2015-11-031-4/+5
| | | | | | If trying to use one past the end, this would assert. llvm-svn: 252008
* Align whitespaceDerek Schuff2015-11-032-4/+4
| | | | llvm-svn: 252003
* [WebAssembly] Support wasm select operatorDerek Schuff2015-11-032-0/+10
| | | | | | | | | | | | | | Summary: Add support for wasm's select operator, and lower LLVM's select DAG node to it. Reviewers: sunfish Subscribers: dschuff, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D14295 llvm-svn: 252002
* AMDGPU: s[102:103] is unavailable on VIMatt Arsenault2015-11-031-1/+10
| | | | llvm-svn: 252000
* AMDGPU: Define correct number of SGPRsMatt Arsenault2015-11-032-6/+10
| | | | | | | | | There are actually 104 so 2 were missing. More assembler tests with high register number tuples will be included in later patches. llvm-svn: 251999
* AMDGPU: Make findUsedSGPR more readableMatt Arsenault2015-11-031-7/+18
| | | | | | Add more comments etc. llvm-svn: 251996
* AMDGPU: Initialize SIFixSGPRCopies so -print-after worksMatt Arsenault2015-11-033-8/+15
| | | | llvm-svn: 251995
* AMDGPU: Alphabetize includesMatt Arsenault2015-11-031-1/+1
| | | | llvm-svn: 251994
* [X86][XOP] Add support for the matching of the VPCMOV bit select instructionSimon Pilgrim2015-11-031-0/+10
| | | | | | | | | | XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ) This patch adds tablegen pattern matching for this instruction. Differential Revision: http://reviews.llvm.org/D8841 llvm-svn: 251975
* [X86] Generate .cfi_adjust_cfa_offset correctly when pushing argumentsMichael Kuperstein2015-11-033-27/+64
| | | | | | | | | | | | | | | When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904
* AVX512: add encoding tests for vmovq/d instructions.Igor Breger2015-11-031-1/+1
| | | | llvm-svn: 251903
OpenPOWER on IntegriCloud