path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Revert "[Hexagon] Recognize polynomial-modulo loop idiom again" | Vitaly Buka | 2017-03-21 | 1 | -700/+17
  Fix memory leaks on check-llvm tests detected by Asan. This reverts commit r298282.
  llvm-svn: 298329
* [ARM] Revert r297443 and r297820. | Eli Friedman | 2017-03-21 | 3 | -167/+31
  The glueless lowering of addc/adde in Thumb1 has known serious miscompiles (see https://reviews.llvm.org/D31081), and r297820 causes an infinite loop for certain constructs. It's not clear when they will be fixed, so let's just take them out of the tree for now.
  (I resolved a small conflict with r297453.)
  llvm-svn: 298328
* [ARM] Fix PR32130: Handle promotion of zero sized constants. | Vadzim Dambrouski | 2017-03-20 | 1 | -1/+2
  The special case of zero sized values was previously not handled correctly. This patch handles this by not promoting if the size is zero.
  Patch by Tim Neumann.
  Differential Revision: https://reviews.llvm.org/D31116
  llvm-svn: 298320
* [Fuchsia] Use %gs for ABI slots under -mcmodel=kernel | Evgeniy Stepanov | 2017-03-20 | 1 | -2/+2
  Make x86_64-fuchsia targets under -mcmodel=kernel use %gs rather than %fs to access ABI slots for stack-protector and safe-stack.
  Patch by Roland McGrath.
  Differential Revision: https://reviews.llvm.org/D30870
  llvm-svn: 298302
* [Hexagon] Recognize polynomial-modulo loop idiom again | Krzysztof Parzyszek | 2017-03-20 | 1 | -17/+700
  Regain the ability to recognize loops calculating a polynomial modulo operation. This ability was lost due to some changes in the preceding optimizations. Add code to preprocess the IR to a form that the pattern matching code can recognize.
  llvm-svn: 298282
* [AMDGPU] Run always inliner early in opt | Konstantin Zhuravlyov | 2017-03-20 | 1 | -0/+1
  Differential Revision: https://reviews.llvm.org/D31141
  llvm-svn: 298281
* [AMDGPU][MC] Fix for Bugs 28201, 28199, 28170 + LIT tests | Dmitry Preobrazhensky | 2017-03-20 | 1 | -8/+34
  This fix enables sp3 abs modifier with constants.
  Reviewers: artem.tamazov
  Differential Revision: https://reviews.llvm.org/D30825
  llvm-svn: 298265
* [Outliner] ACTUALLY remove the errs output | Jessica Paquette | 2017-03-20 | 1 | -1/+1
  I don't know how to type. This fixes the last commit which would have made all of the overflows legal, and kept the screaming.
  llvm-svn: 298263
* [Outliner] Remove output for offset range check | Jessica Paquette | 2017-03-20 | 1 | -3/+1
  Forgot to remove some output before committing last time. (Instruction fixups don't actually overflow anywhere in the test suite so far, so I missed it.) To prevent the outliner from screaming "Overflow!" in the event that that does happen, this commit removes that output.
  llvm-svn: 298260
* [AMDGPU][MC] Fix for Bugs 28200, 28202 + LIT tests | Dmitry Preobrazhensky | 2017-03-20 | 2 | -25/+108
  Fixed several related issues with VOP3 fp modifiers.
  Reviewers: artem.tamazov
  Differential Revision: https://reviews.llvm.org/D30821
  llvm-svn: 298255
* [GlobalISel] Use the correct calling conv for calls | Diana Picus | 2017-03-20 | 4 | -10/+8
  This commit adds a parameter that lets us pass in the calling convention of the call to CallLowering::lowerCall. This allows us to handle situations where the calling convention of the callee is different from that of the caller.
  Differential Revision: https://reviews.llvm.org/D31039
  llvm-svn: 298254
* Revert "[AMDGPU] Run always inliner early in opt" | Konstantin Zhuravlyov | 2017-03-20 | 1 | -1/+0
  This reverts commit r297958, it breaks device-libs build.
  llvm-svn: 298239
* [AVX-512] Handle kor/kand/kandn/kxor/kxnor/knot intrinsics at lowering time instead of isel | Craig Topper | 2017-03-19 | 3 | -26/+40
  Summary: Currently we handle these intrinsics at isel with special patterns. But as they just map to normal logic operations, we should just handle them at lowering. This will expose them to DAG combine optimizations. Right now the kor-sequence test generates a bunch of regclass copies between GR16 and VK16 that the peephole optimizer and/or register coalescing are removing to keep everything in the mask domain. By handling the logic op intrinsics earlier, these copies become bitcasts in the DAG and get removed by DAG combine, which seems more robust.
  This should help enable my plan to stop copying between K registers and GR8/GR16. The peephole optimizer can't remove a chain of copies between K and GR32 with insert_subreg/extract_subreg present in the chain, so the kor-sequence test breaks. But this patch should dodge the problem entirely.
  Reviewers: zvi, delena, RKSimon, igorb
  Reviewed By: igorb
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31056
  llvm-svn: 298228
* Fix MSVC warning: "switch statement contains 'default' but no 'case' labels". NFCI. | Simon Pilgrim | 2017-03-19 | 1 | -4/+1
  llvm-svn: 298225
* [MIR] Support Customed Register Mask and CSRs | Oren Ben Simhon | 2017-03-19 | 1 | -2/+0
  The MIR printer dumps a string that describes the register mask of a function. A static predefined list of register masks matches a static list of strings. However, when the register mask is not from the static predefined list, there is no descriptor string and the printer fails. This patch adds support for custom register mask printing and dumping. Also, the list of callee saved registers (describing the registers that must be preserved for the caller) might be dynamic. As such, this data needs to be dumped and parsed back to the Machine Register Info.
  Differential Revision: https://reviews.llvm.org/D30971
  llvm-svn: 298207
* ExecutionDepsFix: Let targets specialize the pass; NFC | Matthias Braun | 2017-03-18 | 2 | -3/+38
  Let targets specialize the pass with the register class so we can get a parameterless default constructor and can put the pass into the pass registry to enable testing with -run-pass=.
  llvm-svn: 298184
* ExecutionDepsFix: Normalize names; NFC | Matthias Braun | 2017-03-18 | 6 | -9/+9
  Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and ExecutionDepsFix to the last one.
  llvm-svn: 298183
* Make library calls sensitive to regparm module flag (Fixes PR3997). | Nirav Dave | 2017-03-18 | 13 | -30/+69
  Reviewers: mkuper, rnk
  Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin
  Differential Revision: https://reviews.llvm.org/D27050
  llvm-svn: 298179
* Capitalize ArgListEntry fields. NFC. | Nirav Dave | 2017-03-18 | 8 | -38/+41
  llvm-svn: 298178
* [AMDGPU] Add address space based alias analysis pass | Stanislav Mekhanoshin | 2017-03-17 | 5 | -0/+223
  This is a direct port of the HSAILAliasAnalysis pass, just cleaned up for style and renamed.
  Differential Revision: https://reviews.llvm.org/D31103
  llvm-svn: 298172
* [Outliner] Add outliner for AArch64 | Jessica Paquette | 2017-03-17 | 2 | -11/+303
  This commit adds the necessary target hooks for outlining in AArch64. It also refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a more general function, `getMemOpInfo`. This allows the outliner to share that code without copying and pasting it.
  The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with the X86-64 outliner.
  The test for this pass verifies that the outliner does, in fact, outline functions, fixes up the stack accesses properly, and can correctly generate a tail call. In the future, this test should be replaced with a MIR test, so that we can properly test immediate offset overflows in fixed-up instructions.
  llvm-svn: 298162
* AMDGPU: Fix broken condition in hazard recognizer | Matt Arsenault | 2017-03-17 | 3 | -17/+25
  Fixes bug 32248.
  llvm-svn: 298125
* AMDGPU: Fix handling of constant phi input loop conditions | Matt Arsenault | 2017-03-17 | 1 | -5/+8
  If the loop condition was an i1 phi with a constantexpr input, this would add a loop intrinsic fed by a phi dependent on a call to if.break in the same block. Insert the call in the loop header.
  llvm-svn: 298121
* AMDGPU: Cleanup control flow intrinsics | Matt Arsenault | 2017-03-17 | 10 | -106/+80
  Move backend internal intrinsics along with the rest of the normal intrinsics, and use the Intrinsic::getDeclaration API instead of manually constructing the type list.
  It's surprising this was working before. fdiv.fast had the wrong number of parameters. The control flow intrinsic declaration attributes were not being applied, and their types were inconsistent. The actual IR use types did not match the declaration, and were closer to the types used for the patterns. The brcond lowering was changing the types, so introduce new nodes for those.
  llvm-svn: 298119
* [x86] clean up setcc with negated operand transform and add missing test; NFCI | Sanjay Patel | 2017-03-17 | 1 | -14/+15
  llvm-svn: 298118
* [X86] Emit fewer instructions to allocate >16GB stack frames | Reid Kleckner | 2017-03-17 | 1 | -37/+66
  Summary: Use this code pattern when RAX is live, instead of emitting up to 2 billion adjustments:
      pushq %rax
      movabsq +-$Offset+-8, %rax
      addq %rsp, %rax
      xchg %rax, (%rsp)
      movq (%rsp), %rsp
  Try to clean this code up a bit while I'm here. In particular, hoist the logic that handles the entire adjustment with `movabsq $imm, %rax` out of the loop.
  This negates the offset in the prologue and uses ADD because X86 only has a two operand subtract which always subtracts from the destination register, which can no longer be RSP.
  Fixes PR31962
  Reviewers: majnemer, sdardis
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30052
  llvm-svn: 298116
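  The five-instruction pattern in the entry above can be traced with a small register-and-memory model. This is only a sketch, not LLVM's frame-lowering code: `ToyMachine` and `allocateWithLiveRax` are hypothetical names, and the prologue immediate is assumed to be `8 - Offset` (the negated offset, corrected for the 8 bytes already pushed), which is what the quoted sequence needs in order to end with RSP lowered by exactly `Offset`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical toy model: two registers plus a sparse memory of 64-bit words.
struct ToyMachine {
  uint64_t Rsp = 0;
  uint64_t Rax = 0;
  std::map<uint64_t, uint64_t> Mem;
};

// Lower RSP by Offset bytes while preserving the live value in RAX.
// Assumption: the immediate is 8 - Offset; unsigned wraparound stands in
// for two's-complement arithmetic on a 64-bit register.
inline void allocateWithLiveRax(ToyMachine &M, uint64_t Offset) {
  // pushq %rax
  M.Rsp -= 8;
  M.Mem[M.Rsp] = M.Rax;
  // movabsq $(8 - Offset), %rax
  M.Rax = uint64_t{8} - Offset;
  // addq %rsp, %rax      (RAX now holds the final stack pointer)
  M.Rax += M.Rsp;
  // xchg %rax, (%rsp)    (restore RAX, park the new stack pointer in memory)
  std::swap(M.Rax, M.Mem[M.Rsp]);
  // movq (%rsp), %rsp    (load the new stack pointer)
  M.Rsp = M.Mem[M.Rsp];
}
```

  Starting from a stack pointer S, the model ends with RSP equal to S - Offset and RAX unchanged, using one 64-bit immediate instead of a chain of 32-bit adjustments.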
* [x86] avoid adc/sbb assert when both sides of add are zexted (PR32316) | Sanjay Patel | 2017-03-17 | 1 | -2/+6
  As noted in the comment, we might want to account for this case, but I didn't look at what that would mean for the asm. I'm also not sure why this only reproduces with avx512, but I'm putting a conservative fix in for now to avoid the crash.
  Also, if both sides of an add are zexted, shouldn't we shrink that add?
  https://bugs.llvm.org/show_bug.cgi?id=32316
  llvm-svn: 298107
* Fix wasm build after arg_begin iterator type change | Reid Kleckner | 2017-03-17 | 1 | -1/+1
  llvm-svn: 298106
* Only unswitch loops with uniform conditions | Stanislav Mekhanoshin | 2017-03-17 | 1 | -0/+2
  Loop unswitching can be extremely harmful for a SIMT target. If the hoisted condition is not uniform, a SIMT machine will execute both clones of a loop sequentially. Therefore LoopUnswitch checks if the condition is non-divergent.
  Since DivergenceAnalysis adds an expensive PostDominatorTree analysis not needed for non-SIMT targets, a new option is added to avoid unneeded analysis initialization. The method getAnalysisUsage is called when TargetTransformInfo is not yet available and we cannot use it here. For that reason a new field DivergentTarget is added to PassManagerBuilder to control the behavior, and this field is set from a target.
  Differential Revision: https://reviews.llvm.org/D30796
  llvm-svn: 298104
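  As a rough illustration of the entry above, the sketch below shows what unswitching does to a loop and why a non-uniform condition hurts on a SIMT machine. It is not the LoopUnswitch pass: the function names are hypothetical, and `loopClonesExecuted` is a deliberately crude cost model that assumes one boolean per lane.

```cpp
#include <cassert>
#include <vector>

// Original loop: the loop-invariant condition is re-tested on every iteration.
inline long sumOriginal(const std::vector<int> &V, bool Negate) {
  long Sum = 0;
  for (int X : V) {
    if (Negate)
      Sum -= X;
    else
      Sum += X;
  }
  return Sum;
}

// Unswitched form: the condition is hoisted and the loop is cloned per branch.
inline long sumUnswitched(const std::vector<int> &V, bool Negate) {
  long Sum = 0;
  if (Negate) {
    for (int X : V)
      Sum -= X;
  } else {
    for (int X : V)
      Sum += X;
  }
  return Sum;
}

// Toy SIMT cost model (assumption): if the lanes disagree on the hoisted
// condition, both loop clones run one after the other. An empty lane set
// is treated as uniform.
inline int loopClonesExecuted(const std::vector<bool> &LaneConds) {
  bool AnyTrue = false, AnyFalse = false;
  for (bool C : LaneConds) {
    if (C)
      AnyTrue = true;
    else
      AnyFalse = true;
  }
  return (AnyTrue && AnyFalse) ? 2 : 1;
}
```

  With a uniform condition every lane takes the same clone, so unswitching only removes the per-iteration branch; with a divergent condition the model reports two sequential loop executions, which is the case the commit avoids.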
* [AArch64] Use alias analysis in the load/store optimization pass. | Chad Rosier | 2017-03-17 | 1 | -7/+14
  This allows the optimization to rearrange loads and stores more aggressively.
  Differential Revision: http://reviews.llvm.org/D30903
  llvm-svn: 298092
* [ARM] Fix triple format in test branch disassemble test | Andre Vieira | 2017-03-17 | 1 | -4/+26
  Fixing triple format in the tests added for the branch label fix for Thumb Targets. Also recommitting previously approved patch, see https://reviews.llvm.org/D30943.
  Reviewed by: samparker
  Differential Revision: https://reviews.llvm.org/D30987
  llvm-svn: 298056
* [AVX-512] Make VEX encoded FMA instructions available when AVX512 is enabled regardless of whether +fma was added on the command line. | Craig Topper | 2017-03-17 | 1 | -2/+2
  We weren't able to handle isel of the 128/256-bit FMA instructions when AVX512F was enabled but VLX and FMA weren't.
  I didn't make FeatureAVX512 imply FeatureFMA as I wasn't sure I wanted disabling FMA to also disable AVX512. Instead we just can't prevent FMA instructions if AVX512 is enabled.
  Another option would be to promote 128/256-bit to 512-bit, do the operation and extract it. But that requires a lot of extra isel patterns. Since no CPUs exist that support AVX512 but not FMA, just using the VEX instructions seems better.
  llvm-svn: 298051
* [X86] Remove unused predicate. NFC | Craig Topper | 2017-03-17 | 1 | -1/+0
  llvm-svn: 298050
* [SystemZ] Add use of super-reg in splitMove() | Jonas Paulsson | 2017-03-17 | 1 | -1/+14
  If one of the subregs of the 128 bit reg is undefined when splitMove() splits a store into two instructions, a use of an undefined physical register results.
  To remedy this, an implicit use of the super register is added onto both new instructions, along with propagated kill and undef flags.
  This was discovered with llvm-stress, and that test case is attached as test/CodeGen/SystemZ/splitMove_undefReg_mverifier.ll
  Thanks to Matthias Braun for helping with a nice explanation.
  Review: Ulrich Weigand
  llvm-svn: 298047
* [AVX-512] Give priority to EVEX encoded scalar FMA instructions when we have FMA, AVX512 and no VLX. | Craig Topper | 2017-03-17 | 1 | -7/+9
  We were giving priority if VLX was enabled.
  llvm-svn: 298046
* [X86] Cleanup the AddedComplexity values on move immediate instructions. NFC | Craig Topper | 2017-03-17 | 2 | -8/+10
  This makes the values a little more consistent between similar instructions and reduces some of the values. This results in better grouping in the isel table, saving a few bytes.
  llvm-svn: 298043
* Remove LessPreciseFPMADOption from TargetOptions along with all of the | Eric Christopher | 2017-03-17 | 1 | -1/+0
  associated command line options and functions - it's currently unused in all of llvm and clang other than being set and reset.
  llvm-svn: 298023
* [ARM] Use alias analysis in ARMPreAllocLoadStoreOpt. | Eli Friedman | 2017-03-17 | 1 | -16/+14
  This allows the optimization to rearrange loads and stores more aggressively. This doesn't really affect performance, but it helps codesize.
  Differential Revision: https://reviews.llvm.org/D30839
  llvm-svn: 298021
* clean Lanai namespace | Jacques Pienaar | 2017-03-16 | 2 | -4/+4
  Summary: This patch cleans the namespace of the Lanai target.
  Reviewers: jpienaar
  Reviewed By: jpienaar
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30955
  llvm-svn: 298015
* Remove getArgumentList() in favor of arg_begin(), args(), etc | Reid Kleckner | 2017-03-16 | 6 | -9/+7
  Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time.
  In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()).
  Reviewed By: chandlerc
  Differential Revision: https://reviews.llvm.org/D31052
  llvm-svn: 298010
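  The linear versus constant-time distinction in the entry above can be sketched with a standard-library stand-in. This is not LLVM's `Function` class: `ToyFunction` and `countByWalking` are hypothetical, `std::forward_list` merely plays the role of a linked list that keeps no element count, and only the name `arg_size()` is borrowed from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical stand-in for a function whose arguments live in a linked list.
class ToyFunction {
  std::forward_list<int> Args; // linked list without a stored size
  std::size_t NumArgs = 0;     // count maintained alongside the list

public:
  void addArg(int A) {
    Args.push_front(A);
    ++NumArgs;
  }

  // Linear: walks every node, like asking a count-less list for its size.
  std::size_t countByWalking() const {
    return static_cast<std::size_t>(std::distance(Args.begin(), Args.end()));
  }

  // Constant time: returns the stored count.
  std::size_t arg_size() const { return NumArgs; }

  // Exposes iteration without committing callers to the container type's API.
  const std::forward_list<int> &args() const { return Args; }
};
```

  Both queries return the same number, but only `arg_size()` stays cheap as the list grows, and hiding the list behind accessor functions leaves the storage free to change later.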
* [WebAssembly] Fix some broken type encodings in wasm binary | Derek Schuff | 2017-03-16 | 2 | -3/+11
  A recent change switched the in-memory wasm value types to be signed integers, but I missed a few cases where these were being written to the binary.
  Differential Revision: https://reviews.llvm.org/D31014
  Patch by Sam Clegg
  llvm-svn: 297991
* TargetInstrInfo: Provide default implementation of isTailCall(). | Matthias Braun | 2017-03-16 | 6 | -52/+0
  In fact this default implementation should be the only implementation; keep it virtual for now to accommodate targets that don't model flags correctly.
  Differential Revision: https://reviews.llvm.org/D30747
  llvm-svn: 297980
* [globalisel] Correct G_CONSTANT path of selectArithImmed() | Daniel Sanders | 2017-03-16 | 1 | -1/+4
  Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's what we should check for.
  This fixes a crash introduced in r297782.
  llvm-svn: 297968
* Test commit. | Hiroshi Inoue | 2017-03-16 | 1 | -1/+1
  llvm-svn: 297959
* [AMDGPU] Run always inliner early in opt | Stanislav Mekhanoshin | 2017-03-16 | 1 | -0/+1
  We can mark functions to always inline early in the opt. Since we do not have call support, this early inlining creates opportunities for inter-procedural optimizations which would not occur otherwise.
  Differential Revision: https://reviews.llvm.org/D31016
  llvm-svn: 297958
* [Hexagon] Updating inline saturate lanes for v62 version. | Colin LeMahieu | 2017-03-16 | 1 | -1/+4
  llvm-svn: 297920
* Remove redundant condition (PR32263). NFCI. | Simon Pilgrim | 2017-03-15 | 1 | -1/+1
  llvm-svn: 297915
* AMDGPU: Allow sinking of addressing modes for atomic_inc/dec | Matt Arsenault | 2017-03-15 | 2 | -7/+28
  llvm-svn: 297913
* [X86] Add missing BITREVERSE costs for SSE2 vectors and i8/i16/i32/i64 scalars | Simon Pilgrim | 2017-03-15 | 1 | -0/+19
  Prep work for PR31810
  llvm-svn: 297876
* [GlobalISel][AArch64] Select ADDXri. | Ahmed Bougacha | 2017-03-15 | 1 | -0/+4
  We're now able to select ADDWri thanks to the new complex pattern support. Extend that to ADDXri.
  llvm-svn: 297874