summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU] Assembler: better support for immediate literals in assembler.Sam Kolton2016-09-0914-351/+708
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least. With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction). Here are rules how we convert literals: - We parsed fp literal: - Instruction expects 64-bit operand: - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5) - then we do nothing this literal - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5) - report error - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant - If instruction expect fp operand type (f64) - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5) - If so then do nothing - Else (e.g. v_fract_f64 v[0:1], 3.1415) - report warning that low 32 bits will be set to zeroes and precision will be lost - set low 32 bits of literal to zeroes - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5) - report error as it is unclear how to encode this literal - Instruction expects 32-bit operand: - Convert parsed 64 bit fp literal to 32 bit fp. Allow lose of precision but not overflow or underflow - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5) - do nothing - Else report error - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0) - Parsed binary literal: - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35) - do nothing - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35) - report error - Else, literal is not-inlinable and we are not required to inline it - Are high 32 bit of literal zeroes or same as sign bit (32 bit) - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef) - Else - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0) For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types: ''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } ''' This is not working yet: - Several tests are failing - Problems with predicate methods for inline immediates - LLVM generated assembler parts try to select e64 encoding before e32. More changes are required for several AsmOperands. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050
* [Sparc][LEON] Removed the parts of the errata fixes implemented using inline ↵Chris Dewhurst2016-09-091-76/+0
| | | | | | assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly. llvm-svn: 281047
* [ARM] ADD with a negative offset can become SUB for freeJames Molloy2016-09-091-0/+4
| | | | | | So model that directly in TTI::getIntImmCost(). llvm-svn: 281044
* [ARM] icmp %x, -C can be lowered to a simple ADDS or CMNJames Molloy2016-09-091-0/+11
| | | | | | Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043
* [SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar typeSimon Pilgrim2016-09-092-3/+3
| | | | | | Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
* [Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0)James Molloy2016-09-091-0/+42
| | | | | | The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040
* GlobalISel: remove G_TYPE and G_PHITim Northover2016-09-095-20/+3
| | | | | | | | These instructions were only necessary when type information was stored in the MachineInstr (because only generic MachineInstrs possessed a type). Now that it's in MachineRegisterInfo, COPY and PHI work fine. llvm-svn: 281037
* GlobalISel: fix comments and add assertions for valid instructions.Tim Northover2016-09-091-4/+88
| | | | llvm-svn: 281036
* GlobalISel: move type information to MachineRegisterInfo.Tim Northover2016-09-0917-383/+275
| | | | | | | | | | | | | | | | | We want each register to have a canonical type, which means the best place to store this is in MachineRegisterInfo rather than on every MachineInstr that happens to use or define that register. Most changes following from this are pretty simple (you need an MRI anyway if you're going to be doing any transformations, so just check the type there). But legalization doesn't really want to check redundant operands (when, for example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's operand type field to encode these constraints and limit legalization's work. As an added bonus, more validation is possible, both in MachineVerifier and MachineIRBuilder (coming soon). llvm-svn: 281035
* Revert "[mips] Fix c.<cc>.<fmt> instruction definition."Simon Dardis2016-09-0915-539/+209
| | | | | | | This reverts commit r281022. Mips buildbot broke, due to unhandled register class FCC. llvm-svn: 281033
* [AMDGPU] Assembler: rename amd_kernel_code_t asm names according to specSam Kolton2016-09-093-242/+85
| | | | | | | | | | | | | | Summary: Also removed duplicate code from AMDGPUTargetAsmStreamer. This change only change how amd_kernel_code_t is parsed and printed. No variable names are changed. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D24296 llvm-svn: 281028
* [Thumb1] Teach optimizeCompareInstr about thumb1 comparesJames Molloy2016-09-091-4/+21
| | | | | | | | This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean. llvm-svn: 281027
* [AMDGPU] Assembler: match e32 VOP instructions before e64.Sam Kolton2016-09-097-32/+126
| | | | | | | | | | | | | | | | | | | Summary: Split assembler match table in 4 tables with assembler variants: Default - all instructions except VOP3, SDWA and DPP - VOP3 - SDWA - DPP First match Default table then VOP3, SDWA and DPP. Reviewers: tstellarAMD, artem.tamazov, vpykhtin Subscribers: arsenm, wdng, nhaehnle, AMDGPU Differential Revision: https://reviews.llvm.org/D24252 llvm-svn: 281023
* [mips] Fix c.<cc>.<fmt> instruction definition.Simon Dardis2016-09-0915-209/+539
| | | | | | | | | | | | | | | As part of this effort, remove MipsFCmp nodes and use tablegen patterns rather than custom lowering through C++. Unexpectedly, this improves codesize for microMIPS as previous floating point setcc expansions would materialize 0 and 1 into GPRs before using the relevant mov[tf].[sd] instruction. Now $zero is used directly. Reviewers: dsanders, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23118 llvm-svn: 281022
* [Coroutines] Part13: Handle single edge PHINodes across suspendsGor Nishanov2016-09-093-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If one of the uses of the value is a single edge PHINode, handle it. Original: %val = something <suspend> %p = PHINode [%val] After Spill + Part13: %val = something %slot = gep val.spill.slot store %val, %slot <suspend> %p = load %slot Plus tiny fixes/changes: * use correct index for coro.free in CoroCleanup * fixup id parameter in coro.free to allow authoring coroutine in plain C with __builtins Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24242 llvm-svn: 281020
* Rationalise the attribute getter/setter methods on Function and CallSite.Amaury Sechet2016-09-092-40/+4
| | | | | | | | | | | | | | | | | | | Summary: While woring on mapping attributes in the C API, it clearly appeared that the recent changes in the API on the C++ side left Function and Call/Invoke with an attribute API that grew in an ad hoc manner. This makes it difficult to work with it, because one doesn't know which overloads exists and which do not. Make sure that getter/setter function exists for both enum and string version. Remove inconsistent getter/setter, unless they have many callsites. This should make it easier to work with attributes in the future. This doesn't change how attribute works. Reviewers: bkramer, whitequark, mehdi_amini, void Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21514 llvm-svn: 281019
* [libFuzzer] improve -print_pcs to not print new PCs coming from libFuzzer itselfKostya Serebryany2016-09-092-8/+19
| | | | llvm-svn: 281016
* [libFuzzer] remove unneeded callKostya Serebryany2016-09-092-9/+0
| | | | llvm-svn: 281014
* [AVX-512] Add VPCMP instructions to the load folding tables and make them ↵Craig Topper2016-09-092-1/+57
| | | | | | commutable. llvm-svn: 281013
* [libFuzzer] remove use_traces=1 since use_value_profile seems to be strictly ↵Kostya Serebryany2016-09-096-67/+9
| | | | | | better llvm-svn: 281007
* [X86] Tighten up a comment which confused x64 ABI terminology.David Majnemer2016-09-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | The x64 ABI has two major function types: - frame functions - leaf functions A frame function is one which requires a stack frame. A leaf function is one which does not. A frame function may or may not have a frame pointer. A leaf function does not require a stack frame and may never modify SP except via a return (RET, tail call via JMP). A frame function which has a frame pointer is permitted to use the LEA instruction in the epilogue, a frame function without which doesn't establish a frame pointer must use ADD to adjust the stack pointer epilogue. Fun fact: Leaf functions don't require a function table entry (associated PDATA/XDATA). llvm-svn: 281006
* Win64: Don't use REX prefix for direct tail callsHans Wennborg2016-09-085-8/+4
| | | | | | | | | | The REX prefix should be used on indirect jmps, but not direct ones. For direct jumps, the unwinder looks at the offset to determine if it's inside the current function. Differential Revision: https://reviews.llvm.org/D24359 llvm-svn: 281003
* Remove debug info when hoisting instruction from then/else branch.Dehao Chen2016-09-081-0/+8
| | | | | | | | | | | | Summary: The hoisted instruction is executed speculatively. It could affect the debugging experience as user would see gdb go into code that may not be expected to execute. It will also affect sample profile accuracy by assigning incorrect frequency to source within then/else branch. Reviewers: davidxl, dblaikie, chandlerc, kcc, echristo Subscribers: mehdi_amini, probinson, eric_niebler, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24164 llvm-svn: 280995
* [LV] Ensure proper handling of multi-use case when collecting uniformsMatthew Simpson2016-09-081-5/+5
| | | | | | | | | | | The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access. llvm-svn: 280992
* [RDF] Further improve handling of multiple phis reached from shadowsKrzysztof Parzyszek2016-09-081-31/+16
| | | | llvm-svn: 280987
* [LV] Don't mark pointers used by scalarized memory accesses uniformMatthew Simpson2016-09-081-42/+143
| | | | | | | | | | | | | | | | | | Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making it's pointer operand non-uniform. This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis. Differential Revision: https://reviews.llvm.org/D24271 llvm-svn: 280979
* [YAMLIO] Add the ability to map with context.Zachary Turner2016-09-081-1/+2
| | | | | | | | | | | | | | | | | | | | | mapping a yaml field to an object in code has always been a stateless operation. You could still pass state by using the `setContext` function of the YAMLIO object, but this represented global state for the entire yaml input. In order to have context-sensitive state, it is necessary to pass this state in at the granularity of an individual mapping. This patch adds support for this type of context-sensitive state. You simply pass an additional argument of type T to the `mapRequired` or `mapOptional` functions, and provided you have specialized a `MappingContextTraits<U, T>` class with the appropriate mapping function, you can pass this context into the mapping function. Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D24162 llvm-svn: 280977
* AMDGPU: Sign extend constants when splitting themMatt Arsenault2016-09-081-3/+2
| | | | | | | This will confuse later passes which try to look at the immediate value and don't truncate first. llvm-svn: 280974
* [Hexagon] Expand sext- and zextloads of vector types, not just extloadsKrzysztof Parzyszek2016-09-081-1/+5
| | | | | | Recent change exposed this issue, breaking the Hexagon buildbots. llvm-svn: 280973
* AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32Matt Arsenault2016-09-081-9/+14
| | | | llvm-svn: 280972
* AArch64 .arch directive - Include default arch attributes with extensions.Eric Christopher2016-09-081-3/+39
| | | | | | | | Fix the .arch asm parser to use the full set of features for the architecture and any extensions on the command line. Add and update testcases accordingly as well as add an extension that was used but not supported. llvm-svn: 280971
* AMDGPU: Support commuting with immediate in src0Matt Arsenault2016-09-082-98/+81
| | | | llvm-svn: 280970
* Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"Renato Golin2016-09-0811-213/+61
| | | | | | | | | | And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967
* [LoopDataPrefetch] Use range based for loop; NFCIBalaram Makam2016-09-081-17/+12
| | | | | | | Switch to range based for loop. No functional change, but more readable code. llvm-svn: 280966
* [InstCombine] return a vector-safe true/false constantSanjay Patel2016-09-081-2/+2
| | | | | | | | | | | I introduced this potential bug by missing this diff in: https://reviews.llvm.org/rL280873 ...however, I'm not sure how to reach this code path with a regression test. We may be able to remove this code and assume that the transform to a constant is always handled by InstSimplify? llvm-svn: 280964
* revert r280427Dehao Chen2016-09-082-6/+4
| | | | | | | Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself. Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking. llvm-svn: 280949
* [ARM XRay] Try to fix Thumb-only failureRenato Golin2016-09-081-1/+1
| | | | | | | I mised the check that it had to support ARM to work. This commit tries to fix that, to make sure we don't emit ARM code in Thumb-only mode. llvm-svn: 280935
* [SDAGBuilder] Don't create a binary tree for switches in minsize modeJames Molloy2016-09-081-1/+2
| | | | | | This bloats codesize - all of the non-leaf nodes are extra code. llvm-svn: 280932
* [Thumb1] AND with a constant operand can be converted into BICJames Molloy2016-09-081-0/+4
| | | | | | | So model the cost of materializing the constant operand C as the minimum of C and ~C. llvm-svn: 280929
* [Thumb1] Fix cost calculation for complemented immediatesJames Molloy2016-09-081-1/+1
| | | | | | | | | | | | Materializing something like "-3" can be done as 2 instructions: MOV r0, #3 MVN r0, r0 This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256. There were no tests failing after changing this... :/ llvm-svn: 280928
* [SelectionDAG] Add BUILD_VECTOR support to computeKnownBits and ↵Simon Pilgrim2016-09-082-0/+47
| | | | | | | | | | | | SimplifyDemandedBits Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element. This is an initial step towards determining the sign bits of a vector (PR29079). Differential Revision: https://reviews.llvm.org/D24253 llvm-svn: 280927
* [DAGCombiner] Enable AND combines of splatted constant vectorsSimon Pilgrim2016-09-081-7/+7
| | | | | | | | Allow AND combines to use a vector splatted constant as well as a constant scalar. Preliminary part of D24253. llvm-svn: 280926
* Revert "[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)"Pablo Barrio2016-09-081-18/+1
| | | | | | | | | | This reverts commit r280808. It is possible that this change results in an infinite loop. This is causing timeouts in some tests on ARM, and a Chromebook bot is failing. llvm-svn: 280918
* [mips][microMIPS] Implement DBITSWAP, DLSA and LWUPC and add tests for AUI ↵Hrvoje Varga2016-09-088-12/+126
| | | | | | | | instructions Differential Revision: https://reviews.llvm.org/D16452 llvm-svn: 280909
* [asan] Avoid lifetime analysis for allocas with can be in ambiguous stateVitaly Buka2016-09-081-0/+75
| | | | | | | | | | | | | | | | | | Summary: C allows to jump over variables declaration so lifetime.start can be avoid before variable usage. To avoid false-positives on such rare cases we detect them and remove from lifetime analysis. PR27453 PR28267 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24321 llvm-svn: 280907
* Revert "[LoopUnroll] Properly update loop-info when cloning prologues and ↵Michael Zolotukhin2016-09-081-54/+11
| | | | | | | | | | epilogues." This reverts commit r280901. This caused a bunch of failures, reverting it until I investigate them. llvm-svn: 280905
* [LoopUnroll] Properly update loop-info when cloning prologues and epilogues.Michael Zolotukhin2016-09-081-11/+54
| | | | | | | | | | | | | | | | | | | Summary: When cloning blocks for prologue/epilogue we need to replicate the loop structure from the original loop. It wasn't a problem for the innermost loops, but it led to an incorrect loop info when we unrolled a loop with a child loop - in this case created prologue-loop had a child loop, but loop info didn't reflect that. This fixes PR28888. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas Differential Revision: https://reviews.llvm.org/D24203 llvm-svn: 280901
* [CGP] Be less conservative about tail-duplicating a ret to allow tail callsMichael Kuperstein2016-09-082-26/+34
| | | | | | | | | | | | | | | | | CGP tail-duplicates rets into blocks that end with a call that feed the ret. This puts the call in tail position, potentially allowing the DAG builder to lower it as a tail call. To avoid tail duplication in cases where we won't form the tail call, CGP tried to predict whether this is going to be possible, and avoids doing it when lowering as a tail call will definitely fail. However, it was being too conservative by always throwing away calls to functions with a signext/zeroext attribute on the return type. Instead, we can use the same logic the builder uses to determine whether the attributes work out. Differential Revision: https://reviews.llvm.org/D24315 llvm-svn: 280894
* [XRay] Remove unused variableDean Michael Berris2016-09-081-2/+2
| | | | llvm-svn: 280891
* [XRay] ARM 32-bit no-Thumb support in LLVMDean Michael Berris2016-09-0811-61/+213
| | | | | | | | | | | | This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: 1. https://reviews.llvm.org/D23932 (Clang test) 2. https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 280888
OpenPOWER on IntegriCloud