summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Turn off lldb debug tuning by default for FreeBSDDimitry Andric2016-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | Summary: In rL242338, debugger tuning was introduced, and the tuning for FreeBSD was set to lldb by default. However, for the foreseeable future we still need to default to gdb tuning, since lldb is not ready for all of FreeBSD's architectures, and some system tools (like objcopy, etc) have not yet been adapted to cope with the lldb tuned format, which has .apple sections. Therefore, let FreeBSD use gdb by default for now. Reviewers: emaste, probinson Subscribers: llvm-commits, emaste Differential Revision: http://reviews.llvm.org/D15966 llvm-svn: 257103
* [SCCP] Don't violate the lattice invariantsDavid Majnemer2016-01-071-15/+42
| | | | | | | | | | We marked values which are 'undef' as constant instead of undefined which violates SCCP's invariants. If we can figure out that a computation results in 'undef', leave it in the undefined state. This fixes PR16052. llvm-svn: 257102
* WebAssembly: update expected failures, more assert got resolved.JF Bastien2016-01-071-2/+0
| | | | llvm-svn: 257098
* Fix crash when printing instructions that have a metadata attached but no ↵Mehdi Amini2016-01-071-1/+1
| | | | | | | | | | | | | | | | | | parent. Fix PR24852 (crash with -debug -instcombine) Patch by Than McIntosh <thanm@google.com> Summary: Add guards to the asm writer to prevent crashing when dumping an instruction that has no basic block. Differential Revision: http://reviews.llvm.org/D15798 From: Than McIntosh <thanm@google.com> llvm-svn: 257094
* WebAssembly: update expected failures, assert got resolved by r257084.JF Bastien2016-01-071-18/+0
| | | | llvm-svn: 257093
* [PGO] Simplify coverage mapping loweringXinliang David Li2016-01-071-23/+11
| | | | | | | | | | | | | | | | Coverage mapping data may reference names of functions that are skipped by FE (e.g, unused inline functions). Since those functions are skipped, normal instr-prof function lowering pass won't put those names in the right section, so special handling is needed to walk through coverage mapping structure and recollect the references. With this patch, only names that are skipped are processed. This simplifies the lowering code and it no longer needs to make assumptions coverage mapping data layout. It should also be more efficient. llvm-svn: 257091
* Remove junk accidentally commited with r257087David Majnemer2016-01-071-1/+1
| | | | llvm-svn: 257089
* [SCCP] Can't go from overdefined to constantDavid Majnemer2016-01-071-3/+3
| | | | | | | | The fix for PR23999 made us mark loads of null as producing the constant undef which upsets the lattice. Instead, keep the load as "undefined". This fixes PR26044. llvm-svn: 257087
* [WebAssembly] Support combining GEP and FrameIndex offsets in memory operand ↵Derek Schuff2016-01-071-6/+12
| | | | | | | | | | | offset field Previously we only supported putting the FI into memory operand offset fields if there was nothing there already. Now combine them. Differential Revision: http://reviews.llvm.org/D15941 llvm-svn: 257084
* [WebAssembly] Use the default private label prefixes.Dan Gohman2016-01-071-3/+0
| | | | | | | | | The MC assembler doesn't like using the empty string as a private label prefix because then it treats all labels as private. This commit reverts back to the default prefix, which is .L, which is common in ELF targets and consistent with the LLVM name mangler. llvm-svn: 257083
* AMDGPU/SI: Fold operands with sub-registersNicolai Haehnle2016-01-074-9/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 llvm-svn: 257074
* AMDGPU/SI: xnack_mask is always reserved on VINicolai Haehnle2016-01-072-36/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Somehow, I first interpreted the docs as saying space for xnack_mask is only reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and went back to actually test what is happening, and it turns out that xnack_mask is always reserved at least on Tonga and Carrizo, in the sense that flat_scr is always fixed below the SGPRs that are used to implement xnack_mask, whether or not they are actually used. I confirmed this by writing a shader using inline assembly to tease out the aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so xnack_mask is s[76:77] and vcc is s[78:79]). This patch changes both the calculation of the total number of SGPRs and the various register reservations to account for this. It ought to be possible to use the gap left by xnack_mask when the feature isn't used, but this patch doesn't try to do that. (Note that the same applies to vcc.) Note that previously, even before my earlier change in r256794, the SGPRs that alias to xnack_mask could end up being used as well when flat_scr was unused and the total number of SGPRs happened to fall on the right alignment (e.g. highest regular SGPR being used s29 and VCC used would lead to number of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there were some conflict due to such aliasing, we should have noticed that already. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15898 llvm-svn: 257073
* [AVX512] add PSLLW and PSLLV IntrinsicMichael Zuckerman2016-01-071-0/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D15889 llvm-svn: 257070
* Revert r257064. It caused failures in some sanitizer tests.Silviu Baranga2016-01-071-322/+3
| | | | llvm-svn: 257069
* Fix build after r257064: we should be returning false, not nullptrSilviu Baranga2016-01-071-2/+2
| | | | llvm-svn: 257067
* Revert r257055, it caused PR26064.Nico Weber2016-01-071-7/+2
| | | | llvm-svn: 257066
* [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose ↵Silviu Baranga2016-01-071-3/+322
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | more constants when comparing GEPs Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparisson entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257064
* [AVX512] add PSRAV IntrinsicMichael Zuckerman2016-01-071-0/+7
| | | | | | Differential Revision: http://reviews.llvm.org/D15856 llvm-svn: 257063
* Added support for macro emission in dwarf (supporting DWARF version 4).Amjad Aboud2016-01-077-4/+140
| | | | | | Differential Revision: http://reviews.llvm.org/D15495 llvm-svn: 257060
* [GlobalsAA] Partially back out r248576James Molloy2016-01-071-15/+0
| | | | | | | | | | | | | | | | | | | See PR25822 for a more full summary, but we were conflating the concepts of "capture" and "escape". We were proving nocapture and using that proof to infer noescape, which is not true. Escaped-ness is a function-local property - as soon as a value is used in a call argument it escapes. Capturedness is a related but distinct property. It implies a *temporally limited* escape. Consider: static int a; int b; int g(int * nocapture arg); int f() { a = 2; // Even though a escapes to g, it is not captured so can be treated as non-escaping here. g(&a); // But here it must be treated as escaping. g(&b); // Now that g(&a) has returned we know it was not captured so we can treat it as non-escaping again. } The original commit did not sufficiently understand this nuance and so caused PR25822 and PR26046. r248576 included both a performance improvement (which has been backed out) and a related conformance fix (which has been kept along with its testcase). llvm-svn: 257058
* [AVX512] add PSHUFHW and PSHUFLW Intrinsic Michael Zuckerman2016-01-071-0/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D15925 llvm-svn: 257056
* [X86][AVX] Match broadcast loads through a bitcastSimon Pilgrim2016-01-071-2/+7
| | | | | | | | AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through bitcasts to check for a load node to allow broadcasts to occur. Follow up to D15310 llvm-svn: 257055
* Added AVRTargetObjectFile class and AVR.hDylan McKay2016-01-074-0/+130
| | | | llvm-svn: 257049
* Mark arm as the 32bit variant of aarch64 in TripleTamas Berghammer2016-01-071-28/+28
| | | | | | | | | | Change Triple::get32BitArchVariant to return arm/armeb as the 32bit variant of aarch64/aarch64_be and do the same change for the oppoiste direction in Triple::get64BitArchVariant. Differential revision: http://reviews.llvm.org/D15529 llvm-svn: 257048
* Remove extra whitespace. NFC.Junmo Park2016-01-071-1/+1
| | | | llvm-svn: 257047
* [X86][SSE} Add INSERTPS as a target shuffleSimon Pilgrim2016-01-071-3/+17
| | | | | | Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS. llvm-svn: 257046
* [AVX512] add PSHUFD IntrinsicMichael Zuckerman2016-01-071-0/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D15934 llvm-svn: 257044
* ARM: support TLS accesses on Darwin platformsTim Northover2016-01-0710-8/+132
| | | | | | | | Darwin TLS accesses most closely resemble ELF's general-dynamic situation, since they have to be able to handle all possible situations. The descriptors and so on are obviously slightly different though. llvm-svn: 257039
* [SystemZ] Add hasSideEffects flag on Serialize instruction.Jonas Paulsson2016-01-071-0/+3
| | | | | | | | | | Serialize will perform a hardware serialization operation, and is acting as a memory barrier. Therefore it must have the hasSideEffects flag set so it will be treated as a global memory object. Reviewed by Ulrich Weigand llvm-svn: 257036
* [X86] Remove superfluous mayLoad flag. The pattern already implies it.Craig Topper2016-01-071-3/+1
| | | | llvm-svn: 257035
* [X86] Had hasSideEffects=0 to VBROADCASTI128.Craig Topper2016-01-071-1/+1
| | | | llvm-svn: 257034
* [X86] Add OpSize32 to MOVSX32_NOREX instructions to match their other versions.Craig Topper2016-01-071-4/+4
| | | | llvm-svn: 257033
* [X86] Add hasSideEffects=0 and mayLoad=1 to MOVZX64* instructions. While ↵Craig Topper2016-01-072-14/+18
| | | | | | there remove a superfluous _Q from the instruction names. llvm-svn: 257032
* [X86] STOSQ without a rep prefix doesn't read or write RCX.Craig Topper2016-01-071-1/+1
| | | | llvm-svn: 257030
* Undo spurious change made in r256965David Majnemer2016-01-071-2/+1
| | | | llvm-svn: 257028
* [Statepoints] Add test cases around vectors and stablize testPhilip Reames2016-01-071-1/+3
| | | | | | | | | | Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them. Add a couple of tests which show this working. Also, add a triple to the same test file to hopefully fix a failing bot. It turns out we do han llvm-svn: 257025
* [AArch64 MachineCombine] Enhance/Add support for general reassociation to ↵Haicheng Wu2016-01-072-11/+50
| | | | | | | | reduce the critical path Allow fadd/fmul to be reassociated in aarch64. llvm-svn: 257024
* [Statepoints] Initial support for relocating vectors of pointersPhilip Reames2016-01-072-15/+21
| | | | | | | | | | | | Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong. The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate. Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases. Differential Revision: http://reviews.llvm.org/D15632 llvm-svn: 257022
* [WebAssembly] Add -m:e to the target triple.Dan Gohman2016-01-071-2/+3
| | | | | | | This enables ELF-style name mangling, which primarily means using ".L" for private symbols. llvm-svn: 257020
* [Linker] Also treat a DIImportedEntity scope DISubprogram as needed.Ahmed Bougacha2016-01-071-1/+3
| | | | | | | | | Follow-up to r257000: DIImportedEntity can reach a DISubprogram via its entity, but also via its scope. Handle the latter case as well. PR26037. llvm-svn: 257019
* [RS4GC] Add an option to suppress vector splittingPhilip Reames2016-01-071-9/+20
| | | | | | At the moment, this is essentially a diangostic option so that I can start collecting failing test cases, but we will eventually migrate to removing the vector splitting code entirely. llvm-svn: 257015
* [libFuzzer] add a position hint to the dictionary-based mutatorKostya Serebryany2016-01-076-33/+100
| | | | llvm-svn: 257013
* [ShrinkWrapping] Give up on irreducible CFGs.Quentin Colombet2016-01-071-0/+47
| | | | | | | | | | | We need to know whether or not a given basic block is in a loop for the analysis to be correct. Loop information may be incomplete on irreducible CFGs, therefore we may generate incorrect code if we use it in those situations. This fixes PR25988. llvm-svn: 257012
* Always treat DISubprogram reached by DIImportedEntity as needed.Teresa Johnson2016-01-071-1/+11
| | | | | | | | | | It is illegal to have a null entity in a DIImportedEntity, so we must link in a DISubprogram metadata node referenced by one, even if the associated function is not linked in or inlined anywhere. Fixes PR26037. llvm-svn: 257000
* Fix PR26051: Memcpy optimization should introduce a call to memcpy before ↵Mehdi Amini2016-01-061-2/+4
| | | | | | | | | | the store destination position This is a conservative fix, I expect Amaury to relax this. Follow-up for r256923 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 256999
* rangify; NFCISanjay Patel2016-01-061-56/+31
| | | | llvm-svn: 256998
* [X86] Determine if target shuffle can contain zero elementsSimon Pilgrim2016-01-061-15/+17
| | | | | | | | | | | | | | getTargetShuffleMask may return shuffle masks with SM_SentinelZero (-2) values (currently just for PSHUFB but VPERM2X128 as well with this patch). Although some calling functions can make use of this (mainly for shuffle combining), others can not and their inclusion makes shuffle mask comparisons more difficult. This patch adds a flag to getTargetShuffleMask to indicate if the calling function can't handle SM_SentinelZero; getTargetShuffleMask will then return false if it occurs to make handling much easier. I've tidied up some uses of getTargetShuffleMask to better indicate what is going on - more could be done but at present I don't have test cases to demonstrate it. Some upcoming patches will make use of this to both support more uses where SM_SentinelZero is not permitted (e.g. combineShuffleToAddSub), and also will allow us to add INSERTPS support to getTargetShuffleMask as part of better zero handling discussed in D14261. Differential Revision: http://reviews.llvm.org/D15378 llvm-svn: 256992
* Recommit r256952 "Filtering IR printing for print-after-all/print-before-all"Weiming Zhao2016-01-065-7/+34
| | | | | | | | Fix lit test fail due to outputting an extra line. Differential Revision: http://reviews.llvm.org/D15776 llvm-svn: 256987
* Bitcode: Fix reading and writing of ConstantDataVectors of halfsJustin Bogner2016-01-062-23/+15
| | | | | | | | | | | | In r254991 I allowed ConstantDataVectors to contain elements of HalfTy, but I missed updating the bitcode reader and writer to handle this, so now we crash if we try to emit bitcode on programs that have constant vectors of half. This fixes the issue and adds test coverage for reading and writing constant sequences in bitcode. llvm-svn: 256982
* AMDGPU/SI: Fix crash when inline assembly is used in a graphics shaderNicolai Haehnle2016-01-061-0/+3
| | | | | | | | | | | | | | | Summary: This is admittedly something that you could only run into by manually playing around with shader assembly because the SITypeWriter pass is skipped for compute. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15902 llvm-svn: 256980
OpenPOWER on IntegriCloud