summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* NFC: MergeFunctions return earlyJF Bastien2016-04-121-1/+1
| | | | | | Same effect, easier to read. llvm-svn: 266128
* AMDGPU: add llvm.amdgcn.buffer.load/store intrinsicsNicolai Haehnle2016-04-121-59/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126
* [ThinLTO] Only compute imports for current module in FunctionImport passTeresa Johnson2016-04-122-24/+68
| | | | | | | | | | | | | | | | | | | | | Summary: The function import pass was computing all the imports for all the modules in the index, and only using the imports for the current module. Change this to instead compute only for the given module. This means that the exports list can't be populated, but they weren't being used anyway. Longer term, the linker can collect all the imports and export lists and serialize them out for consumption by the distributed backend processes which use this pass. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18945 llvm-svn: 266125
* NFC: MergeFunctions update more commentsJF Bastien2016-04-121-7/+8
| | | | | | They are wordy. Some words were wrong. llvm-svn: 266124
* Add __atomic_* lowering to AtomicExpandPass.James Y Knight2016-04-123-9/+562
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115
* AMDGPU/SI: Insert wait states required after v_readfirstlane on SITom Stellard2016-04-121-0/+6
| | | | | | | | | | | | | | | | | Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 llvm-svn: 266105
* AMDGPU: Eliminate half of i64 or if one operand is zero_extend from i32Matt Arsenault2016-04-121-0/+30
| | | | | | | | | | This helps clean up some of the mess when expanding unaligned 64-bit loads when changed to be promote to v2i32, and fixes situations where or x, 0 was emitted after splitting 64-bit ors during moveToVALU. I think this could be a generic combine but I'm not sure. llvm-svn: 266104
* [IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU.Davide Italiano2016-04-121-0/+16
| | | | | | | | | | | Add a check to catch violations. ~60 tests were broken and prevented this change to be committed. Adrian and I (thanks Adrian!) went through them in the last week or so updating. The check can be done more efficiently but I'd still like to get this in ASAP to avoid more broken tests to be checked in (if any). PR: 27101 llvm-svn: 266102
* [CodeGen] Remove constant-folding dead code. NFC.Ahmed Bougacha2016-04-121-12/+4
| | | | | | | | | | | This code was specific to vector operations with scalar operands: all the opcodes in FoldValue (via FoldConstantArithmetic) can't match those criteria. Replace it with an assert if that ever changes: at that point, we might need to add back a splat BUILD_VECTOR. llvm-svn: 266100
* Check alloca's special stateJF Bastien2016-04-121-0/+4
| | | | | | Following up to a similar fix in MergeFunctions: r266022. This patch keeps both in sync, it would be nice to not have to do this. It doesn't look like there's an easy way to test this code directly at the moment: AFAICT all currect uses of isSameOperationAs are looking at instructions deep inside a function. IndVarSimplify/pr24952.ll and InstMerge/st_sink_* look at alloca inadvertently but are brittle tests. llvm-svn: 266099
* Introduce an GCRelocateInst class [NFC]Philip Reames2016-04-125-16/+9
| | | | | | Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098
* fix indentation; NFCSanjay Patel2016-04-121-12/+12
| | | | llvm-svn: 266097
* AMDGPU/SI: Fix a mis-compilation of multi-level breaksNicolai Haehnle2016-04-121-0/+16
| | | | | | | | | | | | | | | | | | Summary: Under certain circumstances, multi-level breaks (or what is understood by the control flow passes as such) could be miscompiled in a way that causes infinite loops, by emitting incorrect control flow intrinsics. This fixes a hang in dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.conditional_continue_vertex Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18967 llvm-svn: 266088
* Support arbitrary addrspace pointers in masked load/store intrinsicsArtur Pilipenko2016-04-122-10/+49
| | | | | | | | | | | | | | This is a resubmittion of 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086
* [ScheduleDAGInstrs] Handle instructions with multiple MMOsGeoff Berry2016-04-121-30/+41
| | | | | | | | | | | | | | | | | | | | | Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084
* [mips] add assembler support for .set arch=octeonPetar Jovanovic2016-04-121-1/+1
| | | | | | | | | | | This patch enables assembler support for .set arch=octeon. It will fix issues with inline assembler when this directive is used. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18548 llvm-svn: 266081
* AMDGPU: Implement i64 global atomicsMatt Arsenault2016-04-122-14/+44
| | | | llvm-svn: 266075
* AMDGPU: Add atomic_inc + atomic_dec intrinsicsMatt Arsenault2016-04-128-11/+101
| | | | | | | These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074
* AMDGPU: Remove trailing whitespaceMatt Arsenault2016-04-121-7/+7
| | | | llvm-svn: 266073
* This reverts commit r266002, r266011 and r266016.Rafael Espindola2016-04-123-562/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062
* Revert "[mips] MIPSR6 Compact branch aliases"Simon Dardis2016-04-122-7/+1
| | | | | | | | This reverts commit r266055. ps4-buildslave2 is highlighting a failure. llvm-svn: 266061
* [SystemZ] Use LDE32 instead of LE, when Offset is small.Jonas Paulsson2016-04-121-1/+7
| | | | | | | | | | | | On z13, if eliminateFrameIndex() chooses LE (and not LEY), immediately transform that LE to LDE32 to avoid partial register dependencies. LEY should be generally preferred for big offsets over an expansion into LAY + LDE32. Reviewed by Ulrich Weigand. llvm-svn: 266060
* [mips] MIPSR6 Compact branch aliasesSimon Dardis2016-04-122-1/+7
| | | | | | | | | | | | Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D18856 llvm-svn: 266055
* Avoid GCC -fpermissive error about llvm::Mangler hidden by member named ManglerStephan Bergmann2016-04-121-1/+1
| | | | llvm-svn: 266049
* Refactor the Internalize stage of libLTO in a separate file (NFC)Mehdi Amini2016-04-124-135/+230
| | | | | | | | | | | | | | | | | This is intended to be shared by the ThinLTOCodeGenerator. Note that there is a change in the way the verifier is run, previously it was ran as a Pass on the merged module during internalization. While now the verifier is called explicitely on the merged module outside of the internalize "pass pipeline". What remains strange in the API is the fact that `DisableVerify` in the API does not disable this initial verifier. Differential Revision: http://reviews.llvm.org/D19000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266047
* [PPC64] Mark CR0 Live if PPCInstrInfo::optimizeCompareInstr Creates a Use of CR0Chuang-Yu Cheng2016-04-121-0/+4
| | | | | | | | | | | | | | Resolve Bug 27046 (https://llvm.org/bugs/show_bug.cgi?id=27046). The PPCInstrInfo::optimizeCompareInstr function could create a new use of CR0, even if CR0 were previously dead. This patch marks CR0 live if a use of CR0 is created. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D18884 llvm-svn: 266040
* [PPC64] Use mfocrf in prologue when we only need to save 1 nonvolatile CR fieldChuang-Yu Cheng2016-04-121-8/+16
| | | | | | | | | | | | | | In the ELFv2 ABI, we are not required to save all CR fields. If only one nonvolatile CR field is clobbered, use mfocrf instead of mfcr to selectively save the field, because mfocrf has short latency compares to mfcr. Thanks Nemanja's invaluable hint! Reviewers: nemanjai tjablin hfinkel kbarton http://reviews.llvm.org/D17749 llvm-svn: 266038
* AArch64: Drive-by cleanupMatthias Braun2016-04-121-3/+2
| | | | llvm-svn: 266035
* Attempt to make buildbot happier with r266032.George Burgess IV2016-04-121-2/+1
| | | | | | | Apparently std::numeric_limits<unsigned>::max() isn't constexpr everywhere yet. llvm-svn: 266034
* Add the allocsize attribute to LLVM.George Burgess IV2016-04-1211-65/+321
| | | | | | | | | | | | | | | | `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032
* [RegBankSelect] Teach the repairing code how to handle physicalQuentin Colombet2016-04-121-2/+6
| | | | | | registers. llvm-svn: 266029
* [RegisterBankInfo] Do not provide a default mapping for non-reg of phiQuentin Colombet2016-04-121-0/+7
| | | | | | operations. llvm-svn: 266027
* [RegBankSelect] Teach how to repair definitions.Quentin Colombet2016-04-121-13/+112
| | | | | | | | | Although repairing definitions is not mandatory for correctness (only phis would be impacted because of the RPO traversal), not repairing might go against the cost model. Therefore, just repair when it is possible. llvm-svn: 266025
* MergeFunctions: test alloca betterJF Bastien2016-04-121-9/+6
| | | | | | r237193 fix handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022
* Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionPropertyDerek Schuff2016-04-112-9/+7
| | | | | | | | | | Use the MachineFunctionProperty mechanism to indicate whether the liveness info is accurate instead of a bool flag on MRI. Keeps the MRI accessor function for convenience. NFC Differential Revision: http://reviews.llvm.org/D18767 llvm-svn: 266020
* ThinLTO renaming: use module hash instead of position in the summaryMehdi Amini2016-04-111-1/+1
| | | | | | | | | This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018
* AtomicExpandPass: mark assert variable as usedJF Bastien2016-04-111-0/+3
| | | | | | Avoid -Wunused-variable llvm-svn: 266016
* Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass)James Y Knight2016-04-111-8/+8
| | | | | | | | It doesn't like implicitly calling the ArrayRef constructor with a returned array -- it appears to decays the returned value to a pointer, first, before trying to make an ArrayRef out of it. llvm-svn: 266011
* CodeGen: Fix a use-after-free in TailDuplicationJustin Bogner2016-04-111-2/+0
| | | | | | | | | | The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008
* NFC: keep comment up to dateJF Bastien2016-04-111-4/+4
| | | | | | MergeFunctions was refactored a while ago, and Instruction.cpp's comments went out of sync. The content did as well, will fix later. llvm-svn: 266007
* [safestack] Add canary to unsafe stack framesEvgeniy Stepanov2016-04-113-26/+80
| | | | | | | | Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004
* ARM: use r7 as the frame-pointer on all MachO targets.Tim Northover2016-04-112-13/+10
| | | | | | | | | | | | This is better for a few reasons: + It matches the other tooling for iOS. + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't use ARM mode). + It leads to infinitesimally smaller code (0.2%, yay!). rdar://25369506 llvm-svn: 266003
* Add __atomic_* lowering to AtomicExpandPass.James Y Knight2016-04-113-9/+559
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002
* [DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) ↵Simon Pilgrim2016-04-111-2/+2
| | | | | | | | | | | | | | anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well. The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998
* Swift Calling Convention: swifterror target support.Manman Ren2016-04-1114-7/+219
| | | | | | Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997
* Revert "AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute"Tom Stellard2016-04-111-1/+1
| | | | | | | | This reverts commit r263720. Just confirmed that s_waitcnt is required after ds_permute/ds_bpermute. llvm-svn: 265992
* Fix broken assert, PR24624Hans Wennborg2016-04-111-1/+1
| | | | llvm-svn: 265989
* Remove redundant .c_str(), as suggested by PR25633Hans Wennborg2016-04-111-1/+1
| | | | llvm-svn: 265988
* Fix a couple of redundant conditional expressions (PR27283, PR28282)Hans Wennborg2016-04-112-3/+3
| | | | llvm-svn: 265987
* use range-loops; NFCISanjay Patel2016-04-111-13/+8
| | | | llvm-svn: 265985
OpenPOWER on IntegriCloud