summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Enable the X86 call frame optimization for the 64-bit targets that allow it.David L Kreitzer2016-05-022-16/+36
| | | | | | | | Fixes PR27241. Differential Revision: http://reviews.llvm.org/D19688 llvm-svn: 268227
* Expose a getFullName for thin archive members.Rafael Espindola2016-05-021-10/+18
| | | | | | It will be used in lld. llvm-svn: 268226
* [SystemZ] Fix in restoreCalleeSavedRegisters()Jonas Paulsson2016-05-021-1/+2
| | | | | | | | Only add operands for GRs to the LMG. Reviewed by Ulrich Weigand. llvm-svn: 268216
* [SystemZ] Mark CC defs as dead whenever possible.Jonas Paulsson2016-05-023-5/+25
| | | | | | | | | | | | | | Marking implicit CC defs as dead everywhere except when CC is actually defined and used explicitly, is important since the post-ra scheduler will otherwise insert edges between instructions unnecessarily. Also temporarily disable LA(Y)-> AGSI optimization in foldMemoryOperandImpl(), since this inroduces a def of the CC reg, which is illegal unless it is known to be dead. Reviewed by Ulrich Weigand. llvm-svn: 268215
* [X86] Fix a bug in LOCK arithmetic operation pattern matching where the ↵Craig Topper2016-05-021-1/+1
| | | | | | | | wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates. This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first. llvm-svn: 268212
* Fix grammar and correct comment - the debug information wasn't incorrect, ↵Eric Christopher2016-05-021-2/+2
| | | | | | rather suboptimal. llvm-svn: 268211
* [CodeGen] Add OPC_MoveChild0-OPC_MoveChild7 opcodes to isel matching tables ↵Craig Topper2016-05-021-0/+12
| | | | | | to optimize table size. Shaves about 12K off the X86 matcher table. llvm-svn: 268209
* [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to ↵Simon Pilgrim2016-05-011-8/+13
| | | | | | accept UNDEF elements. llvm-svn: 268206
* [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept ↵Simon Pilgrim2016-05-011-20/+27
| | | | | | UNDEF elements. llvm-svn: 268204
* [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept ↵Simon Pilgrim2016-05-011-16/+17
| | | | | | UNDEF elements. llvm-svn: 268202
* [AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While ↵Craig Topper2016-05-011-4/+4
| | | | | | there fix the execution domain for VPACKSSDW/VPACKUSDW. llvm-svn: 268200
* [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to ↵Simon Pilgrim2016-05-011-0/+37
| | | | | | shufflevector. llvm-svn: 268199
* Fixed MSVC 'not all control paths return a value' warningSimon Pilgrim2016-05-011-0/+1
| | | | llvm-svn: 268198
* getelementptr instruction, support index vector of EVT.Igor Breger2016-05-011-1/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195
* Change AVX512 braodcastsd/ss patterns interaction with spilling . New ↵Igor Breger2016-05-013-110/+98
| | | | | | | | implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190
* [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when ↵Craig Topper2016-05-011-3/+3
| | | | | | VLX and BWI are supported. llvm-svn: 268189
* [AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB ↵Craig Topper2016-05-011-13/+14
| | | | | | and VPMADDUBSW/VPMADDWD. llvm-svn: 268188
* [AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also ↵Craig Topper2016-05-011-16/+16
| | | | | | marked as requiring VLX. llvm-svn: 268186
* [X86] Add an AddedComplexity to another pattern to put it near similar in ↵Craig Topper2016-05-011-2/+1
| | | | | | the output file. llvm-svn: 268184
* [X86] Remove a seemlingly unused pattern. The same pattern appears elsewhere ↵Craig Topper2016-05-011-2/+0
| | | | | | with an AddedComplexity that made this unreachable. llvm-svn: 268183
* [X86] Add AddedComplexity to keep some similar patterns near each other in ↵Craig Topper2016-05-011-0/+1
| | | | | | the output file. llvm-svn: 268181
* [X86] Remove some redundant selection patterns.Craig Topper2016-05-012-11/+0
| | | | llvm-svn: 268180
* [AVX512] Replace vector_extract with extractelt in some patterns. They mean ↵Craig Topper2016-05-011-5/+5
| | | | | | the same thing but vector_extract is deprecated. NFC llvm-svn: 268179
* [SCEV] When printing via -analysis, dump loop dispositionSanjoy Das2016-05-011-0/+25
| | | | | | | | | | | There are currently some bugs in tree around SCEV caching an incorrect loop disposition. Printing out loop dispositions will let us write whitebox tests as those are fixed. The dispositions are printed as a list in "inside out" order, i.e. innermost loop first. llvm-svn: 268177
* Properly name LLVMSetIsInBounds's argument. NFCAmaury Sechet2016-05-011-2/+2
| | | | llvm-svn: 268176
* [AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.Craig Topper2016-05-011-4/+7
| | | | llvm-svn: 268174
* [ORC] Save AArch64 NEON state in the JIT reentry block.Lang Hames2016-05-011-42/+74
| | | | | | | The earlier version of the resolver code did not save NEON state, so it would have broken any callees that used floating point. llvm-svn: 268173
* CodeGen: convert to range based loopsSaleem Abdulrasool2016-04-301-36/+20
| | | | | | | Convert to using some range based loops, avoid unnecessary variables for unchecked casts. NFC. llvm-svn: 268165
* [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.Craig Topper2016-04-302-13/+9
| | | | llvm-svn: 268164
* Add missing override.Rafael Espindola2016-04-301-1/+2
| | | | llvm-svn: 268163
* [ASan] Add shadow offset for SystemZ.Marcin Koscielnicki2016-04-301-2/+8
| | | | | | | | | | | | | | | | | | | | | | SystemZ on Linux currently has 53-bit address space. In theory, the hardware could support a full 64-bit address space, but that's not supported due to kernel limitations (it'd require 5-level page tables), and there are no plans for that. The default process layout stays within first 4TB of address space (to avoid creating 4-level page tables), so any offset >= (1 << 42) is fine. Let's use 1 << 52 here, ie. exactly half the address space. I've originally used 7 << 50 (uses top 1/8th of the address space), but ASan runtime assumes there's some space after the shadow area. While this is fixable, it's simpler to avoid the issue entirely. Also, I've originally wanted to have the shadow aligned to 1/8th the address space, so that we can use OR like X86 to assemble the offset. I no longer think it's a good idea, since using ADD enables us to load the constant just once and use it with register + register indexed addressing. Differential Revision: http://reviews.llvm.org/D19650 llvm-svn: 268161
* [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate ↵Simon Pilgrim2016-04-301-18/+20
| | | | | | | | elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268158
* AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaitsTom Stellard2016-04-301-6/+0
| | | | | | This was supposed to be part of r268143. llvm-svn: 268154
* [PowerPC/QPX] Fix the load/splat peephole with overlapping readsHal Finkel2016-04-301-1/+9
| | | | | | | | | | | If, in between the splat and the load (which does an implicit splat), there is a read of the splat register, then that register must have another earlier definition. In that case, we can't replace the load's destination register with the splat's destination register. Unfortunately, I don't have a small or non-fragile test case. llvm-svn: 268152
* Reverting 268054 & 268063 as they caused PR27579.Amjad Aboud2016-04-306-201/+52
| | | | llvm-svn: 268150
* [LowerGuardIntrinsics] Keep track of !make.implicit metadataSanjoy Das2016-04-301-0/+3
| | | | | | | | | | If a guard call being lowered by LowerGuardIntrinsics has the `!make.implicit` metadata attached, then reattach the metadata to the branch in the resulting expanded form of the intrinsic. This allows us to implement null checks as guards and still get the benefit of implicit null checks. llvm-svn: 268148
* Reroll loops with multiple IV and negative step part 3Lawrence Hu2016-04-301-9/+155
| | | | | | | | | | | | | | support multiple induction variables This patch enable loop reroll for the following case: for(int i=0; i<N; i += 2) { S += *a++; S += *a++; }; Differential Revision: http://reviews.llvm.org/D16550 llvm-svn: 268147
* AMDGPU/SI: Enable the post-ra schedulerTom Stellard2016-04-309-18/+324
| | | | | | | | | | | | | | Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143
* [LowerGuardIntrinsics] Preserve calling conv when loweringSanjoy Das2016-04-301-0/+2
| | | | llvm-svn: 268142
* Reapply r268107 after fixing a bug breaks debug build.Xinliang David Li2016-04-291-70/+80
| | | | | | Makes the new method to set data needed by debug dump. llvm-svn: 268130
* Mark guards on true as "trivially dead"Sanjoy Das2016-04-292-11/+8
| | | | | | | | | This moves some logic added to EarlyCSE in rL268120 into `llvm::isInstructionTriviallyDead`. Adds a test case for DCE to demonstrate that passes other than EarlyCSE can now pick up on the new information. llvm-svn: 268126
* clean up documentation comments; NFCSanjay Patel2016-04-291-110/+14
| | | | llvm-svn: 268122
* [MBP] Use Function::optForSize() instead of checking OptimizeForSize directly.Haicheng Wu2016-04-291-2/+1
| | | | | | Fix a FIXME. Disable loop alignment if compiled with -Oz now. llvm-svn: 268121
* [EarlyCSE] Simplify guard intrinsicsSanjoy Das2016-04-291-0/+23
| | | | | | | | | | | | | | | | | | Summary: This change teaches EarlyCSE some basic properties of guard intrinsics: - Guard intrinsics read all memory, but don't write to any memory - After a guard has executed, the condition it was guarding on can be assumed to be true - Guard intrinsics on a constant `true` are no-ops Reviewers: reames, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19578 llvm-svn: 268120
* AMDGPU: Fix crash with unreachable terminators.Matt Arsenault2016-04-291-12/+27
| | | | | | | | | | If a block has no successors because it ends in unreachable, this was accessing an invalid iterator. Also stop counting instructions that don't emit any real instructions. llvm-svn: 268119
* Revert r268107 -- debug build failureXinliang David Li2016-04-291-78/+70
| | | | llvm-svn: 268116
* [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate ↵Simon Pilgrim2016-04-291-17/+23
| | | | | | | | elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268115
* [Orc] Add ORC lazy-compilation support for AArch64.Lang Hames2016-04-291-0/+144
| | | | | | | The ORC compile callbacks and indirect stubs APIs will now work for AArc64, allowing functions to be lazily compiled and/or updated. llvm-svn: 268112
* [ValueTracking] Make the code in lookThroughCastDavid Majnemer2016-04-291-16/+9
| | | | | | No functionality change is intended. llvm-svn: 268108
* [inliner]: Refactor inline deferring logic into its own method /NFCXinliang David Li2016-04-291-70/+78
| | | | | | | | The implemented heuristic has a large body of code which better sits in its own function for better readability. It also allows adding more heuristics easier in the future. llvm-svn: 268107
OpenPOWER on IntegriCloud