summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* getelementptr instruction, support index vector of EVT.Igor Breger2016-05-011-1/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195
* Change AVX512 braodcastsd/ss patterns interaction with spilling . New ↵Igor Breger2016-05-013-110/+98
| | | | | | | | implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190
* [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when ↵Craig Topper2016-05-011-3/+3
| | | | | | VLX and BWI are supported. llvm-svn: 268189
* [AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB ↵Craig Topper2016-05-011-13/+14
| | | | | | and VPMADDUBSW/VPMADDWD. llvm-svn: 268188
* [AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also ↵Craig Topper2016-05-011-16/+16
| | | | | | marked as requiring VLX. llvm-svn: 268186
* [X86] Add an AddedComplexity to another pattern to put it near similar in ↵Craig Topper2016-05-011-2/+1
| | | | | | the output file. llvm-svn: 268184
* [X86] Remove a seemlingly unused pattern. The same pattern appears elsewhere ↵Craig Topper2016-05-011-2/+0
| | | | | | with an AddedComplexity that made this unreachable. llvm-svn: 268183
* [X86] Add AddedComplexity to keep some similar patterns near each other in ↵Craig Topper2016-05-011-0/+1
| | | | | | the output file. llvm-svn: 268181
* [X86] Remove some redundant selection patterns.Craig Topper2016-05-012-11/+0
| | | | llvm-svn: 268180
* [AVX512] Replace vector_extract with extractelt in some patterns. They mean ↵Craig Topper2016-05-011-5/+5
| | | | | | the same thing but vector_extract is deprecated. NFC llvm-svn: 268179
* [SCEV] When printing via -analysis, dump loop dispositionSanjoy Das2016-05-011-0/+25
| | | | | | | | | | | There are currently some bugs in tree around SCEV caching an incorrect loop disposition. Printing out loop dispositions will let us write whitebox tests as those are fixed. The dispositions are printed as a list in "inside out" order, i.e. innermost loop first. llvm-svn: 268177
* Properly name LLVMSetIsInBounds's argument. NFCAmaury Sechet2016-05-011-2/+2
| | | | llvm-svn: 268176
* [AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.Craig Topper2016-05-011-4/+7
| | | | llvm-svn: 268174
* [ORC] Save AArch64 NEON state in the JIT reentry block.Lang Hames2016-05-011-42/+74
| | | | | | | The earlier version of the resolver code did not save NEON state, so it would have broken any callees that used floating point. llvm-svn: 268173
* CodeGen: convert to range based loopsSaleem Abdulrasool2016-04-301-36/+20
| | | | | | | Convert to using some range based loops, avoid unnecessary variables for unchecked casts. NFC. llvm-svn: 268165
* [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.Craig Topper2016-04-302-13/+9
| | | | llvm-svn: 268164
* Add missing override.Rafael Espindola2016-04-301-1/+2
| | | | llvm-svn: 268163
* [ASan] Add shadow offset for SystemZ.Marcin Koscielnicki2016-04-301-2/+8
| | | | | | | | | | | | | | | | | | | | | | SystemZ on Linux currently has 53-bit address space. In theory, the hardware could support a full 64-bit address space, but that's not supported due to kernel limitations (it'd require 5-level page tables), and there are no plans for that. The default process layout stays within first 4TB of address space (to avoid creating 4-level page tables), so any offset >= (1 << 42) is fine. Let's use 1 << 52 here, ie. exactly half the address space. I've originally used 7 << 50 (uses top 1/8th of the address space), but ASan runtime assumes there's some space after the shadow area. While this is fixable, it's simpler to avoid the issue entirely. Also, I've originally wanted to have the shadow aligned to 1/8th the address space, so that we can use OR like X86 to assemble the offset. I no longer think it's a good idea, since using ADD enables us to load the constant just once and use it with register + register indexed addressing. Differential Revision: http://reviews.llvm.org/D19650 llvm-svn: 268161
* [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate ↵Simon Pilgrim2016-04-301-18/+20
| | | | | | | | elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268158
* AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaitsTom Stellard2016-04-301-6/+0
| | | | | | This was supposed to be part of r268143. llvm-svn: 268154
* [PowerPC/QPX] Fix the load/splat peephole with overlapping readsHal Finkel2016-04-301-1/+9
| | | | | | | | | | | If, in between the splat and the load (which does an implicit splat), there is a read of the splat register, then that register must have another earlier definition. In that case, we can't replace the load's destination register with the splat's destination register. Unfortunately, I don't have a small or non-fragile test case. llvm-svn: 268152
* Reverting 268054 & 268063 as they caused PR27579.Amjad Aboud2016-04-306-201/+52
| | | | llvm-svn: 268150
* [LowerGuardIntrinsics] Keep track of !make.implicit metadataSanjoy Das2016-04-301-0/+3
| | | | | | | | | | If a guard call being lowered by LowerGuardIntrinsics has the `!make.implicit` metadata attached, then reattach the metadata to the branch in the resulting expanded form of the intrinsic. This allows us to implement null checks as guards and still get the benefit of implicit null checks. llvm-svn: 268148
* Reroll loops with multiple IV and negative step part 3Lawrence Hu2016-04-301-9/+155
| | | | | | | | | | | | | | support multiple induction variables This patch enable loop reroll for the following case: for(int i=0; i<N; i += 2) { S += *a++; S += *a++; }; Differential Revision: http://reviews.llvm.org/D16550 llvm-svn: 268147
* AMDGPU/SI: Enable the post-ra schedulerTom Stellard2016-04-309-18/+324
| | | | | | | | | | | | | | Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143
* [LowerGuardIntrinsics] Preserve calling conv when loweringSanjoy Das2016-04-301-0/+2
| | | | llvm-svn: 268142
* Reapply r268107 after fixing a bug breaks debug build.Xinliang David Li2016-04-291-70/+80
| | | | | | Makes the new method to set data needed by debug dump. llvm-svn: 268130
* Mark guards on true as "trivially dead"Sanjoy Das2016-04-292-11/+8
| | | | | | | | | This moves some logic added to EarlyCSE in rL268120 into `llvm::isInstructionTriviallyDead`. Adds a test case for DCE to demonstrate that passes other than EarlyCSE can now pick up on the new information. llvm-svn: 268126
* clean up documentation comments; NFCSanjay Patel2016-04-291-110/+14
| | | | llvm-svn: 268122
* [MBP] Use Function::optForSize() instead of checking OptimizeForSize directly.Haicheng Wu2016-04-291-2/+1
| | | | | | Fix a FIXME. Disable loop alignment if compiled with -Oz now. llvm-svn: 268121
* [EarlyCSE] Simplify guard intrinsicsSanjoy Das2016-04-291-0/+23
| | | | | | | | | | | | | | | | | | Summary: This change teaches EarlyCSE some basic properties of guard intrinsics: - Guard intrinsics read all memory, but don't write to any memory - After a guard has executed, the condition it was guarding on can be assumed to be true - Guard intrinsics on a constant `true` are no-ops Reviewers: reames, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19578 llvm-svn: 268120
* AMDGPU: Fix crash with unreachable terminators.Matt Arsenault2016-04-291-12/+27
| | | | | | | | | | If a block has no successors because it ends in unreachable, this was accessing an invalid iterator. Also stop counting instructions that don't emit any real instructions. llvm-svn: 268119
* Revert r268107 -- debug build failureXinliang David Li2016-04-291-78/+70
| | | | llvm-svn: 268116
* [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate ↵Simon Pilgrim2016-04-291-17/+23
| | | | | | | | elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268115
* [Orc] Add ORC lazy-compilation support for AArch64.Lang Hames2016-04-291-0/+144
| | | | | | | The ORC compile callbacks and indirect stubs APIs will now work for AArc64, allowing functions to be lazily compiled and/or updated. llvm-svn: 268112
* [ValueTracking] Make the code in lookThroughCastDavid Majnemer2016-04-291-16/+9
| | | | | | No functionality change is intended. llvm-svn: 268108
* [inliner]: Refactor inline deferring logic into its own method /NFCXinliang David Li2016-04-291-70/+78
| | | | | | | | The implemented heuristic has a large body of code which better sits in its own function for better readability. It also allows adding more heuristics easier in the future. llvm-svn: 268107
* Differential Revision: http://reviews.llvm.org/D19733Sriraman Tallam2016-04-293-4/+3
| | | | llvm-svn: 268106
* AMDGPU: Add kernarg.segment.ptr intrinsicMatt Arsenault2016-04-291-0/+5
| | | | llvm-svn: 268105
* [InstCombine] Determine the result of a select based on a dominating condition.Chad Rosier2016-04-292-1/+22
| | | | | | Differential Revision: http://reviews.llvm.org/D19550 llvm-svn: 268104
* [InstCombine] clean up; NFCSanjay Patel2016-04-291-1/+1
| | | | llvm-svn: 268099
* AMDGPU/SI: Move post regalloc run of SIShrinkInstructionsMatt Arsenault2016-04-291-5/+1
| | | | | | | | Move to addPreEmitPass. This is so it runs after post-RA scheduling so we can merge s_nops emitted by the scheduler and hazard recognizer. llvm-svn: 268095
* DAGCombiner: Reduce truncated shl widthMatt Arsenault2016-04-291-0/+19
| | | | llvm-svn: 268094
* Move coverage related code into a separate library.Easwaran Raman2016-04-297-8/+44
| | | | | | Differential Revision: http://reviews.llvm.org/D19333 llvm-svn: 268089
* [libFuzzer] enable detect_leaks=1, add proper docsKostya Serebryany2016-04-293-3/+3
| | | | llvm-svn: 268088
* [MemorySSA] Fix bugs in walker; refactor unittests a bit.George Burgess IV2016-04-291-8/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes two somewhat related bugs in MemorySSA's caching walker. These bugs were found because D19695 brought up the problem that we'd have defs cached to themselves, which is incorrect. The bugs this fixes are: - We would sometimes skip the nearest clobber of a MemoryAccess, because we would query our cache for a given potential clobber before checking if the potential clobber is the clobber we're looking for. The cache entry for the potential clobber would point to the nearest clobber *of the potential clobber*, so if that was a cache hit, we'd ignore the potential clobber entirely. - There are times (sometimes in DFS, sometimes in the getClobbering... functions) where we would insert cache entries that say a def clobbers itself. There's a bit of common code between the fixes for the bugs, so they aren't split out into multiple commits. This patch also adds a few unit tests, and refactors existing tests a bit to reduce the duplication of setup code. llvm-svn: 268087
* [ValueTracking] matchSelectPattern needs to be more careful around FPDavid Majnemer2016-04-291-19/+31
| | | | | | | | | | | | matchSelectPattern attempts to see through casts which mask min/max patterns from being more obvious. Under certain circumstances, it would misidentify a sequence of instructions as a min/max because it assumed that folding casts would preserve the result. This is not the case for floating point <-> integer casts. This fixes PR27575. llvm-svn: 268086
* Fix crash in PDB when loading corrupt file.Zachary Turner2016-04-291-0/+7
| | | | | | | | | | There are probably hundreds of crashers we can find by fuzzing more. For now we do the simplest possible validation of the block size. Later, more complicated validations can verify that other fields of the super block such as directory size, number of blocks, agree with the size of the file etc. llvm-svn: 268084
* Use SelectionDAG::getTargetConstant* helper functions. NFC.Simon Pilgrim2016-04-291-4/+4
| | | | | | Instead of SelectionDAG::getConstant directly to make it more obvious that we're creating target constants. llvm-svn: 268074
* Put PDB parsing code into a pdb namespace.Zachary Turner2016-04-2911-62/+70
| | | | llvm-svn: 268072
OpenPOWER on IntegriCloud