path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Interprocedural Register Allocation (IPRA): add a Transformation Pass | Mehdi Amini | 2016-06-10 | 3 | -0/+135
    Adds a MachineFunctionPass that scans the function body to find calls and updates the register mask of each call with the one saved by the RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.
    Patch by Vivek Pandya <vivekvpandya@gmail.com>
    Differential Revision: http://reviews.llvm.org/D21180
    llvm-svn: 272414
* Add a period. NFC. | Chad Rosier | 2016-06-10 | 1 | -1/+1
    llvm-svn: 272410
* Fix whitespace. NFC. | Chad Rosier | 2016-06-10 | 1 | -1/+1
    llvm-svn: 272409
* [X86] Add costs for SSE zext/sext to v4i64 to TTI | Michael Kuperstein | 2016-06-10 | 1 | -0/+14
    The costs are somewhat hand-wavy, but should be much closer to the truth than what we get from BasicTTI.
    Differential Revision: http://reviews.llvm.org/D21156
    llvm-svn: 272406
* Interprocedural Register Allocation (IPRA) Analysis | Mehdi Amini | 2016-06-10 | 5 | -0/+244
    Add an option to enable the analysis of MachineFunction register usage to extract the list of clobbered registers. When enabled, the CodeGen order is changed to be bottom-up on the call graph.
    The analysis is split into two parts. RegUsageInfoCollector is the MachineFunction pass that runs post-RA and collects the list of clobbered registers to produce a register mask. An immutable pass, RegisterUsageInfo, stores the RegMasks produced by RegUsageInfoCollector and keeps them available.
    A future transformation pass will use this information to update every call site after instruction selection.
    Patch by Vivek Pandya <vivekvpandya@gmail.com>
    Differential Revision: http://reviews.llvm.org/D20769
    llvm-svn: 272403
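As a rough illustration of what the collector produces, the sketch below builds a register mask from a list of clobbered registers. It assumes the convention that a set bit marks a register as preserved across the call. The names `buildRegMask` and `clobbersReg` and the register numbering are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: build a register mask for one function.
// Assumed convention: bit R set means register R is preserved across a call.
std::vector<uint32_t> buildRegMask(unsigned NumRegs,
                                   const std::vector<unsigned> &Clobbered) {
  // Start with every register marked as preserved.
  std::vector<uint32_t> Mask((NumRegs + 31) / 32, ~0u);
  for (unsigned Reg : Clobbered)
    Mask[Reg / 32] &= ~(1u << (Reg % 32)); // clear the bit: clobbered
  return Mask;
}

// Returns true if a call carrying this mask may overwrite Reg.
bool clobbersReg(const std::vector<uint32_t> &Mask, unsigned Reg) {
  return !(Mask[Reg / 32] & (1u << (Reg % 32)));
}
```

With a bottom-up call graph order, a callee's mask would already be available when its callers are compiled, so a caller could keep values live in any register for which `clobbersReg` returns false.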
* [AArch64] Add preferred alignments for Exynos M1 | Evandro Menezes | 2016-06-10 | 3 | -2/+13
    Differential Revision: http://reviews.llvm.org/D21203
    llvm-svn: 272400
* [Hexagon] Remove incorrect offset scaling | Krzysztof Parzyszek | 2016-06-10 | 1 | -4/+2
    llvm-svn: 272399
* Test commit | Roman Shirokiy | 2016-06-10 | 1 | -3/+3
    llvm-svn: 272393
* [AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand. | Sam Kolton | 2016-06-10 | 5 | -257/+348
    Summary: The sext() modifier is supported in SDWA instructions only for integer operands. The spec is unclear on whether integer operands should support the abs and neg modifiers together with sext; for now they are not supported.
    Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support the sext() modifier. Added the AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
    Code cleaning in the AMDGPUOperand class: organize methods in groups (render-, predicate-methods...).
    Reviewers: vpykhtin, artem.tamazov, tstellarAMD
    Subscribers: arsenm, kzhuravl
    Differential Revision: http://reviews.llvm.org/D20968
    llvm-svn: 272384
* test commit: remove trailing whitespaces in README.txt | Roger Ferrer Ibanez | 2016-06-10 | 1 | -8/+8
    llvm-svn: 272380
* Bug fix: remove another illegal char from prof symbol name | Xinliang David Li | 2016-06-10 | 1 | -1/+1
    An end-to-end test without the integrated assembler should be added at some point (not done now because some bots are not properly configured to support -no-integrated-as).
    llvm-svn: 272376
* [LibFuzzer] Fix some unit test crashes on OSX. | Dan Liew | 2016-06-10 | 1 | -0/+4
    This fixes the following unit tests:
        FuzzerDictionary.ParseOneDictionaryEntry
        FuzzerDictionary.ParseDictionaryFile
    The issue appears to be mixing non-ASan-ified code (LibFuzzer) and ASan-ified code (the unittest), as the tests would pass fine if everything was built with ASan enabled. I believe the issue is that different implementations of std::vector<> are being used in LibFuzzer and outside LibFuzzer (in the unittests). For Libcxx (I've not seen the issue manifest for libstdc++) we can disable the ASanified std::vector<> by defining the ``_LIBCPP_HAS_NO_ASAN`` macro. Doing this fixes the tests on OSX.
    Differential Revision: http://reviews.llvm.org/D21049
    llvm-svn: 272374
* [AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ. | Craig Topper | 2016-06-10 | 1 | -1/+9
    llvm-svn: 272371
* Make PDBFile take a StreamInterface instead of a MemBuffer. | Zachary Turner | 2016-06-10 | 2 | -118/+79
    This is the next step towards being able to write PDBs. MemoryBuffer is immutable, and StreamInterface is our replacement which can be any combination of read-only, read-write, or write-only depending on the particular implementation.
    The one place where we were creating a PDBFile (in RawSession) is updated to subclass ByteStream with a simple adapter that holds a MemoryBuffer, and initializes the superclass with the buffer's array, so that all the functionality of ByteStream works transparently.
    llvm-svn: 272370
* Add support for writing through StreamInterface. | Zachary Turner | 2016-06-10 | 9 | -17/+302
    This adds methods and tests for writing to a PDB stream. With this, even a PDB stream which is discontiguous can be treated as a sequential stream of bytes for the purposes of writing.
    Reviewed By: ruiu
    Differential Revision: http://reviews.llvm.org/D21157
    llvm-svn: 272369
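To make the idea of writing sequentially into a discontiguous stream concrete, here is a minimal sketch. `BlockMappedStream` and its fields are hypothetical names for this example and do not reflect the classes added by the patch. It assumes the backing storage is large enough to hold every listed block.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stream whose bytes live in scattered blocks.
struct BlockMappedStream {
  std::vector<uint8_t> &Storage;   // backing file contents
  std::size_t BlockSize;           // size of each block in bytes
  std::vector<std::size_t> Blocks; // block indices, in stream order

  // Write Data at stream offset Offset; returns false if it would not fit.
  bool write(std::size_t Offset, const std::vector<uint8_t> &Data) {
    if (Offset + Data.size() > Blocks.size() * BlockSize)
      return false;
    std::size_t Written = 0;
    while (Written < Data.size()) {
      std::size_t Pos = Offset + Written;
      std::size_t InBlock = Pos % BlockSize;
      // Copy at most up to the end of the current block.
      std::size_t Chunk = std::min(BlockSize - InBlock, Data.size() - Written);
      std::size_t Dest = Blocks[Pos / BlockSize] * BlockSize + InBlock;
      std::copy(Data.begin() + Written, Data.begin() + Written + Chunk,
                Storage.begin() + Dest);
      Written += Chunk;
    }
    return true;
  }
};
```

The caller only sees stream offsets. The split of a write across block boundaries happens inside `write`, which is what lets a scattered stream behave like one sequential run of bytes.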
* [AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. | Craig Topper | 2016-06-10 | 1 | -30/+46
    Previously we were printing the mask operands as the register names.
    llvm-svn: 272367
* AMDGPU: Fix trailing whitespace | Matt Arsenault | 2016-06-10 | 10 | -40/+40
    llvm-svn: 272364
* [esan|cfrag] Add the struct field offset array in StructInfo | Qin Zhao | 2016-06-10 | 1 | -11/+29
    Summary: Adds the struct field offset array in struct StructInfo. Updates test struct_field_count_basic.ll.
    Reviewers: aizatsky
    Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka
    Differential Revision: http://reviews.llvm.org/D21192
    llvm-svn: 272362
* Add null checks before using a pointer. | Richard Trieu | 2016-06-10 | 1 | -0/+4
    llvm-svn: 272359
* [esan|cfrag] Disable load/store instrumentation for cfrag | Qin Zhao | 2016-06-10 | 1 | -3/+7
    Summary: Adds ClInstrumentFastpath option to control fastpath instrumentation. Avoids the load/store instrumentation for the cache fragmentation tool. Renames cache_frag_basic.ll to working_set_slow.ll for slowpath instrumentation test. Adds the __esan_init check in struct_field_count_basic.ll.
    Reviewers: aizatsky
    Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka
    Differential Revision: http://reviews.llvm.org/D21079
    llvm-svn: 272355
* AMDGPU: v_cndmask_b32 does not def vcc | Matt Arsenault | 2016-06-10 | 1 | -2/+2
    Fixes verifier errors after SIShrinkInstructions.
    llvm-svn: 272351
* AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute | Tom Stellard | 2016-06-10 | 1 | -2/+2
    Summary: This fixes a bug with ds_*permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate.
    Reviewers: scchan, arsenm
    Subscribers: arsenm, llvm-commits
    Differential Revision: http://reviews.llvm.org/D19994
    llvm-svn: 272349
* AMDGPU/SI: Use common topological sort algorithm in SIScheduleDAGMI | Tom Stellard | 2016-06-09 | 1 | -64/+3
    Reviewers: arsenm, axeldavy
    Subscribers: MatzeB, arsenm, llvm-commits
    Differential Revision: http://reviews.llvm.org/D19823
    llvm-svn: 272346
* AMDGPU: Fix flat atomics | Matt Arsenault | 2016-06-09 | 4 | -41/+90
    The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses.
    llvm-svn: 272345
* AMDGPU: Fix i64 global cmpxchg | Matt Arsenault | 2016-06-09 | 4 | -40/+82
    This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy.
    There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it.
    llvm-svn: 272344
* Make sure that uninteresting allocas are not instrumented. | Vitaly Buka | 2016-06-09 | 1 | -4/+13
    Summary: We failed to unpoison uninteresting allocas on return, as unpoisoning is part of the main instrumentation, which skips such allocas.
    Added a check of -asan-instrument-allocas for dynamic allocas. If instrumentation of dynamic allocas is disabled, they will not be unpoisoned.
    PR27453
    Reviewers: kcc, eugenis
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D21207
    llvm-svn: 272341
* CodeGen: Allow verifier to run after MachineBlockPlacement | Matt Arsenault | 2016-06-09 | 1 | -1/+1
    No tests break with this enabled.
    llvm-svn: 272340
* Add aliases for mfvrsave/mtvrsave. | Eric Christopher | 2016-06-09 | 1 | -0/+4
    Update a test as we're now going to emit it for easier reading of generated assembly as well.
    llvm-svn: 272339
* AMDGPU: Run verifier after insert waits pass | Matt Arsenault | 2016-06-09 | 1 | -1/+1
    llvm-svn: 272338
* AMDGPU: Remove incorrect assertion | Matt Arsenault | 2016-06-09 | 1 | -4/+0
    I'm still not sure under what circumstances the offset here is non-0, but private memory is not limited to 27-bits.
    llvm-svn: 272337
* AMDGPU: Properly initialize SIShrinkInstructions | Matt Arsenault | 2016-06-09 | 3 | -8/+6
    llvm-svn: 272336
* [CFLAA] Handle global/arg attrs more sanely. | George Burgess IV | 2016-06-09 | 2 | -20/+40
    Prior to this patch, we used argument/global stratified attributes in order to note that a value could have come from either dereferencing a global/arg, or from the assignment from a global/arg.
    Now, AttrUnknown is placed on sets when we see a dereference, instead of the global/arg attributes. This allows us to be more aggressive in the future when we see global/arg attributes without AttrUnknown.
    Patch by Jia Chen.
    Differential Revision: http://reviews.llvm.org/D21110
    llvm-svn: 272335
* Unpoison stack memory in use-after-return + use-after-scope mode | Vitaly Buka | 2016-06-09 | 1 | -12/+21
    Summary: We still want to unpoison full stack even in use-after-return as it can be disabled at runtime.
    PR27453
    Reviewers: eugenis, kcc
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D21202
    llvm-svn: 272334
* Reapply 272328 and 272329 as a single patch. | Alina Sbirlea | 2016-06-09 | 1 | -10/+3
    [cpu-detection] [amdfam10] Return barcelona, and amdfam10 for all other subtypes. Address Bug 28067.
    Along with the refactoring of Host.cpp, getHostCPUName() was modified to return more precise types for CPUs in amdfam10. However, callers of getHostCPUName() do string matching on type, so this cannot be modified. Currently there is support in the x86 backend for barcelona. For all other subtypes the assumed return value is amdfam10.
    Fix: getHostCPUName() returns barcelona subtype and amdfam10 for all others. This can be extended further when support for the other subtypes is added.
    Differential revision: http://reviews.llvm.org/D21193
    llvm-svn: 272333
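The mapping described in this fix can be sketched as follows. The enumeration and the function name `fam10CPUName` are hypothetical and chosen for illustration only; the actual code in Host.cpp derives the subtype from CPUID data and is not reproduced here.

```cpp
#include <string>

// Hypothetical subtype enumeration for this sketch.
enum class Fam10Subtype { Barcelona, Shanghai, Istanbul, Unknown };

// Only "barcelona" is reported by name; every other subtype collapses to
// the generic "amdfam10" string that callers already match on.
std::string fam10CPUName(Fam10Subtype Subtype) {
  return Subtype == Fam10Subtype::Barcelona ? "barcelona" : "amdfam10";
}
```

Because callers compare the returned string, returning a new, more precise name would silently miss every existing match, which is why unknown subtypes fall back to the generic name.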
* Revert 272328 and 272329 to recommit as a single patch. | Alina Sbirlea | 2016-06-09 | 1 | -3/+10
    llvm-svn: 272332
* Keep barcelona subtype for amdfam10 | Alina Sbirlea | 2016-06-09 | 1 | -1/+3
    llvm-svn: 272329
* [cpu-detection] Return amdfam10 for all subtypes. Address Bug 28067. | Alina Sbirlea | 2016-06-09 | 1 | -9/+0
    Summary: Remove architecture subtype from the string returned by getHostCPUName(). String matching done on type.
    Reviewers: llvm-commits, echristo
    Subscribers: mehdi_amini
    Differential Revision: http://reviews.llvm.org/D21193
    llvm-svn: 272328
* Use ProfileSummaryInfo in inline cost analysis. | Easwaran Raman | 2016-06-09 | 4 | -40/+36
    Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.
    Differential revision: http://reviews.llvm.org/D21045
    llvm-svn: 272321
* [X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments | Simon Pilgrim | 2016-06-09 | 1 | -0/+12
    llvm-svn: 272319
* [LiveRangeEdit] Fix a crash in eliminateDeadDef. | Quentin Colombet | 2016-06-09 | 1 | -1/+6
    When we delete a live-range, we check if that live-range is the origin of others to keep it around for rematerialization. For that we check that the instruction we are about to remove is the same as the definition of the VNI of the original live-range. If this is the case, we just shrink the live-range to an empty one.
    Now, when we try to delete one of the children of such a live-range (a product of splitting), we do the same check. However, now the original live-range is empty and there is no way we can access the VNI to check its definition, and we crash.
    When we cannot get the VNI for the original live-range, that means we are not in the presence of the original definition. Thus, this check does not need to happen in that case and the crash is solved!
    This bug was introduced in r266162 | wmi | 2016-04-12 20:08:27. It affects every target that uses the greedy register allocator.
    For the crash to happen, we need to delete both the original instruction and its split products, in that order. This is likely to happen when rematerialization comes into play.
    Trying to produce a more robust test case. Will follow in a coming commit.
    This fixes llvm.org/PR27983.
    rdar://problem/26651519
    llvm-svn: 272314
* [X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics | Simon Pilgrim | 2016-06-09 | 2 | -12/+14
    Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ
    llvm-svn: 272308
* [X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR | Simon Pilgrim | 2016-06-09 | 1 | -1/+1
    llvm-svn: 272307
* BitcodeReader: Use std::piecewise_construct when upgrading type refs | Duncan P. N. Exon Smith | 2016-06-09 | 1 | -3/+3
    r267296 used std::piecewise_construct without using std::forward_as_tuple, and r267298 hacked it out (using an emplace_back followed by a couple of reset() calls) because of a problem on a bot.
    I'm finally circling back to call forward_as_tuple as I should have to begin with (thanks to David Blaikie for pointing out the missing piece).
    Note that this code uses emplace_back() instead of push_back(make_pair()) because the move constructor for TrackingMDRef is expensive (cheaper than a copy, but still expensive).
    llvm-svn: 272306
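The difference between the two approaches can be seen in a small standalone sketch. `Tracked` is a hypothetical type that counts move constructions and stands in for a type with a costly move constructor; it is not TrackingMDRef, and the helper names are assumptions made for this example.

```cpp
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical type that counts move constructions.
struct Tracked {
  static int Moves;
  int Value;
  explicit Tracked(int V) : Value(V) {}
  Tracked(const Tracked &) = default;
  Tracked(Tracked &&Other) noexcept : Value(Other.Value) { ++Moves; }
};
int Tracked::Moves = 0;

using PairVec = std::vector<std::pair<Tracked, Tracked>>;

// Builds both pair members in place inside the vector's storage.
void addInPlace(PairVec &Vec, int A, int B) {
  Vec.emplace_back(std::piecewise_construct, std::forward_as_tuple(A),
                   std::forward_as_tuple(B));
}

// Builds temporaries first, then moves them into the vector.
void addViaMakePair(PairVec &Vec, int A, int B) {
  Vec.push_back(std::make_pair(Tracked(A), Tracked(B)));
}
```

Calling `addInPlace` on an empty vector leaves `Tracked::Moves` at zero, since each member is constructed directly from its argument tuple, while `addViaMakePair` has to move the temporaries at least once. Later growth of the vector can still move existing elements in either case.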
* [X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte shifts | Simon Pilgrim | 2016-06-09 | 1 | -19/+41
    512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget
    llvm-svn: 272300
* [NVPTX] Add intrinsics for shfl instructions. | Justin Lebar | 2016-06-09 | 1 | -1/+42
    Summary: Currently clang emits these instructions via inline (volatile) asm in the CUDA headers. Switching to intrinsics will let the optimizer reason across calls to these intrinsics.
    Reviewers: tra
    Subscribers: llvm-commits, jholewinski
    Differential Revision: http://reviews.llvm.org/D21160
    llvm-svn: 272298
* [PM] Port LCSSA to the new PM. | Easwaran Raman | 2016-06-09 | 8 | -23/+47
    Differential Revision: http://reviews.llvm.org/D21090
    llvm-svn: 272294
* AMDGPU/SI: Fix 32-bit fdiv lowering | Wei Ding | 2016-06-09 | 1 | -16/+53
    We were using the fast fdiv lowering for all division; an implementation of IEEE754 fdiv is added.
    http://reviews.llvm.org/D20557
    llvm-svn: 272292
* [LV] Use vector phis for some secondary induction variables | Michael Kuperstein | 2016-06-09 | 1 | -4/+6
    Previously, we materialized secondary vector IVs from the primary scalar IV inside the loop body, by offsetting the primary to match the correct start value and then broadcasting it. Instead, we can use a real vector IV, like we do for the primary.
    This enables using vector IVs for secondary integer IVs whose type matches the type of the primary.
    Differential Revision: http://reviews.llvm.org/D20932
    llvm-svn: 272283
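A scalar simulation may help show why the two forms are equivalent. This is only a sketch of the idea, assuming a vectorization factor of 4 and hypothetical function names; it does not model the IR that the loop vectorizer actually emits.

```cpp
#include <array>
#include <vector>

constexpr int VF = 4; // assumed vectorization factor
using Vec = std::array<int, VF>;

// Earlier form: rebuild the secondary IV from the primary scalar IV
// inside the loop body on every vector iteration.
std::vector<Vec> fromPrimary(int Start, int Step, int VecIters) {
  std::vector<Vec> Out;
  for (int I = 0; I < VecIters * VF; I += VF) { // I is the primary scalar IV
    Vec V;
    for (int Lane = 0; Lane < VF; ++Lane)
      V[Lane] = Start + (I + Lane) * Step;
    Out.push_back(V);
  }
  return Out;
}

// Vector-phi form: set up the lanes once, then add a splat of VF * Step
// on each vector iteration.
std::vector<Vec> vectorPhi(int Start, int Step, int VecIters) {
  std::vector<Vec> Out;
  Vec V;
  for (int Lane = 0; Lane < VF; ++Lane)
    V[Lane] = Start + Lane * Step; // initial value of the phi
  for (int It = 0; It < VecIters; ++It) {
    Out.push_back(V);
    for (int Lane = 0; Lane < VF; ++Lane)
      V[Lane] += VF * Step; // one vector add per iteration
  }
  return Out;
}
```

Both functions yield the same lane values on every iteration; the second keeps the per-iteration work to a single vector add.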
* SelectionDAG: Implement expansion of {S,U}MIN/MAX in integer legalization | Jan Vesely | 2016-06-09 | 2 | -0/+55
    Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG.
    Reviewers: arsenm
    Differential Revision: http://reviews.llvm.org/D17898
    llvm-svn: 272272
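The general idea of expanding a wide min into operations on narrower halves can be sketched in plain C++. This is an illustration under assumptions, not the node sequence the patch emits: it splits a 64-bit value into two 32-bit halves, and all names are hypothetical.

```cpp
#include <cstdint>

// A 64-bit integer split into two 32-bit halves.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

Halves split(uint64_t V) {
  return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
}

uint64_t join(Halves H) {
  return (static_cast<uint64_t>(H.Hi) << 32) | H.Lo;
}

// Unsigned min: the high halves decide unless they are equal.
Halves expandUMin(Halves A, Halves B) {
  bool ALess = A.Hi < B.Hi || (A.Hi == B.Hi && A.Lo < B.Lo);
  return ALess ? A : B;
}

// Signed min: only the high half carries the sign, so it is compared as
// signed; the low half is still compared as unsigned.
Halves expandSMin(Halves A, Halves B) {
  int32_t AHi = static_cast<int32_t>(A.Hi);
  int32_t BHi = static_cast<int32_t>(B.Hi);
  bool ALess = AHi < BHi || (AHi == BHi && A.Lo < B.Lo);
  return ALess ? A : B;
}
```

The max variants follow by flipping the comparisons. The point of the sketch is that the low halves are always compared as unsigned, even for the signed operations.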
* Reapply "[MBP] Reduce code size by running tail merging in MBP." | Haicheng Wu | 2016-06-09 | 4 | -34/+121
    This reapplies commits r271930, r271915, and r271923. They hit a bug in Thumb, which is now fixed in r272258.
    The original message:
    The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge.
    This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything.
    llvm-svn: 272267