summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [SCEV] Consider delinearization pattern with extension with identity factorTobias Grosser2016-10-171-1/+2
| | | | | | | | | | | | Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor. Reviewers: sanjoy, grosser Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser Differential Revision: https://reviews.llvm.org/D16492 llvm-svn: 284378
* [CodeGenPrepare] When moving a zext near to its associated load, do not ↵Andrea Di Biagio2016-10-171-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | retain the original debug location. CodeGenPrepare knows how to move a zext of a load into the same basic block where the load lives. The goal is to help ISel match a zero-extending load instead of two separated instructions. CGP attempts to move a zext computation even if it lives in a basic block that does not post-dominate the load's basic block. That means, the hoisted zext may be speculated. Preserving the zext location would hurt the debugging experience and the quality of sample pgo. With this patch, when moving a zext near to its associated load, CGP no longer propagates the zext's debug location. Instead, CGP conservatively reuses the same debug location for the load and the zext. An alternative approach would be to assign an artificial line-0 location to the zext. However we don't want to over-use the 'line-0' for this particular case because it would have a size cost in the line-table section for no additional benefit. Differential Revision: https://reviews.llvm.org/D25611 llvm-svn: 284377
* [X86] Fix shuffle decoding assertions to print the right number of required ↵Craig Topper2016-10-171-8/+8
| | | | | | operands. Update the checks themselves to be >= to the same number instead of > one less than the required number. llvm-svn: 284365
* [AVX-512] Add shuffle combining support for vpermi2var shuffles derived from ↵Craig Topper2016-10-171-0/+14
| | | | | | existing support for vpermt2var. llvm-svn: 284357
* [AVX-512] Add support for turning a 256-bit load that goes to both halfs of ↵Craig Topper2016-10-162-13/+69
| | | | | | | | an insert_subvector into a subvector broadcast. Differential Revision: https://reviews.llvm.org/D25650 llvm-svn: 284353
* Support: Return void from Scanner::scan_ns_uri_char, no one uses the resultJustin Bogner2016-10-161-7/+2
| | | | | | | Simplify this a little bit since the result is never used. It can be added back easily enough if that changes. llvm-svn: 284348
* PR30711: Fix incorrect profiling of 'long long' in FoldingSet, then use it toRichard Smith2016-10-161-4/+4
| | | | | | fix TBAA violation in profiling of pointers. llvm-svn: 284336
* [AVX-512] Fix the operand order for vpermi2var_qi intrinsics to match the ↵Craig Topper2016-10-161-3/+3
| | | | | | other vpermi2var intrinsics. llvm-svn: 284329
* [AVX-512] Correct execution domain for VPERMT2PS and VPERMI2PS.Craig Topper2016-10-161-4/+4
| | | | llvm-svn: 284328
* [AVX-512] Move (v4i64 (X86SubVBroadcast (v2i64))) alternate patterns under a ↵Craig Topper2016-10-161-9/+9
| | | | | | HasVLX predicate. Similar for floating point. llvm-svn: 284327
* [ArmFastISel] Kill dead code. NFCI.Davide Italiano2016-10-161-35/+0
| | | | llvm-svn: 284320
* [MachineMemOperand] Move synchronization scope and atomic orderings from ↵Konstantin Zhuravlyov2016-10-158-81/+60
| | | | | | | | SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG. Differential Revision: https://reviews.llvm.org/D24577 llvm-svn: 284312
* [GVN/PRE] Hoist global values outside of loops.Davide Italiano2016-10-151-26/+58
| | | | | | | | | | | In theory this could be generalized to move anything where we prove the operands are available, but that would require rewriting PRE. As NewGVN will hopefully come soon, and we're trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably not worth spending too much time on it. Fix provided by Daniel Berlin! llvm-svn: 284311
* Test commit. (NFC)Li Huang2016-10-151-1/+1
| | | | llvm-svn: 284307
* [AVX-512] Add shuffle comments for vbroadcast instructions.Craig Topper2016-10-151-0/+43
| | | | llvm-svn: 284305
* [AVX-512] Rename VPBROADCASTI32X2 and VPBROADCASTF32X2 instruction classes ↵Craig Topper2016-10-151-4/+4
| | | | | | to match the mnemonic which does not include a 'P'. llvm-svn: 284304
* [SimplifyCFG] Use the error checking provided by getPrevNode.Benjamin Kramer2016-10-151-7/+11
| | | | | | | | | BasicBlock::size is O(insts), making this loop O(blocks*insts), which can be really slow on generated code. getPrevNode already checks if we're at the beginning of the block and returns nullptr if so, just use that instead. No functionality change intended. llvm-svn: 284303
* [libFuzzer] swap bytes in integers when handling CMP tracesKostya Serebryany2016-10-155-15/+49
| | | | llvm-svn: 284301
* [libFuzzer] better algorithm for -minimize_crashKostya Serebryany2016-10-153-5/+25
| | | | llvm-svn: 284299
* AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizerTom Stellard2016-10-152-0/+49
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25526 llvm-svn: 284298
* [NFC] Loop Versioning for LICM code clean upEvgeny Astigeevich2016-10-141-31/+42
| | | | | | | | | | | | - Removed unused class members. - Made class internal data private. - Made class scoped data function scoped where it's possible. - Replace naked new/delete with unique_ptr. - Made resources guaranteed to be freed. Differential Revision: https://reviews.llvm.org/D25464 llvm-svn: 284290
* GlobalISel: rename legalizer components to match others.Tim Northover2016-10-1413-97/+95
| | | | | | | | | | The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287
* hardware_physical_concurrency() should return 1 when LLVM is built with ↵Mehdi Amini2016-10-141-0/+3
| | | | | | LLVM_ENABLE_THREADS=OFF llvm-svn: 284283
* [PPC] Shorter sequence to load 64bit constant with same hi/lo wordsGuozhi Wei2016-10-141-2/+23
| | | | | | | | | | | | This is a patch to implement pr30640. When a 64bit constant has the same hi/lo words, we can use rldimi to copy the low word into high word of the same register. This optimization caused failure of test case bperm.ll because of not optimal heuristic in function SelectAndParts64. It chooses AND or ROTATE to extract bit groups from a register, and OR them together. This optimization lowers the cost of loading 64bit constant mask used in AND method, and causes different code sequence. But actually ROTATE method is better in this test case. The reason is in ROTATE method the final OR operation can be avoided since rldimi can insert the rotated bits into target register directly. So this patch also enhances SelectAndParts64 to prefer ROTATE method when the two methods have same cost and there are multiple bit groups need to be ORed together. Differential Revision: https://reviews.llvm.org/D25521 llvm-svn: 284276
* [libFuzzer] remove subdir fuzzer-test-suite as it is now superseded with ↵Kostya Serebryany2016-10-1419-410/+0
| | | | | | https://github.com/google/fuzzer-test-suite llvm-svn: 284275
* [libFuzzer] add -trace_cmp=1 (guiding mutations based on the observed CMP ↵Kostya Serebryany2016-10-1412-12/+157
| | | | | | instructions). This is a reincarnation of the previously deleted -use_traces, but using a different approach for collecting traces. Still a toy, but at least it scales well. Also fix -merge in trace-pc-guard mode llvm-svn: 284273
* [DAG] avoid creating illegal node when transforming negated shifted sign bitSanjay Patel2016-10-141-2/+3
| | | | | | | | Eli noted this potential bug in the post-commit thread for: https://reviews.llvm.org/rL284239 ...but I'm not sure how to trigger it, so there's no test case yet. llvm-svn: 284268
* AMDGPU/SI: Use new SimplifyDemandedBits helper for multi-use operationsTom Stellard2016-10-141-13/+10
| | | | | | | | | | | | | Summary: We are using this helper for our 24-bit arithmetic combines, so we are now able to eliminate multi-use operations that mask the high-bits of 24-bit inputs (e.g. and x, 0xffffff) Reviewers: arsenm, nhaehnle Subscribers: tony-tye, arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24672 llvm-svn: 284267
* TargetLowering: Add SimplifyDemandedBits() helper to TargetLoweringOptTom Stellard2016-10-141-2/+55
| | | | | | | | | | | | | | | | | | | | | | | | Summary: The main purpose of this new helper is to enable simplifying operations that have multiple uses. SimplifyDemandedBits does not handle multiple uses currently, and this new function makes it possible to optimize: and v1, v0, 0xffffff mul24 v2, v1, v1 ; Multiply ignoring high 8-bits. To: mul24 v2, v0, v0 Where before this would not be optimized, because v1 has multiple uses. Reviewers: bogner, arsenm Subscribers: nhaehnle, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24964 llvm-svn: 284266
* The real fix for post-r284255 failuresKrzysztof Parzyszek2016-10-142-3/+2
| | | | llvm-svn: 284264
* Workaround to eliminate check-llvm failures after r284255Krzysztof Parzyszek2016-10-141-0/+1
| | | | llvm-svn: 284262
* Add a pass to optimize patterns of vectorized interleaved memory accesses forDavid L Kreitzer2016-10-145-0/+134
| | | | | | | | | | | | | X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen Differential revision: http://reviews.llvm.org/D24681 llvm-svn: 284260
* AMDGPU/SI: Don't allow unaligned scratch accessTom Stellard2016-10-144-0/+21
| | | | | | | | | | | | Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257
* [RDF] Switch RegisterRef to be a pair (Register, LaneMask)Krzysztof Parzyszek2016-10-148-255/+236
| | | | | | | | | Use PackedRegisterRef to store the register information in the graph nodes. This commit also removes support for virtual registers. It has never been tested or used. It will be possible to add it back if there is a need. llvm-svn: 284255
* [safestack] Use non-thread-local unsafe stack pointer for Contiki OSDavid L Kreitzer2016-10-143-50/+37
| | | | | | | | Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D19852 llvm-svn: 284254
* Revert "In preparation for removing getNameWithPrefix off ofEric Christopher2016-10-142-9/+8
| | | | | | | | | TargetMachine," as it's causing sanitizer/memory issues until I can track down this set. This reverts commit r284203 llvm-svn: 284252
* [Coverage] Support loading multiple binaries into a CoverageMappingVedant Kumar2016-10-141-16/+40
| | | | | | | | | | Add support for loading multiple coverage readers into a single CoverageMapping instance. This should make it easier to prepare a unified coverage report for multiple binaries. Differential Revision: https://reviews.llvm.org/D25535 llvm-svn: 284251
* Move alignTo computation inside the if.Rafael Espindola2016-10-141-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | This is an improvement when compiling with llvm. llvm doesn't inline the call to insert, so the align is always executed and shows up in the profile. With gcc the call to insert is inlined and the align computation moved and done only if needed. With this patch we explicitly only compute it if it is needed. In the two tests with debug info, the speedup was scylla master 3.008959365 patch 2.932080942 1.02621974786x faster firefox master 6.709823604 patch 6.592387227 1.01781393795x faster In all others the difference was in the noise. llvm-svn: 284249
* [X86] Take advantage of the lzcnt instruction on btver2 architectures when ↵Pierre Gousseau2016-10-146-0/+129
| | | | | | | | | | | | | | | | ORing comparisons to zero. This change adds transformations such as: zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0)))) To: srl(or(ctlz(x), ctlz(y)), log2(bitsize(x)) This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput. Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it. For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar. Differential Revision: https://reviews.llvm.org/D23446 llvm-svn: 284248
* [InstCombine] use m_APInt to allow sub with constant folds for splat vectorsSanjay Patel2016-10-141-18/+19
| | | | llvm-svn: 284247
* [InstCombine] sub X, sext(bool Y) -> add X, zext(bool Y)Sanjay Patel2016-10-141-0/+11
| | | | | | | | | | | | Prefer add/zext because they are better supported in terms of value-tracking. Note that the backend should be prepared for this IR canonicalization (including vector types) after: https://reviews.llvm.org/rL284015 Differential Revision: https://reviews.llvm.org/D25135 llvm-svn: 284241
* Define "contiki" OS specifier.David L Kreitzer2016-10-141-0/+2
| | | | | | | | Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24897 llvm-svn: 284240
* [DAG] add folds for negated shifted sign bitSanjay Patel2016-10-141-0/+13
| | | | | | | | | The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 llvm-svn: 284239
* AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodesNicolai Haehnle2016-10-141-10/+37
| | | | | | | | | | | | | | Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822). Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25289 llvm-svn: 284224
* Fix use-after-freesNicolai Haehnle2016-10-141-2/+2
| | | | | | Extracted from D25313, as suggested by Justin Bogner. llvm-svn: 284220
* [mips] Fix aui/daui/dahi/dati for MIPSR6Simon Dardis2016-10-147-17/+52
| | | | | | | | | | | | For compatiblity with binutils, define these instructions to take two registers with a 16bit unsigned immediate. Both of the registers have to be same for dahi and dati. Reviewers: dsanders, zoran.jovanovic Differential Review: https://reviews.llvm.org/D21473 llvm-svn: 284218
* AMDGPU: Fix use-after-freesNicolai Haehnle2016-10-142-15/+16
| | | | | | | | | | Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25312 llvm-svn: 284215
* [x86][ms-inline-asm] use of "jmp short" in asm is not supportedMichael Zuckerman2016-10-141-0/+14
| | | | | | | | | | | | | | | | | | Committing in the name of Ziv Izhar: After check-all and LGTM . The following patch is for compatability with Microsoft. Microsoft ignores the keyword "short" when used after a jmp, for example: __asm { jmp short label label: } A test for that patch will be added in another patch, since it's located in clang's codegen tests. Link will be added shortly. link to test: https://reviews.llvm.org/D24958 Differential Revision: https://reviews.llvm.org/D24957 llvm-svn: 284211
* [DAGCombiner] Teach createBuildVecShuffle to handle cases where input ↵Craig Topper2016-10-141-5/+9
| | | | | | | | vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code. llvm-svn: 284204
* In preparation for removing getNameWithPrefix off of TargetMachine,Eric Christopher2016-10-142-8/+9
| | | | | | | sink the current behavior into the callers and sink TargetMachine::getNameWithPrefix into TargetMachine::getSymbol. llvm-svn: 284203
OpenPOWER on IntegriCloud