path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [AVX-512] Add shuffle comments for vbroadcast instructions. | Craig Topper | 2016-10-15 | 1 | -0/+43
  llvm-svn: 284305
* [AVX-512] Rename VPBROADCASTI32X2 and VPBROADCASTF32X2 instruction classes to match the mnemonic which does not include a 'P'. | Craig Topper | 2016-10-15 | 1 | -4/+4
  llvm-svn: 284304
* [SimplifyCFG] Use the error checking provided by getPrevNode. | Benjamin Kramer | 2016-10-15 | 1 | -7/+11
  BasicBlock::size is O(insts), making this loop O(blocks*insts), which can be really slow on generated code. getPrevNode already checks if we're at the beginning of the block and returns nullptr if so; just use that instead.
  No functionality change intended.
  llvm-svn: 284303
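The complexity argument above can be illustrated with a minimal Python sketch. It is not the SimplifyCFG code: `Inst`, `Block`, `size`, `get_prev_node`, and `previous_names` are hypothetical stand-ins for a linked instruction list whose size can only be obtained by walking it.

```python
class Inst:
    """Hypothetical instruction node in a doubly linked list."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None

    def get_prev_node(self):
        # Returns None when this is the first instruction of the block.
        return self.prev


class Block:
    """Hypothetical basic block holding a linked list of instructions."""

    def __init__(self, names):
        self.first = None
        self.last = None
        for name in names:
            inst = Inst(name)
            if self.last is None:
                self.first = inst
            else:
                self.last.next = inst
                inst.prev = self.last
            self.last = inst

    def size(self):
        # O(insts): the list has to be walked to count its nodes.
        count = 0
        node = self.first
        while node is not None:
            count += 1
            node = node.next
        return count


def previous_names(inst, limit):
    """Collect up to `limit` predecessors of `inst`, nearest first.

    Only get_prev_node is used, so no call to Block.size is needed
    to detect the beginning of the block.
    """
    names = []
    node = inst.get_prev_node()
    while node is not None and len(names) < limit:
        names.append(node.name)
        node = node.get_prev_node()
    return names
```

Checking the result of `get_prev_node` against `None` costs constant time per step, whereas a guard based on `size()` would walk the whole block on every check.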
* [libFuzzer] swap bytes in integers when handling CMP traces | Kostya Serebryany | 2016-10-15 | 5 | -15/+49
  llvm-svn: 284301
* [libFuzzer] better algorithm for -minimize_crash | Kostya Serebryany | 2016-10-15 | 3 | -5/+25
  llvm-svn: 284299
* AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizer | Tom Stellard | 2016-10-15 | 2 | -0/+49
  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
  Differential Revision: https://reviews.llvm.org/D25526
  llvm-svn: 284298
* [NFC] Loop Versioning for LICM code clean up | Evgeny Astigeevich | 2016-10-14 | 1 | -31/+42
  - Removed unused class members.
  - Made class internal data private.
  - Made class scoped data function scoped where it's possible.
  - Replace naked new/delete with unique_ptr.
  - Made resources guaranteed to be freed.
  Differential Revision: https://reviews.llvm.org/D25464
  llvm-svn: 284290
* GlobalISel: rename legalizer components to match others. | Tim Northover | 2016-10-14 | 13 | -97/+95
  The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine" prefix). This should make everything sensible again.
  The only functional change is the name of a couple of command-line options.
  llvm-svn: 284287
* hardware_physical_concurrency() should return 1 when LLVM is built with LLVM_ENABLE_THREADS=OFF | Mehdi Amini | 2016-10-14 | 1 | -0/+3
  llvm-svn: 284283
* [PPC] Shorter sequence to load 64bit constant with same hi/lo words | Guozhi Wei | 2016-10-14 | 1 | -2/+23
  This is a patch to implement pr30640.
  When a 64bit constant has the same hi/lo words, we can use rldimi to copy the low word into the high word of the same register.
  This optimization caused the test case bperm.ll to fail because of a suboptimal heuristic in function SelectAndParts64. It chooses AND or ROTATE to extract bit groups from a register, and ORs them together. This optimization lowers the cost of loading the 64bit constant mask used in the AND method, and produces a different code sequence. But the ROTATE method is actually better in this test case, because in the ROTATE method the final OR operation can be avoided, since rldimi can insert the rotated bits into the target register directly. So this patch also enhances SelectAndParts64 to prefer the ROTATE method when the two methods have the same cost and there are multiple bit groups that need to be ORed together.
  Differential Revision: https://reviews.llvm.org/D25521
  llvm-svn: 284276
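The hi/lo word condition described above can be modelled with a short Python sketch. It only reproduces the net effect on the register value and does not model the exact rldimi operand encoding; the function names are hypothetical and not taken from the patch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def has_same_hi_lo_words(value):
    """True when the high and low 32-bit words of a 64-bit constant match."""
    value &= MASK64
    return (value >> 32) == (value & MASK32)


def copy_low_word_to_high(reg):
    """Model the net effect of inserting the low word into the high word.

    Whatever was in the high word (for example sign-extension bits left
    by a 32-bit constant load) is overwritten; the low word is preserved.
    """
    low = reg & MASK32
    return (low << 32) | low


def materialize(value):
    """Build the constant from its low word, or return None if hi != lo."""
    if not has_same_hi_lo_words(value):
        return None
    low_only = value & MASK32  # stand-in for the shorter 32-bit load
    return copy_low_word_to_high(low_only)
```

For example, `materialize(0x1234567812345678)` rebuilds the full constant from `0x12345678`, while a constant with different words falls through to `None`, standing in for the general (longer) sequence.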
* [libFuzzer] remove subdir fuzzer-test-suite as it is now superseded with https://github.com/google/fuzzer-test-suite | Kostya Serebryany | 2016-10-14 | 19 | -410/+0
  llvm-svn: 284275
* [libFuzzer] add -trace_cmp=1 (guiding mutations based on the observed CMP instructions). | Kostya Serebryany | 2016-10-14 | 12 | -12/+157
  This is a reincarnation of the previously deleted -use_traces, but using a different approach for collecting traces. Still a toy, but at least it scales well.
  Also fix -merge in trace-pc-guard mode
  llvm-svn: 284273
* [DAG] avoid creating illegal node when transforming negated shifted sign bit | Sanjay Patel | 2016-10-14 | 1 | -2/+3
  Eli noted this potential bug in the post-commit thread for:
  https://reviews.llvm.org/rL284239
  ...but I'm not sure how to trigger it, so there's no test case yet.
  llvm-svn: 284268
* AMDGPU/SI: Use new SimplifyDemandedBits helper for multi-use operations | Tom Stellard | 2016-10-14 | 1 | -13/+10
  Summary: We are using this helper for our 24-bit arithmetic combines, so we are now able to eliminate multi-use operations that mask the high-bits of 24-bit inputs (e.g. and x, 0xffffff).
  Reviewers: arsenm, nhaehnle
  Subscribers: tony-tye, arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl
  Differential Revision: https://reviews.llvm.org/D24672
  llvm-svn: 284267
* TargetLowering: Add SimplifyDemandedBits() helper to TargetLoweringOpt | Tom Stellard | 2016-10-14 | 1 | -2/+55
  Summary: The main purpose of this new helper is to enable simplifying operations that have multiple uses. SimplifyDemandedBits does not handle multiple uses currently, and this new function makes it possible to optimize:
    and v1, v0, 0xffffff
    mul24 v2, v1, v1 ; Multiply ignoring high 8-bits.
  To:
    mul24 v2, v0, v0
  Where before this would not be optimized, because v1 has multiple uses.
  Reviewers: bogner, arsenm
  Subscribers: nhaehnle, wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D24964
  llvm-svn: 284266
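Why the mask is redundant can be checked with a small Python model. This is a sketch under an assumption: `mul24` is modelled here as an unsigned 24-bit multiply returning the low 32 bits, which is not taken from the commit; the helper names are hypothetical.

```python
MASK24 = 0xFFFFFF
MASK32 = 0xFFFFFFFF


def mul24(a, b):
    # Assumed model: only the low 24 bits of each operand are demanded.
    return ((a & MASK24) * (b & MASK24)) & MASK32


def before(v0):
    """Original sequence: explicit mask, then mul24 on the masked value."""
    v1 = v0 & MASK24
    return mul24(v1, v1)


def after(v0):
    """Simplified sequence: mul24 directly on the unmasked input."""
    return mul24(v0, v0)
```

Because `mul24` never reads the high 8 bits, `before(v)` and `after(v)` agree for every 32-bit `v`, which is the property a demanded-bits analysis relies on even when the masked value has more than one use.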
* The real fix for post-r284255 failures | Krzysztof Parzyszek | 2016-10-14 | 2 | -3/+2
  llvm-svn: 284264
* Workaround to eliminate check-llvm failures after r284255 | Krzysztof Parzyszek | 2016-10-14 | 1 | -0/+1
  llvm-svn: 284262
* Add a pass to optimize patterns of vectorized interleaved memory accesses for | David L Kreitzer | 2016-10-14 | 5 | -0/+134
  X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns.
  Patch by Farhana Aleen
  Differential revision: http://reviews.llvm.org/D24681
  llvm-svn: 284260
* AMDGPU/SI: Don't allow unaligned scratch access | Tom Stellard | 2016-10-14 | 4 | -0/+21
  Summary: The hardware doesn't support this.
  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
  Differential Revision: https://reviews.llvm.org/D25523
  llvm-svn: 284257
* [RDF] Switch RegisterRef to be a pair (Register, LaneMask) | Krzysztof Parzyszek | 2016-10-14 | 8 | -255/+236
  Use PackedRegisterRef to store the register information in the graph nodes.
  This commit also removes support for virtual registers. It has never been tested or used. It will be possible to add it back if there is a need.
  llvm-svn: 284255
* [safestack] Use non-thread-local unsafe stack pointer for Contiki OS | David L Kreitzer | 2016-10-14 | 3 | -50/+37
  Patch by Michael LeMay
  Differential revision: http://reviews.llvm.org/D19852
  llvm-svn: 284254
* Revert "In preparation for removing getNameWithPrefix off of | Eric Christopher | 2016-10-14 | 2 | -9/+8
  TargetMachine," as it's causing sanitizer/memory issues until I can track down this set.
  This reverts commit r284203
  llvm-svn: 284252
* [Coverage] Support loading multiple binaries into a CoverageMapping | Vedant Kumar | 2016-10-14 | 1 | -16/+40
  Add support for loading multiple coverage readers into a single CoverageMapping instance. This should make it easier to prepare a unified coverage report for multiple binaries.
  Differential Revision: https://reviews.llvm.org/D25535
  llvm-svn: 284251
* Move alignTo computation inside the if. | Rafael Espindola | 2016-10-14 | 1 | -3/+5
  This is an improvement when compiling with llvm. llvm doesn't inline the call to insert, so the align is always executed and shows up in the profile.
  With gcc the call to insert is inlined and the align computation moved and done only if needed.
  With this patch we explicitly only compute it if it is needed.
  In the two tests with debug info, the speedup was
    scylla
      master 3.008959365
      patch  2.932080942 1.02621974786x faster
    firefox
      master 6.709823604
      patch  6.592387227 1.01781393795x faster
  In all others the difference was in the noise.
  llvm-svn: 284249
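The idea of computing the alignment only when an insertion actually happens can be sketched in Python. The commit does not name the affected class, so `TableBuilder`, its fields, and the `align_calls` counter are hypothetical; only the round-up arithmetic in `align_to` mirrors what an alignTo-style helper computes.

```python
def align_to(value, align):
    """Round `value` up to the next multiple of `align`."""
    return (value + align - 1) // align * align


class TableBuilder:
    """Hypothetical builder that assigns aligned offsets to unique strings."""

    def __init__(self, alignment):
        self.alignment = alignment
        self.size = 0
        self.offsets = {}
        self.align_calls = 0  # instrumentation for this sketch only

    def add(self, text):
        if text not in self.offsets:
            # The alignment is computed only for entries that are new.
            self.align_calls += 1
            start = align_to(self.size, self.alignment)
            self.offsets[text] = start
            self.size = start + len(text) + 1  # +1 for a terminator byte
        return self.offsets[text]
```

Adding the same string repeatedly leaves `align_calls` unchanged after the first insertion, which is the saving the commit message attributes to moving the computation inside the `if`.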
* [X86] Take advantage of the lzcnt instruction on btver2 architectures when ORing comparisons to zero. | Pierre Gousseau | 2016-10-14 | 6 | -0/+129
  This change adds transformations such as:
    zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0))))
  To:
    srl(or(ctlz(x), ctlz(y)), log2(bitsize(x)))
  This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput. Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it. For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar.
  Differential Revision: https://reviews.llvm.org/D23446
  llvm-svn: 284248
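The equivalence behind this transformation can be checked with a minimal Python model for 32-bit values. It assumes lzcnt semantics, where the count for a zero input equals the bit width; the helper names are hypothetical.

```python
BITS = 32


def ctlz(x):
    """Count leading zeros of a 32-bit value; returns 32 for x == 0."""
    return BITS - (x & 0xFFFFFFFF).bit_length()


def either_is_zero_reference(x, y):
    """zext(or(x == 0, y == 0)) written directly."""
    return int(x == 0 or y == 0)


def either_is_zero_lzcnt(x, y):
    """srl(or(ctlz(x), ctlz(y)), log2(bitsize)) for 32-bit inputs."""
    shift = BITS.bit_length() - 1  # log2(32) == 5
    return (ctlz(x) | ctlz(y)) >> shift
```

The trick works because `ctlz` returns 32 only for a zero input, so bit 5 of the count is set exactly when the operand is zero, and the OR of two counts below 32 stays below 32.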
* [InstCombine] use m_APInt to allow sub with constant folds for splat vectors | Sanjay Patel | 2016-10-14 | 1 | -18/+19
  llvm-svn: 284247
* [InstCombine] sub X, sext(bool Y) -> add X, zext(bool Y) | Sanjay Patel | 2016-10-14 | 1 | -0/+11
  Prefer add/zext because they are better supported in terms of value-tracking.
  Note that the backend should be prepared for this IR canonicalization (including vector types) after:
  https://reviews.llvm.org/rL284015
  Differential Revision: https://reviews.llvm.org/D25135
  llvm-svn: 284241
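The identity behind this canonicalization can be verified with a short Python model of 32-bit wraparound arithmetic. The function names are hypothetical; the sketch only checks that the two forms compute the same value.

```python
MASK32 = 0xFFFFFFFF


def sext_bool(y):
    """Sign-extend a 1-bit value: true becomes all ones (-1), false 0."""
    return MASK32 if y else 0


def zext_bool(y):
    """Zero-extend a 1-bit value: true becomes 1, false 0."""
    return 1 if y else 0


def sub_sext(x, y):
    """sub X, sext(bool Y) with 32-bit wraparound."""
    return (x - sext_bool(y)) & MASK32


def add_zext(x, y):
    """add X, zext(bool Y) with 32-bit wraparound."""
    return (x + zext_bool(y)) & MASK32
```

Subtracting -1 is the same as adding 1, so both forms agree for either value of `Y`, including at the wraparound boundary `X = 0xFFFFFFFF`.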
* Define "contiki" OS specifier. | David L Kreitzer | 2016-10-14 | 1 | -0/+2
  Patch by Michael LeMay
  Differential revision: http://reviews.llvm.org/D24897
  llvm-svn: 284240
* [DAG] add folds for negated shifted sign bit | Sanjay Patel | 2016-10-14 | 1 | -0/+13
  The same folds exist in InstCombine already.
  This came up as part of:
  https://reviews.llvm.org/D25485
  llvm-svn: 284239
* AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes | Nicolai Haehnle | 2016-10-14 | 1 | -10/+37
  Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822).
  Reviewers: arsenm, tstellarAMD
  Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
  Differential Revision: https://reviews.llvm.org/D25289
  llvm-svn: 284224
* Fix use-after-frees | Nicolai Haehnle | 2016-10-14 | 1 | -2/+2
  Extracted from D25313, as suggested by Justin Bogner.
  llvm-svn: 284220
* [mips] Fix aui/daui/dahi/dati for MIPSR6 | Simon Dardis | 2016-10-14 | 7 | -17/+52
  For compatibility with binutils, define these instructions to take two registers with a 16bit unsigned immediate. Both of the registers have to be the same for dahi and dati.
  Reviewers: dsanders, zoran.jovanovic
  Differential Review: https://reviews.llvm.org/D21473
  llvm-svn: 284218
* AMDGPU: Fix use-after-frees | Nicolai Haehnle | 2016-10-14 | 2 | -15/+16
  Reviewers: arsenm, tstellarAMD
  Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D25312
  llvm-svn: 284215
* [x86][ms-inline-asm] use of "jmp short" in asm is not supported | Michael Zuckerman | 2016-10-14 | 1 | -0/+14
  Committing in the name of Ziv Izhar: After check-all and LGTM.
  The following patch is for compatibility with Microsoft. Microsoft ignores the keyword "short" when used after a jmp, for example:
    __asm {
      jmp short label
      label:
    }
  A test for that patch will be added in another patch, since it's located in clang's codegen tests. Link will be added shortly.
  link to test: https://reviews.llvm.org/D24958
  Differential Revision: https://reviews.llvm.org/D24957
  llvm-svn: 284211
* [DAGCombiner] Teach createBuildVecShuffle to handle cases where input vectors are less than half of the output vector size. | Craig Topper | 2016-10-14 | 1 | -5/+9
  This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64, which requires a sign/zero_extend_vector_inreg to be created, which in turn requires v8i8 to be concatenated up to v64i8 and goes through this code.
  llvm-svn: 284204
* In preparation for removing getNameWithPrefix off of TargetMachine, | Eric Christopher | 2016-10-14 | 2 | -8/+9
  sink the current behavior into the callers and sink TargetMachine::getNameWithPrefix into TargetMachine::getSymbol.
  llvm-svn: 284203
* Tidy the calls to getCurrentSection().first -> getCurrentSectionOnly to help | Eric Christopher | 2016-10-14 | 11 | -30/+29
  readability a bit.
  llvm-svn: 284202
* [AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables | Konstantin Zhuravlyov | 2016-10-14 | 6 | -21/+79
  Differential Revision: https://reviews.llvm.org/D25562
  llvm-svn: 284196
* [AMDGPU] Add 32-bit lo/hi got and pc relative variant kinds and emit appropriate relocations | Konstantin Zhuravlyov | 2016-10-14 | 2 | -0/+16
  Differential Revision: https://reviews.llvm.org/D25548
  llvm-svn: 284195
* Timer: Fix doxygen comments, use member initializer; NFC | Matthias Braun | 2016-10-14 | 1 | -16/+12
  llvm-svn: 284181
* Add interface for querying physical hardware concurrency | Teresa Johnson | 2016-10-14 | 1 | -0/+8
  Summary: This will be used by ThinLTO to set the amount of backend parallelism, which performs better when restricted to the number of physical cores (on X86 at least, where getHostNumPhysicalCores is currently defined). If not available this falls back to thread::hardware_concurrency.
  Note I didn't add to the thread class since that is a typedef to std::thread where available.
  Reviewers: mehdi_amini
  Subscribers: beanz, llvm-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D25585
  llvm-svn: 284180
* CodeGen: use MSVC division on windows itanium | Saleem Abdulrasool | 2016-10-13 | 1 | -1/+2
  Windows itanium is identical to MSVC when dealing with everything but C++. Lower the math routines into msvcrt rather than compiler-rt.
  llvm-svn: 284175
* CodeGen: adjust floating point operations in Windows itanium | Saleem Abdulrasool | 2016-10-13 | 1 | -1/+2
  Windows itanium is equivalent to MSVC except in C++ mode. Ensure that we promote the 32-bit floating point operations to their 64-bit equivalents.
  llvm-svn: 284173
* [DAG] hoist DL(N) and fix formatting; NFC | Sanjay Patel | 2016-10-13 | 1 | -24/+31
  llvm-svn: 284170
* [libFuzzer] more detailed message for disabled leak detection | Kostya Serebryany | 2016-10-13 | 1 | -2/+4
  llvm-svn: 284169
* LegalizeDAG: Implement PROMOTE for ISD::BITREVERSE | Tom Stellard | 2016-10-13 | 1 | -1/+2
  Summary: This operation is promoted the same way as ISD::BSWAP. This will prevent a regression in test/Target/AMDGPU/bitreverse.ll when i16 support is implemented.
  Reviewers: bogner, hfinkel
  Subscribers: hfinkel, wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D25202
  llvm-svn: 284163
* [safestack] Reapply r283248 after moving X86-targeted SafeStack tests into | David L Kreitzer | 2016-10-13 | 1 | -7/+6
  the X86 subdirectory. Original commit message:
  Requires a valid TargetMachine to be passed to the SafeStack pass.
  Patch by Michael LeMay
  Differential revision: http://reviews.llvm.org/D24896
  llvm-svn: 284161
* New llc option pie-copy-relocations to optimize access to extern globals. | Sriraman Tallam | 2016-10-13 | 2 | -5/+5
  This option indicates copy relocations support is available from the linker when building as PIE and allows accesses to extern globals to avoid the GOT.
  Differential Revision: https://reviews.llvm.org/D24849
  llvm-svn: 284160
* Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." | Nirav Dave | 2016-10-13 | 3 | -121/+282
  This reverts commit r284151, which appears to be triggering LTO failures on Hexagon
  llvm-svn: 284157
* [RAGreedy] Empty live-ranges always succeed in last chance recoloring. | Quentin Colombet | 2016-10-13 | 1 | -1/+12
  Relax the constraint for empty live-ranges while doing last chance recoloring. Indeed, those live-ranges do not need an actual color to be found for the recoloring to work.
  Empty live-ranges may happen as a result of splitting/spilling.
  Unfortunately no test case for in-tree targets.
  llvm-svn: 284152