path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
* Factor out UnrollAnalyzer to Analysis, and add unit tests for it. (Michael Zolotukhin, 2016-02-08, 3 files, -239/+193)
| | | | | | | | | | | | | | | | Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows the use of unit tests. The change itself is supposed to be NFC, except for adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169
* [X86][SSE1] Add MOVLHPS/MOVHLPS lowering and memory folding support (Simon Pilgrim, 2016-02-08, 2 files, -0/+29)
| | | | | | | | | | As discussed on PR26491, this patch adds support for lowering v4f32 shuffles to the MOVLHPS/MOVHLPS instructions. It also adds support for memory folding with their MOVLPS/MOVHPS load equivalents. This first patch only really helps SSE1 targets as SSE2+ targets will widen the shuffle mask and use v2f64 equivalents (although they still combine to MOVLHPS/MOVHLPS for v2f64 splats). This will have to be addressed in a future patch, most likely when we add support for binary target shuffle combines. Differential Revision: http://reviews.llvm.org/D16956 llvm-svn: 260168
* [regalloc][WinEH] Do not mark intervals as not spillable if they contain a regmask (Andrew Kaylor, 2016-02-08, 2 files, -2/+39)
| | | | | | | | Differential Revision: http://reviews.llvm.org/D16831 llvm-svn: 260164
* llvm-cov: Fix reading gcov data that does not have function names (Justin Bogner, 2016-02-08, 1 file, -9/+13)
| | | | | | | | | | | | | | | | | | | | | In order for recent gcov versions to read the coverage data, you have to use UseCfgChecksum=true and FunctionNamesInData=false options for coverage profiling pass. This is because gcov is expecting the function section in .gcda to be exactly 3 words in size, containing ident and two checksums. While llvm-cov is compatible with UseCfgChecksum=true, it always expects a function name in .gcda function sections (it's not compatible with FunctionNamesInData=false). Thus it's currently impossible to generate one set of coverage files that works with both gcov and llvm-cov. This change fixes the reading of coverage information to only read the function name if it's present. Patch by Arseny Kapoulkine. Thanks! llvm-svn: 260162
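The reading logic described above can be sketched as follows. This is a minimal illustration, not llvm-cov's actual code: the `FunctionRecord` struct, the `readFunctionRecord` helper, and the way the optional name is packed (a word count followed by NUL-padded characters) are all assumptions made for the example. The point is only that a record of exactly three words is accepted and the name is read only when more data follows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of one function record in a .gcda file.
struct FunctionRecord {
  uint32_t Ident = 0;
  uint32_t LineChecksum = 0;
  uint32_t CfgChecksum = 0;
  bool HasName = false; // false for the 3-word form that gcov expects
  std::string Name;
};

// Parses the payload words of one record. Assumption for this sketch: an
// optional name follows the three fixed words as a word count plus that many
// words of little-endian packed, NUL-padded characters.
bool readFunctionRecord(const std::vector<uint32_t> &Words,
                        FunctionRecord &Out) {
  if (Words.size() < 3)
    return false; // truncated record
  Out = FunctionRecord();
  Out.Ident = Words[0];
  Out.LineChecksum = Words[1];
  Out.CfgChecksum = Words[2];
  if (Words.size() == 3)
    return true; // no name present: do not try to read one
  const std::size_t NameWords = Words[3];
  if (Words.size() < 4 + NameWords)
    return false; // declared name length runs past the record
  Out.HasName = true;
  for (std::size_t I = 0; I < NameWords; ++I)
    for (unsigned Shift = 0; Shift < 32; Shift += 8) {
      const char C = static_cast<char>((Words[4 + I] >> Shift) & 0xff);
      if (C != '\0')
        Out.Name.push_back(C);
    }
  return true;
}
```

With this shape, one reader handles both the name-free records that gcov expects and the named records that llvm-cov previously required.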
* [WebAssembly] Update the br_if instructions' operand orders to match the spec. (Dan Gohman, 2016-02-08, 3 files, -18/+18)
| | | | llvm-svn: 260152
* rangify; NFC (Sanjay Patel, 2016-02-08, 1 file, -7/+5)
| | | | llvm-svn: 260151
* [PGO] Differentiate Clang instrumentation and IR level instrumentation profiles (Rong Xu, 2016-02-08, 3 files, -6/+52)
| | | | | | | | | | | | This patch uses one bit in the profile version to differentiate Clang instrumentation and IR level instrumentation profiles. PGOInstrumentation generates a COMDAT variable __llvm_profile_raw_version so that the compiler runtime can set the right profile kind. PGOInstrumentation now checks this bit to make sure it's an IR level instrumentation profile. Differential Revision: http://reviews.llvm.org/D15540 llvm-svn: 260146
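A minimal sketch of the "one bit in the version word" idea, for illustration only. The bit position and the helper names below are assumptions, not taken from the patch; the commit only states that a single bit of the profile version distinguishes the two profile kinds.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the position of the reserved bit is illustrative only.
constexpr uint64_t IRLevelProfileBit = 1ULL << 56;

// Tags a plain version number as belonging to an IR level profile.
constexpr uint64_t markIRLevel(uint64_t Version) {
  return Version | IRLevelProfileBit;
}

// Checks the tag bit of a raw version word.
constexpr bool isIRLevelProfile(uint64_t RawVersion) {
  return (RawVersion & IRLevelProfileBit) != 0;
}

// Recovers the plain version number by masking the tag bit off.
constexpr uint64_t versionNumber(uint64_t RawVersion) {
  return RawVersion & ~IRLevelProfileBit;
}
```

Keeping the flag inside the version word means readers that already parse the version need no new field; they only have to mask the bit off before comparing version numbers.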
* [x86] convert masked store of one element to scalar store (Sanjay Patel, 2016-02-08, 1 file, -2/+75)
| | | | | | | | | | | Another opportunity to reduce masked stores: in D16691, we decided not to attempt the 'one mask element is set' transform in InstCombine, but this should be a win for any AVX machine. Code comments note that this transform could be extended for other targets / cases. Differential Revision: http://reviews.llvm.org/D16828 llvm-svn: 260145
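The transform can be pictured with a scalar model of a 4-lane masked store. This is a sketch under stated assumptions: the lane count, the function names, and the runtime mask inspection are all hypothetical (the real transform works on DAG nodes in the backend); it only shows that a mask with exactly one set element is equivalent to one ordinary scalar store.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t NumLanes = 4; // illustrative vector width

// Reference semantics: write only the lanes whose mask element is set.
void maskedStore(float *Dst, const float *Val, const bool *Mask) {
  for (std::size_t I = 0; I < NumLanes; ++I)
    if (Mask[I])
      Dst[I] = Val[I];
}

// Returns the index of the only set lane, or -1 if zero or several are set.
int onlySetLane(const bool *Mask) {
  int Lane = -1;
  for (std::size_t I = 0; I < NumLanes; ++I) {
    if (!Mask[I])
      continue;
    if (Lane != -1)
      return -1; // a second set lane: not the single-element case
    Lane = static_cast<int>(I);
  }
  return Lane;
}

// Same result as maskedStore, but a single-lane mask becomes one scalar
// store. Returns true when the scalar path was taken.
bool storeWithScalarPath(float *Dst, const float *Val, const bool *Mask) {
  const int Lane = onlySetLane(Mask);
  if (Lane < 0) {
    maskedStore(Dst, Val, Mask);
    return false;
  }
  Dst[Lane] = Val[Lane];
  return true;
}
```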
* AMDGPU/SI: Implement a work-around for smrd corrupting vccz bit (Tom Stellard, 2016-02-08, 1 file, -1/+55)
| | | | | | | | | | | | | | Summary: We will hit this once we have enabled uniform branches. The smrd-vccz-bug.ll test will be added with the uniform branch commit. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16725 llvm-svn: 260137
* [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532) (Hans Wennborg, 2016-02-08, 3 files, -11/+10)
| | | | | | | | | | | | | | | | | | | | This matches GCC and MSVC's behaviour, and saves on code size. We were already not extending i1 return values on x86_64 after r127766. This takes that patch further by applying it to the x86 target as well, and also for i8 and i16. The ABI docs have been unclear about the required behaviour here. The new i386 psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return values do not need to be extended beyond 8 bits. The x86_64 ABI doc is being updated to say the same [2]. Differential Revision: http://reviews.llvm.org/D16907 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ llvm-svn: 260133
* AArch64: match correct order in subtraction pattern. (Tim Northover, 2016-02-08, 1 file, -4/+4)
| | | | | | | The accumulator in multiply-and-subtract instructions is actually subtracted *from* so these patterns were computing the wrong value. llvm-svn: 260131
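The operand-order issue is easy to see in scalar form. The two functions below are hypothetical helpers written for this note, not the patterns from the patch: the first models what a multiply-and-subtract computes (the product is subtracted from the accumulator), the second the swapped form that a mismatched pattern would compute.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-and-subtract: the product is subtracted FROM the accumulator.
int32_t multiplySubtract(int32_t Acc, int32_t A, int32_t B) {
  return Acc - A * B;
}

// The swapped order, which yields the negated result.
int32_t swappedOrder(int32_t Acc, int32_t A, int32_t B) {
  return A * B - Acc;
}
```

For `Acc = 10`, `A = 2`, `B = 3` the first form gives 4 and the second gives -4, so a pattern that matches `A * B - Acc` must not select the instruction.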
* fix typos; NFC (Sanjay Patel, 2016-02-08, 1 file, -17/+17)
| | | | llvm-svn: 260130
* AMDGPU: Remove bfi and bfm intrinsics (Matt Arsenault, 2016-02-08, 2 files, -13/+0)
| | | | | | Nothing is using them. llvm-svn: 260123
* [ThinLTO] Remove imported available externally defs from comdats. (Teresa Johnson, 2016-02-08, 1 file, -2/+14)
| | | | | | | | | | | | | | | Summary: Available externally definitions are considered declarations for the linker and eventually dropped. As such they are not allowed to be in comdats. Remove any such imported functions from comdats. Reviewers: rafael Subscribers: davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16120 llvm-svn: 260122
* [PGO] Enable compression in pgo instrumentation (Xinliang David Li, 2016-02-08, 3 files, -18/+68)
| | | | | | | | | | This reduces the sizes of instrumented object files, final binaries, process images, and raw profile data. The format of the indexed profile data remains the same. Differential Revision: http://reviews.llvm.org/D16388 llvm-svn: 260117
* [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory sanitizer issue. (Silviu Baranga, 2016-02-08, 5 files, -30/+300)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PredicatedScalarEvolution's copy constructor wasn't copying the Generation value, and was leaving it un-initialized. Original commit message: [SCEV][LAA] Add no wrap SCEV predicates and use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260112
* [JumpThreading] Change a return of ComputeValueKnownInPredecessors() (Haicheng Wu, 2016-02-08, 1 file, -1/+1)
| | | | | | | Change a return statement of ComputeValueKnownInPredecessors() to be consistent with the other return statements of the function. Otherwise, it might return true with an empty Result when the current basic block has no predecessors and trigger the first assert of JumpThreading::ProcessThreadableEdges(). llvm-svn: 260110
* SelectionDAG: Lower some range metadata to AssertZext (Matt Arsenault, 2016-02-08, 2 files, -3/+45)
| | | | | | | | | | If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies. llvm-svn: 260109
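One way to read the rule above: if a value is known to lie in a range that starts at 0, only a bounded number of low bits can be non-zero, and that bit count is what a zero-extension assertion records. The sketch below is an interpretation, not the SelectionDAG code; `zextAssertBits`, `activeBits`, and the half-open `[Lo, Hi)` convention are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent V (0 for V == 0).
unsigned activeBits(uint64_t V) {
  unsigned Bits = 0;
  for (; V != 0; V >>= 1)
    ++Bits;
  return Bits;
}

// For a value known to lie in [Lo, Hi), returns how many low bits can be
// non-zero. Returns 0 when the range does not start at 0, in which case this
// sketch adds no assertion at all.
unsigned zextAssertBits(uint64_t Lo, uint64_t Hi) {
  assert(Lo < Hi && "empty or wrapped ranges are out of scope here");
  if (Lo != 0)
    return 0;
  return activeBits(Hi - 1);
}
```

For a work-item id with range `[0, 1024)` this gives 10 bits, which is comfortably below 24 and therefore compatible with the fast 24-bit multiplies the commit mentions.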
* [AVX512][PROLQ][PROLD] Change imm8 to int (Michael Zuckerman, 2016-02-08, 1 file, -6/+6)
| | | | | | Differential Revision: http://reviews.llvm.org/D16983 llvm-svn: 260101
* [SLP] Fix placement of debug statement (NFC) (Igor Breger, 2016-02-08, 1 file, -7/+7)
| | | | | | | | By Ayal Zaks (ayal.zaks@intel.com) Differential Revision: http://reviews.llvm.org/D16976 llvm-svn: 260094
* Revert r260086 and r260085. They have broken the memory sanitizer bots. (Silviu Baranga, 2016-02-08, 5 files, -299/+30)
| | | | | | llvm-svn: 260087
* [LoopVersioning] Don't assert when there are no memchecks (Silviu Baranga, 2016-02-08, 1 file, -1/+0)
| | | | | | | | | | | | We shouldn't assert when there are no memchecks, since we can have SCEV checks. There is already an assert covering the case where there are no SCEV checks or memchecks. This also changes the LAA pointer wrapping versioning test to use the loop versioning pass (this was how I managed to trigger the assert in the loop versioning pass). llvm-svn: 260086
* [SCEV][LAA] Add no wrap SCEV predicates and use them to improve strided pointer detection (Silviu Baranga, 2016-02-08, 4 files, -29/+299)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260085
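Why the increment is sign extended even in the zext case can be checked with a small worked example. The helpers below are hypothetical and model an 8-bit recurrence {X,+,Y} widened to 32 bits; they are not SCEV code. With X = 200 and Y = 0xFF (that is, -1), the narrow recurrence counts down without unsigned wrap, and only the sign-extended step reproduces it in the wide type.

```cpp
#include <cassert>
#include <cstdint>

// Value of the 8-bit recurrence {X,+,Y} at iteration I (wraps modulo 256).
uint8_t narrowAddRec(uint8_t X, uint8_t Y, unsigned I) {
  return static_cast<uint8_t>(X + I * Y);
}

// {zext(X),+,sext(Y)} evaluated in 32 bits: the rewrite from the commit.
int32_t wideAddRecSextStep(uint8_t X, uint8_t Y, unsigned I) {
  return static_cast<int32_t>(X) +
         static_cast<int32_t>(I) * static_cast<int8_t>(Y);
}

// {zext(X),+,zext(Y)} evaluated in 32 bits: the naive alternative.
int32_t wideAddRecZextStep(uint8_t X, uint8_t Y, unsigned I) {
  return static_cast<int32_t>(X) +
         static_cast<int32_t>(I) * static_cast<int32_t>(Y);
}
```

At iteration 3 the narrow value is 197. The sign-extended step gives 200 - 3 = 197, while the zero-extended step gives 200 + 3 * 255 = 965, which no longer matches. The equality with the sign-extended step holds only while the narrow recurrence does not wrap, which is the condition the new predicates are meant to check.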
* [asan] Introduce new hidden -asan-use-private-alias option. (Maxim Ostapenko, 2016-02-08, 1 file, -6/+44)
| | | | | | | | | | | | | | | | As discussed in https://github.com/google/sanitizers/issues/398, with the current implementation of poisoning globals we can have some CHECK failures or false positives when mixing instrumented and non-instrumented code, because ASan poisons innocent globals from a non-sanitized binary/library. We can use private aliases to avoid such errors. In addition, to preserve ODR violation detection, we introduce a new __odr_asan_gen_XXX symbol for each instrumented global that indicates whether this global was already registered. To detect an ODR violation at runtime, we only need to check the value of the indicator and report an error if it isn't equal to zero. Differential Revision: http://reviews.llvm.org/D15642 llvm-svn: 260075
* [WebAssembly] Add another optimization idea to README.txt. (Dan Gohman, 2016-02-08, 1 file, -0/+5)
| | | | llvm-svn: 260070
* [X86] Change FeatureIFMA string to 'avx512ifma'. Matches gcc and fixes PR26461. (Craig Topper, 2016-02-08, 1 file, -1/+1)
| | | | llvm-svn: 260069
* [Support] Use hexdigit. NFC (Craig Topper, 2016-02-08, 1 file, -3/+2)
| | | | llvm-svn: 260068
* [X86][SSE] Resolve target shuffle inputs to sentinels to permit more combines (Simon Pilgrim, 2016-02-07, 1 file, -39/+107)
| | | | | | | | | | The combineX86ShufflesRecursively only supports unary shuffles, but was missing the opportunity to combine binary shuffles with a zero / undef second input. This patch resolves target shuffle inputs, converting the shuffle mask elements to SM_SentinelUndef/SM_SentinelZero where possible. It then resolves the updated mask to check if we have created a faux unary shuffle. Additionally, we now attempt to recursively call combineX86ShufflesRecursively for all input operands (we used to just recurse for unary integer shuffles and unary unpacks) - it safely returns early if it's not a target shuffle. Differential Revision: http://reviews.llvm.org/D16683 llvm-svn: 260063
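The sentinel resolution described above can be sketched on a plain integer mask. This is an illustration under assumptions, not the X86 backend code: the sentinel names come from the commit text, but their numeric values, the `InputKind` enum, and both helper functions are made up for the example. Mask elements in [0, N) select from the first input and elements in [N, 2N) select from the second.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel names come from the commit text; the numeric values are assumed.
enum { SM_SentinelUndef = -1, SM_SentinelZero = -2 };

// Hypothetical classification of a shuffle input.
enum class InputKind { Normal, Undef, Zero };

// Rewrites every mask element that reads from an all-undef or all-zero input
// into the matching sentinel. N is the number of elements per input.
void resolveInputsToSentinels(std::vector<int> &Mask, InputKind In0,
                              InputKind In1) {
  const int N = static_cast<int>(Mask.size());
  for (int &M : Mask) {
    if (M < 0)
      continue; // already a sentinel
    const InputKind Kind = (M < N) ? In0 : In1;
    if (Kind == InputKind::Undef)
      M = SM_SentinelUndef;
    else if (Kind == InputKind::Zero)
      M = SM_SentinelZero;
  }
}

// True when no remaining element reads from the second input, i.e. the
// binary shuffle can now be treated as a unary one.
bool isFauxUnary(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  for (int M : Mask)
    if (M >= N)
      return false;
  return true;
}
```

For the mask {0, 5, 2, 7} with an all-zero second input, resolution yields {0, SM_SentinelZero, 2, SM_SentinelZero}, which only references the first input and so qualifies for the unary combines.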
* Revert 259942, r259943, r259948. (Nico Weber, 2016-02-07, 1 file, -4/+0)
| | | | | | | | | | | | | | | | The Windows bots have been failing for the last two days, with: FAILED: C:\PROGRA~2\MICROS~1.0\VC\bin\amd64\cl.exe -c LLVMContextImpl.cpp D:\buildslave\clang-x64-ninja-win7\llvm\lib\IR\LLVMContextImpl.cpp(137) : error C2248: 'llvm::TrailingObjects<llvm::AttributeSetImpl, llvm::IndexAttrPair>::operator delete' : cannot access private member declared in class 'llvm::AttributeSetImpl' TrailingObjects.h(298) : see declaration of 'llvm::TrailingObjects<llvm::AttributeSetImpl, llvm::IndexAttrPair>::operator delete' AttributeImpl.h(213) : see declaration of 'llvm::AttributeSetImpl' llvm-svn: 260053
* [X86][SSE] Added support for MOVHPD/MOVLPD + MOVHPS/MOVLPS shuffle decoding. (Simon Pilgrim, 2016-02-07, 3 files, -0/+48)
| | | | llvm-svn: 260034
* [X86][AVX512] add intrinsics of Scalar FP to integer conversion with rounding mode (Asaf Badouh, 2016-02-07, 6 files, -36/+74)
| | | | | | Differential Revision: http://reviews.llvm.org/D16629 llvm-svn: 260033
* [X86][SSE] Pulled out repeated target shuffle decodes into helper functions. NFCI. (Simon Pilgrim, 2016-02-07, 1 file, -136/+89)
| | | | | | | | Pulled out the code used by PSHUFB/VPERMV/VPERMV3 shuffle mask decoding into common helper functions. The helper functions handle masks coming from BROADCAST/BUILD_VECTOR and ConstantPool nodes respectively. llvm-svn: 260032
* AVX512: VPBROADCASTB/W/D/Q from GPR intrinsics implementation. (Igor Breger, 2016-02-07, 3 files, -70/+89)
| | | | | | Differential Revision: http://reviews.llvm.org/D16813 llvm-svn: 260024
* Don't use module context here. It's unnecessary and makes it harder to write unittests (Daniel Berlin, 2016-02-07, 1 file, -2/+2)
| | | | | | llvm-svn: 260015
* Compute live-in for MemorySSA (Daniel Berlin, 2016-02-07, 1 file, -1/+41)
| | | | llvm-svn: 260014
* Only insert into definingblocks once per block (Daniel Berlin, 2016-02-07, 1 file, -1/+4)
| | | | llvm-svn: 260013
* [X86][AVX512] Added support for VPMOVZX shuffle decoding. (Simon Pilgrim, 2016-02-06, 1 file, -75/+35)
| | | | llvm-svn: 260007
* [X86][SSE] Moved shuffle decode CASE macros earlier. NFC. (Simon Pilgrim, 2016-02-06, 1 file, -48/+48)
| | | | | | To allow the helper functions to make use of them. llvm-svn: 259997
* [X86][SSE] Refactored PMOVZX shuffle decoding to use scalar input types (Simon Pilgrim, 2016-02-06, 3 files, -75/+47)
| | | | | | | | First step towards being able to decode AVX512 PMOVZX instructions without a massive bloat in the shuffle decode switch statement. This should also make it easier to decode X86ISD::VZEXT target shuffles in the future. llvm-svn: 259995
* [ThinLTO] Include linkage type in function summary (Teresa Johnson, 2016-02-06, 3 files, -11/+20)
| | | | | | | | | | | | | | Summary: Adds the linkage type to both the per-module and combined function summaries, which subsumes the current islocal bit. This will eventually be used to optimize linkage types based on global summary-based analysis. Reviewers: joker.eph Subscribers: joker.eph, davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D16943 llvm-svn: 259993
* [X86][SSE] Don't replace an existing 32-bit load with its duplicate (Simon Pilgrim, 2016-02-06, 1 file, -1/+2)
| | | | | | | | If we are already loading a single 32-bit float/integer then just reuse it. Fix for regression in D16729 llvm-svn: 259991
* Comment fix (Simon Pilgrim, 2016-02-06, 1 file, -1/+1)
| | | | llvm-svn: 259990
* New Loop Versioning LICM Pass (Ashutosh Nema, 2016-02-06, 4 files, -0/+636)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 llvm-svn: 259986
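The versioning scheme described above can be sketched by hand at the source level. This is an illustration of the idea only, not output of the pass: `mayOverlap` and `scaleVersioned` are hypothetical names, and the example uses a single invariant load (`*Scale`) that can be hoisted only if it does not alias the stores to `Dst`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Runtime bound check: do [A, A+NA) and [B, B+NB) share any element?
bool mayOverlap(const int *A, std::size_t NA, const int *B, std::size_t NB) {
  const std::uintptr_t ABegin = reinterpret_cast<std::uintptr_t>(A);
  const std::uintptr_t AEnd = reinterpret_cast<std::uintptr_t>(A + NA);
  const std::uintptr_t BBegin = reinterpret_cast<std::uintptr_t>(B);
  const std::uintptr_t BEnd = reinterpret_cast<std::uintptr_t>(B + NB);
  return ABegin < BEnd && BBegin < AEnd;
}

// Computes Dst[i] = Src[i] * *Scale. Returns true if the version with the
// aggressive (no-alias) assumption was executed.
bool scaleVersioned(int *Dst, const int *Src, const int *Scale,
                    std::size_t N) {
  if (mayOverlap(Dst, N, Scale, 1)) {
    // Original loop: a store to Dst may change *Scale, so reload it each time.
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] = Src[I] * *Scale;
    return false;
  }
  // Versioned loop: the check proved Dst and Scale are disjoint, so the
  // invariant load is hoisted out of the loop.
  const int S = *Scale;
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I] * S;
  return true;
}
```

When `Scale` points into `Dst`, the check fails and the conservative loop runs, so the results stay identical to the unversioned code; otherwise the hoisted version runs.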
* Relax assertion in ReplaceableMetadataImpl::replaceAllUsesWith(). (Adrian Prantl, 2016-02-06, 1 file, -2/+0)
| | | | | | | | | | There is a legitimate use-case in clang where we need to replace a temporary placeholder node with the temporary node that may be a forward declaration. <rdar://problem/24493203> llvm-svn: 259973
* [Orc] Slightly improve the x86-64 resolver block machine code. (Lang Hames, 2016-02-06, 1 file, -8/+7)
| | | | | | Replace leaq + movq of a pointer with a single movabsq. llvm-svn: 259968
* [AArch64] Add the scheduling model for Exynos-M1 (Evandro Menezes, 2016-02-06, 2 files, -2/+361)
| | | | | | | | | | | | | | Summary: Add the core scheduling model for the Samsung Exynos-M1 (ARMv8-A). Reviewers: jmolloy, rengolin, christof, MinSeongKIM, t.p.northover Subscribers: aemerson, rengolin, MatzeB Differential Revision: http://reviews.llvm.org/D16644 llvm-svn: 259958
* [StatepointLower] Use None instead of Optional<int>() (Sanjoy Das, 2016-02-05, 1 file, -5/+5)
| | | | llvm-svn: 259956
* [Orc] Fix a typo in the comments for the x86_64 resolver block. (Lang Hames, 2016-02-05, 1 file, -2/+2)
| | | | llvm-svn: 259953
* Attempt#2 to work around MSVC rejects-valid. (Richard Smith, 2016-02-05, 1 file, -2/+2)
| | | | llvm-svn: 259948
* More workarounds for undefined behavior exposed when compiling in C++14 with -fsized-deallocation. (Richard Smith, 2016-02-05, 1 file, -0/+4)
| | | | | | Disable sized deallocation for all objects derived from TrailingObjects, as we expect the storage allocated for these objects to be larger than the size of their dynamic type. llvm-svn: 259942
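The problem and the workaround can be illustrated with a small class that allocates extra storage behind itself. This is a sketch, not the TrailingObjects implementation: the `Header` class and its members are hypothetical. Because the allocation is larger than `sizeof(Header)`, a sized deallocation call would pass the wrong size; declaring only an unsized class-level `operator delete` makes the delete expression use that function instead.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical object followed in memory by NumTrailing ints.
class Header {
  std::size_t NumTrailing;
  explicit Header(std::size_t N) : NumTrailing(N) {}

public:
  // Allocates the object plus its trailing storage in one block.
  static Header *create(std::size_t N) {
    void *Mem = ::operator new(sizeof(Header) + N * sizeof(int));
    return new (Mem) Header(N);
  }

  // The trailing ints start right after the object itself.
  int *trailing() { return reinterpret_cast<int *>(this + 1); }
  std::size_t size() const { return NumTrailing; }

  // Only an unsized deallocation function is declared, so `delete P` never
  // passes sizeof(Header), which would understate the real allocation.
  void operator delete(void *P) { ::operator delete(P); }
};
```

A caller writes `Header *H = Header::create(3);`, uses `H->trailing()[0..2]`, and releases everything with a plain `delete H;`.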