summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Add a copy constructor to StringMapHal Finkel2016-03-302-2/+62
| | | | | | | | | | There is code under review that requires StringMap to have a copy constructor, and this makes StringMap more consistent with our other containers (like DenseMap) that have copy constructors. Differential Revision: http://reviews.llvm.org/D18506 llvm-svn: 264906
* Split Writer::assignAddresses. NFC.Rui Ueyama2016-03-302-18/+29
| | | | llvm-svn: 264905
* [LoopVectorize] Don't vectorize loops when everything will be scalarizedHal Finkel2016-03-303-18/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change prevents the loop vectorizer from vectorizing when all of the vector types it generates will be scalarized. I've run into this problem on the PPC's QPX vector ISA, which only holds floating-point vector types. The loop vectorizer will, however, happily vectorize loops with purely integer computation. Here's an example: LV: The Smallest and Widest types: 32 / 32 bits. LV: The Widest register is: 256 bits. LV: Found an estimated cost of 0 for VF 1 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 1 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 1 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 1 for VF 1 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 1 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 1 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 1 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Scalar loop costs: 3. LV: Found an estimated cost of 0 for VF 2 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 2 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 2 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 2 for VF 2 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 2 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 2 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 2 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 2 costs: 2. LV: Found an estimated cost of 0 for VF 4 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 4 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 4 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 4 for VF 4 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 4 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 4 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 4 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 4 costs: 1. ... LV: Selecting VF: 8. LV: The target has 32 registers LV(REG): Calculating max register usage: LV(REG): At #0 Interval # 0 LV(REG): At #1 Interval # 1 LV(REG): At #2 Interval # 2 LV(REG): At #4 Interval # 1 LV(REG): At #5 Interval # 1 LV(REG): VF = 8 The problem is that the cost model here is not wrong, exactly. Since all of these operations are scalarized, their cost (aside from the uniform ones) are indeed VF*(scalar cost), just as the model suggests. In fact, the larger the VF picked, the lower the relative overhead from the loop itself (and the induction-variable update and check), and so in a sense, picking the largest VF here is the right thing to do. The problem is that vectorizing like this, where all of the vectors will be scalarized in the backend, isn't really vectorizing, but rather interleaving. By itself, this would be okay, but then the vectorizer itself also interleaves, and that's where the problem manifests itself. There's aren't actually enough scalar registers to support the normal interleave factor multiplied by a factor of VF (8 in this example). In other words, the problem with this is that our register-pressure heuristic does not account for scalarization. While we might want to improve our register-pressure heuristic, I don't think this is the right motivating case for that work. Here we have a more-basic problem: The job of the vectorizer is to vectorize things (interleaving aside), and if the IR it generates won't generate any actual vector code, then something is wrong. Thus, if every type looks like it will be scalarized (i.e. will be split into VF or more parts), then don't consider that VF. This is not a problem specific to PPC/QPX, however. The problem comes up under SSE on x86 too, and as such, this change fixes PR26837 too. I've added Sanjay's reduced test case from PR26837 to this commit. Differential Revision: http://reviews.llvm.org/D18537 llvm-svn: 264904
* [lld] [ELF/AArch64] Add aarch64 TLS IE to LE relax for local symbol testAdhemerval Zanella2016-03-301-7/+19
| | | | | | | This patch add a TLS relax optimization test when transforming Initial-Exec to Local-Exec for local symbols (which can not be preempted). llvm-svn: 264903
* [PGO] PGOFuncName in LTO optimizationsRong Xu2016-03-303-11/+69
| | | | | | | | | | | | | | | | | | PGOFuncNames are used as the key to retrieve the Function definition from the MD5 stored in the profile. For internal linkage function, we prefix the source file name to the PGOFuncNames. LTO's internalization privatizes many global linkage symbols. This happens after value profile annotation, but those internal linkage functions should not have a source prefix. To differentiate compiler generated internal symbols from original ones, PGOFuncName meta data are created and attached to the original internal symbols in the value profile annotation step. If a symbol does not have the meta data, its original linkage must be non-internal. Also add a new map that maps PGOFuncName's MD5 value to the function definition. Differential Revision: http://reviews.llvm.org/D17895 llvm-svn: 264902
* [cmake] Get the MSVC version by running cl rather than relying on MSVC_VERSIONReid Kleckner2016-03-301-7/+23
| | | | | | | MSVC_VERSION comes from the _MSC_VER macro, which won't correspond to the STL version if the host compiler is clang-cl. llvm-svn: 264901
* [cmake] Instead of testing char16_t for MSVC compat, directly ask cl.exe its ↵Reid Kleckner2016-03-301-8/+10
| | | | | | | | version Credit to Aaron Ballman for thinking of this. llvm-svn: 264886
* Revert 264782 and 264789Tobias Grosser2016-03-3015-1169/+138
| | | | | | | | | | | | | | | | | | | These caused LNT failures due to new assertions when running with -polly-position=before-vectorizer -polly-process-unprofitable for: FAIL: clamscan.compile_time FAIL: cjpeg.compile_time FAIL: consumer-jpeg.compile_time FAIL: shapes.compile_time FAIL: clamscan.execution_time FAIL: cjpeg.execution_time FAIL: consumer-jpeg.execution_time FAIL: shapes.execution_time The failures have been introduced by r264782, but r264789 had to be reverted as it depended on the earlier patch. llvm-svn: 264885
* Restore "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly"Teresa Johnson2016-03-307-0/+58
| | | | | | | This restores commit 264869, with a fix for windows bots to properly escape '\' in the path when serializing out. Added test. llvm-svn: 264884
* Fix header name.Jim Ingham2016-03-301-1/+1
| | | | llvm-svn: 264883
* [AArch64] Fix warnings pointed out by Hal.Chad Rosier2016-03-301-1/+5
| | | | llvm-svn: 264882
* [cmake] Add -fms-compatibility-version=19 when clang-cl gives errors about ↵Reid Kleckner2016-03-301-0/+11
| | | | | | | | | | | char16_t What we are really trying to do here is to figure out if we are using the 2015 STL. Unfortunately, so far as I know the MSVC STL does not define a version macro that we can check directly. Instead I wrote a check to see if char16_t works. llvm-svn: 264881
* [cmake] Allow EH usage with clang-clReid Kleckner2016-03-301-3/+3
| | | | llvm-svn: 264880
* [PGO] Use ArrayRef in annotateValueSite()Rong Xu2016-03-303-9/+10
| | | | | | | | | Using ArrayRef in annotateValueSite's parameter instead of using an array and it's size. Differential Revision: http://reviews.llvm.org/D18568 llvm-svn: 264879
* Include line number in error message for linker scripts.Rui Ueyama2016-03-302-2/+61
| | | | | | | This patch is based on http://reviews.llvm.org/D18545 written by George Rimar. llvm-svn: 264878
* AMDGPU/SI: Improve MachineSchedModel definitionTom Stellard2016-03-3010-181/+195
| | | | | | | | | | | | | | | | | | | | | | | | This patch contains a few improvements to the model, including: - Using a single resource with a defined buffers size for each memory unit. - Setting the IssueWidth correctly. - Fixing latency values for memory instructions. shader-db stats: 16429 shaders in 3231 tests Totals: SGPRS: 318232 -> 312328 (-1.86 %) VGPRS: 208996 -> 209346 (0.17 %) Code Size: 7147044 -> 7166440 (0.27 %) bytes LDS: 83 -> 83 (0.00 %) blocks Scratch: 1862656 -> 1459200 (-21.66 %) bytes per wave Max Waves: 49182 -> 49243 (0.12 %) Wait states: 0 -> 0 (0.00 %)A Differential Revision: http://reviews.llvm.org/D18453 llvm-svn: 264877
* AMDGPU/SI: Enable lanemask tracking in mischedTom Stellard2016-03-3031-139/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This results in higher register usage, but should make it easier for the compiler to hide latency. This pass is a prerequisite for some more scheduler improvements, and I think the increase register usage with this patch is acceptable, because when combined with the scheduler improvements, the total register usage will decrease. shader-db stats: 2382 shaders in 478 tests Totals: SGPRS: 48672 -> 49088 (0.85 %) VGPRS: 34148 -> 34847 (2.05 %) Code Size: 1285816 -> 1289128 (0.26 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 492544 -> 573440 (16.42 %) bytes per wave Max Waves: 6856 -> 6846 (-0.15 %) Wait states: 0 -> 0 (0.00 %) Depends on D18451 Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18452 llvm-svn: 264876
* [SystemZ] Add nop and nopr InstAliases.Jonas Paulsson2016-03-302-0/+11
| | | | | | | | | | For compatability with GAS, nop and nopr are recognized as alises for bc and bcr, respectively. A mask of 0 turns these instructions effectively into no-operations. Reviewed by Ulrich Weigand. llvm-svn: 264875
* [c-index-test] Delete dead function, NFCVedant Kumar2016-03-301-19/+0
| | | | llvm-svn: 264874
* [SystemZ] Specify required features for builtins.Jonas Paulsson2016-03-303-221/+239
| | | | | | | | | | | | | BuiltinsSystemZ.def is extended to include the required processor features per intrinsic. New test test/CodeGen/builtins-systemz-error2.c that checks for expected errors when instrinsics are used with a subtarget that does not support the required feature (e.g. vector support). Reviewed by Ulrich Weigand. llvm-svn: 264873
* Remove HasFnAttribute guards to getFnAttribute callsNirav Dave2016-03-305-11/+5
| | | | | | | | | | | | These checks are redundant and can be removed Reviewers: hans Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D18564 llvm-svn: 264872
* Revert "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly"Teresa Johnson2016-03-306-31/+0
| | | | | | | | | This reverts commit r264869. I am seeing Windows bot failures due to the "\" in the path being mishandled at some point (seems to be interpreted wrongly at some point and llvm-as | llvm-dis is yielding some junk characters). Need to investigate. llvm-svn: 264871
* [X86][XOP] BITREVERSE lowering using VPPERMSimon Pilgrim2016-03-302-1/+256
| | | | | | XOP's VPPERM has some great 'permute operations' that it can do as well as part of shuffling the bytes of a 128-bit vector - in this case we use it to perform BITREVERSE in a single instruction. llvm-svn: 264870
* [ThinLTO] Serialize the Module SourceFileName to/from LLVM assemblyTeresa Johnson2016-03-306-0/+31
| | | | | | | | | | | | | | | | | | | Summary: This change serializes out and in the SourceFileName to LLVM assembly so that it is preserved through "llvm-dis | llvm-as". This is necessary to ensure that the global identifiers created for local values in the module summary index are the same even if the bitcode is streamed out and read back from LLVM assembly. Serializing the summary itself to LLVM assembly is in progress. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18588 llvm-svn: 264869
* Prepare tests for change to emit Module SourceFileName to LLVM assemblyTeresa Johnson2016-03-302-2/+2
| | | | | | | | | Modify these tests to ignore the source file name when looking for the expected string. It was already catching the source file name once via the ModuleID, and will catch it another time with an impending change to LLVM to serialize out the module's SourceFileName. llvm-svn: 264868
* [X86][SSE] Test the legalization of vector comparison resultsSimon Pilgrim2016-03-301-0/+1971
| | | | | | We are currently doing a REALLY bad job of packing results of vector comparisons into the legalized <X x i1> result equivalents - a mixture of PACKSS/PMOVMSKB would be much better here. llvm-svn: 264867
* No relocation needs bot SA and ZA.Rafael Espindola2016-03-303-22/+24
| | | | | | Pass only one of them to relocateOne. llvm-svn: 264866
* Implement getImplicitAddend for mips.Rafael Espindola2016-03-301-31/+66
| | | | llvm-svn: 264865
* Simplify mips addend processing.Rafael Espindola2016-03-301-8/+6
| | | | | | | It is now added to the addend in the same way as a regular Elf_Rel addend. llvm-svn: 264864
* Fix handling of addends on i386.Rafael Espindola2016-03-304-6/+76
| | | | | | | Because of merge sections it is not sufficient to just add them while applying a relocation. llvm-svn: 264863
* [clang-tidy] Fix MSVC build.Alexander Kornienko2016-03-301-1/+2
| | | | llvm-svn: 264862
* [NVPTX] Avoid temporary std::string and make single-use function local to ↵Benjamin Kramer2016-03-302-5/+4
| | | | | | | | the cpp file. No functionality change intended. llvm-svn: 264861
* gold-plugin: Fixed typo in an error message.Marianne Mailhot-Sarrasin2016-03-301-1/+1
| | | | llvm-svn: 264860
* [clang-tidy] Adjust dangling references check to ASTMatcher changes.Gabor Horvath2016-03-301-9/+10
| | | | llvm-svn: 264859
* [docs] Added 3.8 clang-tidy release notes, fixed formatting.Alexander Kornienko2016-03-301-10/+102
| | | | llvm-svn: 264858
* [X86][SSE] Added tests for clearing upper bits of vector elementsSimon Pilgrim2016-03-301-0/+688
| | | | | | Patterns based on PR6455 llvm-svn: 264857
* [clang-tidy] readability check for const params in declarationsAlexander Kornienko2016-03-307-1/+205
| | | | | | | | | | | | | | Summary: Adds a clang-tidy warning for top-level consts in function declarations. Reviewers: hokein, sbenza, alexfh Subscribers: cfe-commits Patch by Matt Kulukundis! Differential Revision: http://reviews.llvm.org/D18408 llvm-svn: 264856
* [ASTMatchers] Existing matcher hasAnyArgument fixedGabor Horvath2016-03-304-17/+13
| | | | | | | | | | | | Summary: A checker (will be uploaded after this patch) needs to check implicit casts. The checker needs matcher hasAnyArgument but it ignores implicit casts and parenthesized expressions which disables checking of implicit casts for arguments in the checker. However the documentation of the matcher contains a FIXME that this should be removed once separate matchers for ignoring implicit casts and parenthesized expressions are ready. Since these matchers were already there the fix could be executed. Only one Clang checker was affected which was also fixed (ignoreParenImpCasts added) and is separately uploaded. Third party checkers (not in the Clang repository) may be affected by this fix so the fix must be emphasized in the release notes. Reviewers: klimek, sbenza, alexfh Subscribers: alexfh, klimek, xazax.hun, cfe-commits Differential Revision: http://reviews.llvm.org/D18243 llvm-svn: 264855
* Fix the ThreadSanitizer support to avoid creating empty SBThreads and to not ↵Kuba Brecka2016-03-302-2/+6
| | | | | | crash when thread_id is unavailable. Plus a whitespace fix. llvm-svn: 264854
* [OPENMP 4.0] Initial support for '#pragma omp declare simd' directive.Alexey Bataev2016-03-3016-9/+379
| | | | | | | | | | | | | | | | Initial parsing/sema/serialization/deserialization support for '#pragma omp declare simd' directive. The 'declare simd' construct can be applied to a function to enable the creation of one or more versions that can process multiple arguments using SIMD instructions from a single invocation from a SIMD loop. If the function has any declarations, then the declare simd construct for any declaration that has one must be equivalent to the one specified for the definition. Otherwise, the result is unspecified. This pragma can be applied many times to the same declaration. Internally this pragma is represented as an attribute. But we need special processing for this pragma because it must be used before function declaration, this directive is applied to. Differential Revision: http://reviews.llvm.org/D10599 llvm-svn: 264853
* [VectorUtils] Don't try and truncate PHIs to a smaller bitwidthJames Molloy2016-03-302-0/+73
| | | | | | | | | | We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR17018. llvm-svn: 264852
* [analyzer] Fix an assertion fail in hash generation.Gabor Horvath2016-03-301-2/+5
| | | | | | | | | | | | | In case the (uniqueing) location of the diagnostic is in a line that only contains whitespaces there was an assertion fail during issue hash generation. Unfortunately I am unable to reproduce this error with the built in checkers, so no there is no failing test case with this patch. It would be possible to write a debug checker for that purpuse but it does not worth the effort. Differential Revision: http://reviews.llvm.org/D18210 llvm-svn: 264851
* Fix SocketAddressTest (again)Pavel Labath2016-03-301-1/+2
| | | | | | | On some versions of Windows, the address is returned as "::1", while on others it's "0:0:...:0:1". Accept both versions, as they represent the same address. llvm-svn: 264850
* Fix warning in ThreadSanitizerRuntimePavel Labath2016-03-301-1/+1
| | | | llvm-svn: 264849
* Fix warning in ClangExpressionParserPavel Labath2016-03-301-1/+1
| | | | llvm-svn: 264847
* Fix flakyness in TestWatchpointMultipleThreadsPavel Labath2016-03-302-13/+3
| | | | | | | | | | | | | | | | | | | | | | Summary: the inferior in the test deliberately does not lock a mutex when accessing the watched variable. The reason for that is unclear as, based on the logs, the original intention of the test was to check whether watchpoints get propagated to newly created threads, which should work fine even with a mutex. Furthermore, in the unlikely event (which I have still observed happening from time to time) that two threads do manage the execute the "critical section" simultaneously, the test will fail, as it is expecting the watchpoint "hit count" to be 1, but in this case it will be 2. Given this, I have simply chose to lock the mutex always, so that we have more predictible behavior. Watchpoints being hit simultaneously is still (and correctly!) tested by TestConcurrentEvents. Reviewers: clayborg, jingham Subscribers: lldb-commits Differential Revision: http://reviews.llvm.org/D18558 llvm-svn: 264846
* [x86] Fix a horrible bug in our lowering of x86 floating point atomicChandler Carruth2016-03-302-24/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | operations. Specifically, we had code that tried to badly approximate reconstructing all of the possible variations on addressing modes in two x86 instructions based on those in one pseudo instruction. This is not the first bug uncovered with doing this, so stop doing it altogether. Instead generically and pedantically copy every operand from the address over to both new instructions, and strip kill flags from any register operands. This fixes a subtle bug seen in the wild where we would mysteriously drop parts of the addressing mode, causing for example the index argument in the added test case to just be completely ignored. Hypothetically, this was an extremely bad miscompile because it actually caused a predictable and leveragable write of a 64bit quantity to an unintended offset (the first element of the array intead of whatever other element was intended). As a consequence, in theory this could even have introduced security vulnerabilities. However, this was only something that could happen with an atomic floating point add. No other operation could trigger this bug, so it seems extremely unlikely to have occured widely in the wild. But it did in fact occur, and frequently in scientific applications which were using relaxed atomic updates of a floating point value after adding a delta. Those would end up being quite badly miscompiled by LLVM, which is how we found this. Of course, this often looks like a race condition in the code, but it was actually a miscompile. I suspect that this whole RELEASE_FADD thing was a complete mistake. There is no such operation, and I worry that anything other than add will get remarkably worse codegeneration. But that's not for this change.... llvm-svn: 264845
* Fix shared build after r264790Ismail Donmez2016-03-301-0/+1
| | | | llvm-svn: 264844
* [ELF] - Do not keep undefined locals in .symtabGeorge Rimar2016-03-302-0/+17
| | | | | | | | | | | gold and bfd do not include the undefined locals in symtab. We have no reasons to support that either. That fixes PR27016 Differential revision: http://reviews.llvm.org/D18554 llvm-svn: 264843
* For MS ABI, emit dllexport friend functions defined inline in classStephan Bergmann2016-03-309-18/+53
| | | | | | | | | | | | | | ...as that is apparently what MSVC does. This is an updated version of r263738, which had to be reverted in r263740 due to test failures. The original version had erroneously emitted functions that are defined in class templates, too (see the updated "Handle friend functions" code in EmitDeferredDecls, lib/CodeGen/ModuleBuilder.cpp). (The updated tests needed to be split out into their own dllexport-ms-friend.cpp because of the CHECK-NOTs which would have interfered with subsequent CHECK-DAGs in dllexport.cpp.) Differential Revision: http://reviews.llvm.org/D18430 llvm-svn: 264841
OpenPOWER on IntegriCloud