Commit message | Author | Age | Files | Lines
* [AMDGPU] Combine add and adde, sub and sube | Stanislav Mekhanoshin | 2017-06-21 | 3 | -9/+161
  If one of the arguments of adde/sube is zero, we can fold another add/sub into it.
  Differential Revision: https://reviews.llvm.org/D34374
  llvm-svn: 305964
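
The fold described above can be sanity-checked with a small model of 32-bit carry arithmetic. This is only a sketch under stated assumptions: `add`, `adde`, `sub`, and `sube` are hypothetical Python stand-ins for the DAG nodes, only the result value is modeled (not the carry-out), and the pattern shown is one reading of the commit message rather than the exact combine in the patch.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wrapping arithmetic


def add(x, y):
    """32-bit wrapping add."""
    return (x + y) & MASK32


def adde(x, y, carry_in):
    """32-bit add with carry-in (result value only)."""
    return (x + y + carry_in) & MASK32


def sub(x, y):
    """32-bit wrapping subtract."""
    return (x - y) & MASK32


def sube(x, y, borrow_in):
    """32-bit subtract with borrow-in (result value only)."""
    return (x - y - borrow_in) & MASK32


def fold_holds(x, y, c):
    """Check: add (adde x, 0, c), y == adde x, y, c, and likewise for sub/sube."""
    return (add(adde(x, 0, c), y) == adde(x, y, c)
            and sub(sube(x, 0, c), y) == sube(x, y, c))
```

Because the zero operand contributes nothing, the outer add/sub operand can take its place, which removes one instruction.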
* Mark dump() methods as const. NFC | Sam Clegg | 2017-06-21 | 15 | -27/+27
  Add const qualifier to any dump() method where adding one was trivial.
  Differential Revision: https://reviews.llvm.org/D34481
  llvm-svn: 305963
* [AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc | Stanislav Mekhanoshin | 2017-06-21 | 5 | -0/+102
  This simplification avoids generating v_cndmask_b32 to serialize the condition code between the compare and its use.
  Differential Revision: https://reviews.llvm.org/D34300
  llvm-svn: 305962
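
The rewrite in the subject line can be illustrated with the arithmetic identity it relies on. The sketch below assumes that the zero-extended compare maps to `addc` and the sign-extended compare maps to `subb`; that pairing is inferred from the subject line, and all function names are hypothetical Python stand-ins rather than LLVM APIs.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wrapping arithmetic


def addc(x, y, carry):
    """32-bit add with a carry-in bit (result value only)."""
    return (x + y + carry) & MASK32


def subb(x, y, borrow):
    """32-bit subtract with a borrow-in bit (result value only)."""
    return (x - y - borrow) & MASK32


def zext_i1(bit):
    """Zero-extend a 1-bit condition to 32 bits: 0 or 1."""
    return 1 if bit else 0


def sext_i1(bit):
    """Sign-extend a 1-bit condition to 32 bits: 0 or 0xFFFFFFFF."""
    return MASK32 if bit else 0


def rewrite_matches(x, a, b):
    """Compare the original add with the carry/borrow form for cc = (a < b)."""
    cc = 1 if a < b else 0  # stands in for the setcc result
    ok_zext = ((x + zext_i1(cc)) & MASK32) == addc(x, 0, cc)
    ok_sext = ((x + sext_i1(cc)) & MASK32) == subb(x, 0, cc)
    return ok_zext and ok_sext
```

Feeding the condition in as a carry or borrow means the boolean never has to be materialized as a 0/1 or 0/-1 register value first.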
* TableGen.cmake: Use DEPFILE for Ninja Generator with CMake>=3.7. | NAKAMURA Takumi | 2017-06-21 | 1 | -3/+26
  CMake emits build targets as paths relative to build.ninja, but Ninja does not treat an absolute path (in *.d) as the same file as a relative path (in build.ninja).
  So make the file names on the command line relative to ${CMAKE_BINARY_DIR}, where build.ninja is.
  Note that tblgen is executed with ${CMAKE_BINARY_DIR} as its working directory.
  Differential Revision: https://reviews.llvm.org/D33707
  llvm-svn: 305961
* Enable vectorizer-maximize-bandwidth by default. | Dehao Chen | 2017-06-21 | 12 | -68/+77
  Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:

  spec/2006/fp/C++/444.namd        26.84  -0.31%
  spec/2006/fp/C++/447.dealII      46.19  +0.89%
  spec/2006/fp/C++/450.soplex      42.92  -0.44%
  spec/2006/fp/C++/453.povray      38.57  -2.25%
  spec/2006/fp/C/433.milc          24.54  -0.76%
  spec/2006/fp/C/470.lbm           41.08  +0.26%
  spec/2006/fp/C/482.sphinx3       47.58  -0.99%
  spec/2006/int/C++/471.omnetpp    22.06  +1.87%
  spec/2006/int/C++/473.astar      22.65  -0.12%
  spec/2006/int/C++/483.xalancbmk  33.69  +4.97%
  spec/2006/int/C/400.perlbench    33.43  +1.70%
  spec/2006/int/C/401.bzip2        23.02  -0.19%
  spec/2006/int/C/403.gcc          32.57  -0.43%
  spec/2006/int/C/429.mcf          40.35  +0.27%
  spec/2006/int/C/445.gobmk        26.96  +0.06%
  spec/2006/int/C/456.hmmer        24.4   +0.19%
  spec/2006/int/C/458.sjeng        27.91  -0.08%
  spec/2006/int/C/462.libquantum   57.47  -0.20%
  spec/2006/int/C/464.h264ref      46.52  +1.35%
  geometric mean                          +0.29%

  The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.
  I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent.
  Reviewers: hfinkel, mkuper, davidxl, chandlerc
  Reviewed By: chandlerc
  Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D33341
  llvm-svn: 305960
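
The summary figure in the benchmark list can be reproduced from the per-benchmark deltas. The sketch below assumes the reported geometric mean is taken over the ratios (1 + delta); the commit message does not say how it was computed, so treat that as an assumption. The list `deltas` is copied from the commit message, and `geomean_delta` is a hypothetical helper name.

```python
import math

# Per-benchmark deltas in percent, copied from the commit message.
deltas = [-0.31, 0.89, -0.44, -2.25, -0.76, 0.26, -0.99, 1.87, -0.12, 4.97,
          1.70, -0.19, -0.43, 0.27, 0.06, 0.19, -0.08, -0.20, 1.35]


def geomean_delta(percent_deltas):
    """Geometric mean of the ratios (1 + d/100), returned as a percent delta."""
    log_sum = sum(math.log1p(d / 100.0) for d in percent_deltas)
    return (math.exp(log_sum / len(percent_deltas)) - 1.0) * 100.0


print(f"{geomean_delta(deltas):+.2f}%")
```

Under that assumption the computed value agrees with the reported +0.29% to two decimal places.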
* SwiftCC: Perform physical layout when computing coercion types | Arnold Schwaighofer | 2017-06-21 | 2 | -1/+25
  We need to take type alignment padding into account when computing physical layouts.
  The layout must be compatible with the input layout; offsets are defined in terms of offsets within a packed struct, which are computed in terms of the alloc size of a type.
  Using the store size we would insert padding for the following type, for example:
    struct { int3 v; long long l; } __attribute((packed))
  On x86-64 int3 is padded to int4 alignment. The swiftcc type would be <{ <3 x float>, [4 x i8], i64 }>, which is not compatible with <{ <3 x float>, i64 }>. The latter has i64 at offset 16 and the former at offset 20.
  rdar://32618125
  llvm-svn: 305956
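
The offsets quoted in the message follow from summing alloc sizes in a packed struct. This is a minimal sketch, not the SwiftCC lowering code: `packed_offsets` is a hypothetical helper, and the alloc sizes (16 bytes for `<3 x float>`, 4 for `[4 x i8]`, 8 for `i64`) are assumed x86-64 values consistent with the message.

```python
def packed_offsets(alloc_sizes):
    """Field offsets in a packed struct: the running sum of alloc sizes."""
    offsets, pos = [], 0
    for size in alloc_sizes:
        offsets.append(pos)
        pos += size
    return offsets


# Assumed alloc sizes in bytes: <3 x float> = 16, [4 x i8] = 4, i64 = 8.
with_padding = packed_offsets([16, 4, 8])     # <{ <3 x float>, [4 x i8], i64 }>
without_padding = packed_offsets([16, 8])     # <{ <3 x float>, i64 }>
print(with_padding[-1], without_padding[-1])  # 20 16
```

The explicit `[4 x i8]` element is redundant because the vector's alloc size already covers the padding, so adding it pushes `i64` four bytes too far.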
* Attempt to avoid static init ordering issues with globalMemCounter | Eric Fiselier | 2017-06-21 | 1 | -5/+10
  llvm-svn: 305955
* ELF: Don't dereference Repl in MarkLive. NFCI. | Peter Collingbourne | 2017-06-21 | 1 | -1/+1
  This is unnecessary because --gc-sections runs before ICF.
  Differential Revision: https://reviews.llvm.org/D34465
  llvm-svn: 305954
* [Hexagon] Use MachineInstrBuilder instead of changing instruction in place | Krzysztof Parzyszek | 2017-06-21 | 1 | -45/+9
  llvm-svn: 305953
* Rename WinCOFFStreamer.cpp -> MCWinCOFFStreamer.cpp | Sam Clegg | 2017-06-21 | 2 | -2/+2
  For consistency with other MC*Streamer.cpp files and the header file.
  Differential Revision: https://reviews.llvm.org/D34466
  llvm-svn: 305952
* Add Aarch64 ldst-opt test. | Nirav Dave | 2017-06-21 | 1 | -0/+60
  llvm-svn: 305951
* [Target/Mips] Add test associated with r305949. | Davide Italiano | 2017-06-21 | 1 | -0/+13
  llvm-svn: 305950
* [Target] Implement the ".rdata" MIPS assembly directive. | Davide Italiano | 2017-06-21 | 1 | -0/+22
  Patch by John Baldwin < jhb at freebsd dot org >!
  Differential Revision: https://reviews.llvm.org/D34452
  llvm-svn: 305949
* [Solaris] emit .init_array instead of .ctors on Solaris (Sparc/x86) | Davide Italiano | 2017-06-21 | 7 | -0/+52
  Patch by Fedor Sergeev.
  Differential Revision: https://reviews.llvm.org/D33868
  llvm-svn: 305948
* [test] Make absolute line numbers relative; NFC | George Burgess IV | 2017-06-21 | 1 | -10/+10
  Done to remove noise from https://reviews.llvm.org/D32332 (and to make this test more resilient to changes in general).
  llvm-svn: 305947
* [Reassociate] Use early returns in a couple places to reduce indentation and improve readability. NFC | Craig Topper | 2017-06-21 | 1 | -26/+26
  llvm-svn: 305946
* [Reassociate] Const correct a helper function. NFC | Craig Topper | 2017-06-21 | 1 | -2/+2
  llvm-svn: 305945
* [DWARF] Support for DW_FORM_strx3 and complete support for DW_FORM_strx{1,2,4} | Wolfgang Pieb | 2017-06-21 | 6 | -13/+148
  (consumer).
  Reviewer: aprantl
  Differential Revision: https://reviews.llvm.org/D34418
  llvm-svn: 305944
* [Hexagon] Handle more types of immediate operands in expand-condsets | Krzysztof Parzyszek | 2017-06-21 | 2 | -2/+35
  llvm-svn: 305943
* [sanitizer-coverage] Stop marking this test as unsupported on Darwin | Justin Bogner | 2017-06-21 | 1 | -1/+1
  The bug that was causing this to fail was fixed in r305429.
  llvm-svn: 305942
* [InstCombine] Cleanup using commutable matchers. Make a couple helper methods standalone static functions. Put 'if' around variable declaration instead of after. NFC | Craig Topper | 2017-06-21 | 2 | -25/+19
  llvm-svn: 305941
* [preprocessor] Fix assertion hit when 'SingleFileParseMode' option is enabled and #if with an undefined identifier and without #else | Argyrios Kyrtzidis | 2017-06-21 | 2 | -6/+16
  'HandleEndifDirective' asserts that 'WasSkipping' is false, so switch to using 'FoundNonSkip' as the hint for 'SingleFileParseMode' to keep going with parsing.
  llvm-svn: 305940
* Add a "probe-stack" attribute | whitequark | 2017-06-21 | 4 | -0/+41
  This attribute is used to ensure the guard page is triggered on stack overflow. Stack frames larger than the guard page size will generate a call to __probestack to touch each page so the guard page won't be skipped.
  Reviewed By: majnemer
  Differential Revision: https://reviews.llvm.org/D34386
  llvm-svn: 305939
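
The page-touching idea can be sketched as a list of offsets that a probe routine would need to visit. This is an illustration only and does not reproduce what `__probestack` actually does: `probe_offsets` is a hypothetical helper, and the 4096-byte page size is an assumption.

```python
def probe_offsets(frame_size, page_size=4096):
    """Offsets below the incoming stack pointer to touch, one per page.

    Frames no larger than one page need no explicit probes in this model,
    because their first access cannot jump past the guard page.
    """
    if frame_size <= page_size:
        return []
    return list(range(page_size, frame_size + 1, page_size))


print(probe_offsets(10000))  # [4096, 8192]
```

Touching every page in order guarantees that an overflowing frame hits the guard page instead of landing in whatever memory lies beyond it.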
* [BasicAA] Use MayAlias instead of PartialAlias for fallback. | Michael Kruse | 2017-06-21 | 13 | -87/+107
  Using various methods, BasicAA tries to determine whether two GetElementPtr memory locations alias when their base pointers are known to be equal. When none of its heuristics are applicable, it falls back to PartialAlias to, according to a comment, protect TBAA from making a wrong decision in case of unions and malloc. PartialAlias is not correct, because a PartialAlias result implies that some, but not all, bytes overlap, which is not necessarily the case here.
  AAResults returns the first analysis result that is not MayAlias. BasicAA is always the first alias analysis. When it returns PartialAlias, no other analysis is queried to give a more exact result (which was the intention of returning PartialAlias instead of MayAlias). For instance, ScopedAA could return a more accurate result.
  The PartialAlias hack was introduced in r131781 (and re-applied in r132632 after some reverts) to fix llvm.org/PR9971 where TBAA returns a wrong NoAlias result due to a union. A test case for the malloc case mentioned in the comment was not provided and I don't think it is affected since it returns an omnipotent char anyway.
  Since r303851 (https://reviews.llvm.org/D33328) clang does not emit specific TBAA for unions anymore (but "omnipotent char" instead). Hence, the PartialAlias workaround is not required anymore.
  This patch passes the test-suite and check-llvm/check-clang of a self-hosted build on x64.
  Reviewed By: hfinkel
  Differential Revision: https://reviews.llvm.org/D34318
  llvm-svn: 305938
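
The aggregation rule described in the message (the first result that is not MayAlias wins) explains why the old fallback hid better answers. The following is a toy model, not LLVM's AAResults: the analysis functions and their fixed return values are hypothetical and chosen only to show the ordering effect.

```python
from enum import Enum


class AliasResult(Enum):
    NoAlias = 0
    MayAlias = 1
    PartialAlias = 2
    MustAlias = 3


def query(analyses, a, b):
    """Return the first result that is not MayAlias, else MayAlias."""
    for analysis in analyses:
        result = analysis(a, b)
        if result is not AliasResult.MayAlias:
            return result
    return AliasResult.MayAlias


def basic_aa_old(a, b):
    return AliasResult.PartialAlias  # old fallback: stops the chain


def basic_aa_new(a, b):
    return AliasResult.MayAlias      # new fallback: lets later analyses answer


def scoped_aa(a, b):
    return AliasResult.NoAlias       # pretend a later analysis knows more


print(query([basic_aa_old, scoped_aa], "p", "q").name)  # PartialAlias
print(query([basic_aa_new, scoped_aa], "p", "q").name)  # NoAlias
```

With the MayAlias fallback, the query falls through to the later analysis, which is the behavior the patch is after.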
* Object: Have the irsymtab builder take a string table builder. NFCI. | Peter Collingbourne | 2017-06-21 | 2 | -19/+23
  This will be needed in order to share the irsymtab string table with the bitcode string table.
  Differential Revision: https://reviews.llvm.org/D33971
  llvm-svn: 305937
* [CGP, memcmp] replace CreateZextOrTrunc with CreateZext because it can never trunc | Sanjay Patel | 2017-06-21 | 1 | -5/+7
  llvm-svn: 305936
* [CGP] fix variables to be unsigned in memcmp expansion | Sanjay Patel | 2017-06-21 | 1 | -12/+14
  llvm-svn: 305935
* Do not inline recursive direct calls in sample loader pass. | Dehao Chen | 2017-06-21 | 3 | -0/+22
  Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls.
  Reviewers: iteratee, davidxl
  Reviewed By: iteratee
  Subscribers: sanjoy, llvm-commits, eraman
  Differential Revision: https://reviews.llvm.org/D34456
  llvm-svn: 305934
* [PDB] Add symbols to the PDB | Reid Kleckner | 2017-06-21 | 6 | -10/+659
  Summary: The main complexity in adding symbol records is that we need to "relocate" all the type indices. Type indices do not have anything like relocations, an opaque data structure describing where to find existing type indices for fixups. The linker just has to "know" where the type references are in the symbol records. I added an overload of `discoverTypeIndices` that works on symbol records, and it seems to be able to link the standard library.
  Reviewers: zturner, ruiu
  Subscribers: llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D34432
  llvm-svn: 305933
* [PowerPC] define target hook isReallyTriviallyReMaterializable() | Lei Huang | 2017-06-21 | 4 | -2/+208
  Define target hook isReallyTriviallyReMaterializable() to explicitly specify PowerPC instructions that are trivially rematerializable. This will allow the MachineLICM pass to accurately identify PPC instructions that should always be hoisted.
  Differential Revision: https://reviews.llvm.org/D34255
  llvm-svn: 305932
* [x86] set the datalayout to match the RUN line triple; NFC | Sanjay Patel | 2017-06-21 | 1 | -4/+2
  I don't think there's any visible difference from having the wrong layout for the 32-bit case at this point, but that could change in the future.
  llvm-svn: 305931
* Use -NOT prefix instead of adding `not` to FileCheck. | Rui Ueyama | 2017-06-21 | 1 | -2/+2
  If we want to make sure that a particular string is not in an output, the regular way of doing it is to add `-NOT` prefix instead of checking if FileCheck resulted in an error.
  Differential Revision: https://reviews.llvm.org/D34435
  llvm-svn: 305930
* [COFF] Set MajorLinkerVersion to 14 instead of 0. | Rui Ueyama | 2017-06-21 | 2 | -1/+10
  This works around a strange interaction with Authenticode signatures, in which a signed PE executable with {Major,Minor}LinkerVersion = 0.0 fails to validate on Windows 7 (but is OK on Windows 10). Setting the linker version to 14.0 (which is what VS2015 outputs) makes it work again.
  Patch by Simon Tatham <simon.tatham@arm.com>.
  llvm-svn: 305929
* Correct VectorCall x86 (32 bit) behavior for SSE Register Assignment | Erich Keane | 2017-06-21 | 2 | -76/+71
  In running some internal vectorcall tests in 32 bit mode, we discovered that the behavior I'd previously implemented for x64 (and applied to x32) regarding the assignment of SSE registers was incorrect. See spec here: https://msdn.microsoft.com/en-us/library/dn375768.aspx
  My previous implementation applied register argument position from the x64 version to both. This isn't correct for x86, so this removes and refactors that section. Additionally, it corrects the integer/int-pointer assignments. Unlike x64, x86 permits integers to be assigned independent of position.
  Finally, the code for 32 bit was cleaned up a little to clarify the intent, as well as given a descriptive comment.
  Differential Revision: https://reviews.llvm.org/D34455
  llvm-svn: 305928
* [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits | Craig Topper | 2017-06-21 | 3 | -18/+108
  Summary: I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know.
  This patch adds range metadata to these intrinsics based on the known bits. Currently the patch punts if we already have range metadata present.
  Reviewers: spatel, RKSimon, davide, majnemer
  Reviewed By: RKSimon
  Subscribers: sanjoy, hfinkel, llvm-commits
  Differential Revision: https://reviews.llvm.org/D32582
  llvm-svn: 305927
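
How known bits bound a count can be shown with two small helpers. This is a sketch derived from the definitions of ctpop and cttz, not the code in the patch, and the patch may pick different bounds. The helper names, the mask-pair representation of known bits, and the 32-bit default width are all assumptions.

```python
def popcount(v):
    """Number of set bits in a non-negative integer."""
    return bin(v).count("1")


def ctpop_range(known_zero, known_one, width=32):
    """Half-open [lo, hi) range of ctpop given known-zero/known-one masks."""
    lo = popcount(known_one)           # these bits are definitely set
    hi = width - popcount(known_zero)  # these bits are definitely clear
    return lo, hi + 1


def cttz_range(known_zero, known_one, width=32):
    """Half-open [lo, hi) range of cttz, taking cttz(0) == width."""
    lo = 0  # count of trailing bits known to be zero
    while lo < width and (known_zero >> lo) & 1:
        lo += 1
    hi = 0  # index of the lowest bit known to be one, or width if none
    while hi < width and not (known_one >> hi) & 1:
        hi += 1
    return lo, hi + 1


# Upper 24 bits known zero: at most 8 bits can be set.
print(ctpop_range(0xFFFFFF00, 0))  # (0, 9)
# Bits 0-1 known zero, bit 3 known one: cttz is 2 or 3.
print(cttz_range(0b0011, 0b1000))  # (2, 4)
```

A range such as [0, 9) is not expressible as known bits of the result, which is the gap the commit message describes.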
* [InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) create more instructions than it removes | Craig Topper | 2017-06-21 | 2 | -40/+39
  Summary: Previously this folding had no checks to see if it was going to result in fewer instructions. This was pointed out during the review of D34184.
  This patch adds code to count how many instructions it's going to create vs how many it's going to remove so we can make a proper decision.
  Reviewers: spatel, majnemer
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34437
  llvm-svn: 305926
* [Reassociate] Support xor reassociating for splat vectors | Craig Topper | 2017-06-21 | 2 | -24/+123
  Summary: This patch adds support for xors of splat vectors.
  Reviewers: mcrosier
  Reviewed By: mcrosier
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34354
  llvm-svn: 305925
* Change -1LL to -1ULL to silence a gcc warning about left shifting a negative value. | Marshall Clow | 2017-06-21 | 1 | -3/+3
  Fixes https://bugs.llvm.org/show_bug.cgi?id=33358
  llvm-svn: 305924
* [AMDGPU][MC][GFX9] Corrected VOP3P relevant code to fix disassembler failures | Dmitry Preobrazhensky | 2017-06-21 | 5 | -11/+1758
  See Bug 33509: https://bugs.llvm.org//show_bug.cgi?id=33509
  Reviewers: Sam Kolton, Artem Tamazov, Valery Pykhtin
  Differential Revision: https://reviews.llvm.org/D34360
  llvm-svn: 305923
* [sanitizer] Add a function to gather random bytes | Kostya Kortchinsky | 2017-06-21 | 5 | -0/+57
  Summary: AFAICT compiler-rt doesn't have a function that would return 'good' random bytes to seed a PRNG. Currently, the `SizeClassAllocator64` uses addresses returned by `mmap` to seed its PRNG, which is not ideal, and `SizeClassAllocator32` doesn't benefit from the entropy offered by its 64-bit counterpart address space, so right now it has nothing. This function aims at solving this, making it possible to implement good 32-bit chunk randomization. Scudo also has a function that does this for Cookie purposes, which would go away in a later CL once this lands.
  This function will try the `getrandom` syscall if available, and fall back to `/dev/urandom` if not.
  Unfortunately, I do not have a way to implement and test a Mac and Windows version, so those are unimplemented as of now. Note that `kRandomShuffleChunks` is only used on Linux for now.
  Reviewers: alekseyshl
  Reviewed By: alekseyshl
  Subscribers: zturner, rnk, llvm-commits, kubamracek
  Differential Revision: https://reviews.llvm.org/D34412
  llvm-svn: 305922
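
The try-`getrandom`-then-fall-back pattern described above looks roughly like this at user level. This is a sketch and not the compiler-rt implementation: `get_random_bytes` is a hypothetical name, and it uses Python's `os.getrandom`, which is only present on Linux builds of Python.

```python
import os


def get_random_bytes(length):
    """Return `length` random bytes, preferring getrandom over /dev/urandom."""
    getrandom = getattr(os, "getrandom", None)  # absent on non-Linux platforms
    if getrandom is not None:
        try:
            data = getrandom(length)
            if len(data) == length:
                return data
        except OSError:
            pass  # e.g. the kernel lacks the syscall; fall through to the file
    # Fallback path: read from the urandom device.
    with open("/dev/urandom", "rb") as f:
        return f.read(length)


seed = get_random_bytes(16)
```

Like the commit, this sketch has no Windows path; a platform with neither source would raise when opening `/dev/urandom`.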
* [DAG] Move BaseIndexOffset into separate Library. NFC. | Nirav Dave | 2017-06-21 | 4 | -114/+161
  Move BaseIndexOffset analysis out of DAGCombiner for use in other files.
  llvm-svn: 305921
* Implement the --exclude-libs option. | Rui Ueyama | 2017-06-21 | 8 | -7/+92
  The --exclude-libs option is not a popular option, but at least some programs in Android depend on it, so it's worth supporting.
  Differential Revision: https://reviews.llvm.org/D34422
  llvm-svn: 305920
* ClangFormat some changes from r305226 | David Blaikie | 2017-06-21 | 1 | -2/+4
  Post commit review feedback from Justin Bogner
  llvm-svn: 305919
* [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics. | Christof Douma | 2017-06-21 | 1 | -0/+683
  Added test file for ARMv8.1 LSE Atomics that I forgot to include in commit r305893.
  Patch by Ananth Jasty.
  Differential Revision: https://reviews.llvm.org/D33586
  Change-Id: Ic1ad8ed87c1b584c4c791b459a686c866a3c3087
  llvm-svn: 305918
* [DAG] Remove Node construction from BaseIndexOffset match. NFCI. | Nirav Dave | 2017-06-21 | 1 | -52/+69
  Move GlobalAddress Offset decomposition from the initial match into the comparison check, removing the possibility of constructing a new offset global address when examining addresses.
  llvm-svn: 305917
* [X86][SSE] Dropped -mcpu from 256-bit vector shuffle tests | Simon Pilgrim | 2017-06-21 | 4 | -20/+12
  Use triple and attribute only for consistency
  llvm-svn: 305916
* [AMDGPU][MC] Corrected V_*QSAD* instructions to check that dest register is different than any of the src | Dmitry Preobrazhensky | 2017-06-21 | 6 | -14/+117
  See Bug 33279: https://bugs.llvm.org//show_bug.cgi?id=33279
  Reviewers: artem.tamazov, vpykhtin
  Differential Revision: https://reviews.llvm.org/D34003
  llvm-svn: 305915
* [x86] fix formatting; NFC | Sanjay Patel | 2017-06-21 | 1 | -15/+13
  llvm-svn: 305914
* [X86][SSE] Dropped -mcpu from 128-bit vector shuffle tests | Simon Pilgrim | 2017-06-21 | 4 | -38/+26
  Use triple and attribute only for consistency
  llvm-svn: 305913
* clang-format: introduce InlineOnly short function style | Francois Ferrand | 2017-06-21 | 5 | -4/+69
  Summary: This is the same as Inline, except it does not imply all empty functions are merged: with this style, empty functions are merged only if they also match the 'inline' criteria (i.e. defined in a class).
  This is helpful to avoid inlining functions in implementation files.
  Reviewers: djasper, krasimir
  Reviewed By: djasper
  Subscribers: klimek, rengolin, cfe-commits
  Differential Revision: https://reviews.llvm.org/D34399
  llvm-svn: 305912