summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAG] As StoreMerge now generates only legal nodes remove unecessary guard ↵Nirav Dave2017-06-151-4/+2
| | | | | | when run post-legalization NFCI. llvm-svn: 305477
* Remove trailing whitespace. NFCI.Simon Pilgrim2017-06-151-1/+1
| | | | llvm-svn: 305476
* [DAG] Defer Pre/Post IndexStore merge to after mergestore. NFCI.Nirav Dave2017-06-151-4/+4
| | | | | | | | In preparation for doing storemerge post-legalization, reorder visitSTORE passes to move pre/post-index combining after store merge. Reordered passes other than store merge are unaffected. llvm-svn: 305473
* [X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was ↵Simon Pilgrim2017-06-151-3/+4
| | | | | | | | | | assuming v8i16 vectors We can use this with v16i16/v32i16 as well. Found during fuzz testing. llvm-svn: 305472
* [AArch64] Add indexed check to splitStores. NFC.Nirav Dave2017-06-151-1/+1
| | | | | | | Add explicit check for unhandled cases in preparation for delaying splitStores to post-legalization. llvm-svn: 305471
* Revert r305465: [X86][AVX512] Improve lowering of AVX512 compare intrinsics ↵Simon Pilgrim2017-06-152-715/+25
| | | | | | | | (remove redundant shift left+right instructions). This is causing windows buildbot failures llvm-svn: 305470
* [DAG] Allow truncated and extend memory operations in Store Merge. NFCI.Nirav Dave2017-06-151-10/+21
| | | | | | | | As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. llvm-svn: 305468
* [DAG] Make MergeStores generate legalized stores. NFCI.Nirav Dave2017-06-151-4/+21
| | | | | | | Realized merged stores as truncstores if store will be realized as such by legalization. llvm-svn: 305467
* [DAG] Use correct size for truncated store merge of load. NFCI.Nirav Dave2017-06-151-2/+2
| | | | | | | Avoid non-legal memory ops by checking correct size when merging stores of loads into a extload-truncstore pair. llvm-svn: 305466
* [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove ↵Ayman Musa2017-06-152-25/+715
| | | | | | | | | | | | | | | redundant shift left+right instructions). AVX512 compare instructions return v*i1 types. In cases where the number of elements in the returned value are less than 8, clang adds zeroes to get a mask of v8i1 type. Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes. The fact that AVX512 compare instructions put the result in a k register and zeroes all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class. When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in instructions selection phase and transform it into one avx512 cmp instruction. Differential Revision: https://reviews.llvm.org/D33188 llvm-svn: 305465
* [ScalarEvolution] Apply Depth limit to getMulExprMax Kazantsev2017-06-152-60/+83
| | | | | | | | | | | | | | | | | | | | | This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463
* [ARM] GlobalISel: Add support for i32 moduloDiana Picus2017-06-152-17/+82
| | | | | | | | | | | | | | | | | | Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI. The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls. llvm-svn: 305459
* [ARM] GlobalISel: Lower only homogeneous struct argsDiana Picus2017-06-151-31/+24
| | | | | | | | | | | | | Lowering mixed struct args, params and returns used G_INSERT, which is a bit more convoluted to support through the entire pipeline. Since they don't occur that often in practice, it's probably wiser to leave them out until later. Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which has good support in the legalizer. These occur e.g. as the return of __aeabi_idivmod, so it's nice to be able to support them. llvm-svn: 305458
* [AArch64] Enable FeatureFuseAES for the generic processor model.Florian Hahn2017-06-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back gives a double digit speedup on benchmarks using those instructions on Cortex-A processors. In GCC, this optimization is part of the generic processor model as well. This change should not have a major performance impact on processors that do not optimize AES instruction pairs, although I only had access to Cortex-A processors for benchmarking. Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover Reviewed By: evandro Subscribers: sbaranga, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D33836 llvm-svn: 305457
* [mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SPZoran Jovanovic2017-06-151-12/+97
| | | | | | | | | | | | Author: milena.vujosevic.janicic Reviewers: sdardis The patch extends size reduction pass for MicroMIPS. The following instructions are examined and transformed, if possible: ADDIU instruction is transformed into 16-bit instruction ADDIUSP ADDIU instruction is transformed into 16-bit instruction ADDIUR1SP Differential Revision: https://reviews.llvm.org/D33887 llvm-svn: 305455
* Fixing section name for Darwin platforms for sanitizer coverageGeorge Karpenkov2017-06-142-2/+2
| | | | | | On Darwin, section names have a 16char length limit. llvm-svn: 305429
* IR: Tweak the API around adding modules to the summary index.Peter Collingbourne2017-06-141-15/+15
| | | | | | | | | | | | | The current name (addModulePath) and return value (ModulePathStringTableTy::iterator) is a little confusing. This API adds a module, not just a path. And the iterator is basically just an implementation detail of the summary index. Address both of those issues by renaming to addModule and introducing a ModuleSummaryIndex::ModuleInfo type that the function returns. Differential Revision: https://reviews.llvm.org/D34124 llvm-svn: 305422
* PredicateInfo: Don't insert conditional info when a conditional branch jumps ↵Daniel Berlin2017-06-141-0/+3
| | | | | | to the same target regardless of condition llvm-svn: 305416
* NewGVN: This is wrong by inspection, it will not cause an issue currently ↵Daniel Berlin2017-06-141-1/+1
| | | | | | due to other limitations, i believe. This also means i can't make a test for it. llvm-svn: 305415
* [x86] avoid unnecessary shuffle mask math in combineX86ShufflesRecursively()Sanjay Patel2017-06-141-6/+7
| | | | | | | | | | | | This is a follow-up to https://reviews.llvm.org/D34174 / https://reviews.llvm.org/rL305398. We mentioned replacing the multiplies with shifts, but the real win seems to be in bypassing the extra ops in the common case when the RootRatio and OpRatio are one. This gives us another 1-2% overall win for the test in PR32037: https://bugs.llvm.org/show_bug.cgi?id=32037 llvm-svn: 305414
* Allow -profile-guided-section-prefix more than onceDavid Callahan2017-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: At present, `-profile-guided-section-prefix` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same "false" value) which results in an error, after which compilation fails: ``` clang (LLVM option parsing): for the -profile-guided-section-prefix option: may only occur zero or one times! ``` While we work on improving our build system, it also seems reasonable just to allow `-profile-guided-section-prefix` to be passed more than once, by to `cl::ZeroOrMore`. Quoting [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | the documentation ]]: > The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times. > ... > If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained. Reviewers: danielcdh Reviewed By: danielcdh Subscribers: twoh, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D34219 llvm-svn: 305413
* [EarlyCSE] Make PhiToCheck in removeMSSA() a set.Davide Italiano2017-06-141-2/+3
| | | | | | | | | | | This way we end up not looking at PHI args already removed. MemSSA now goes through the updater so we can prune it to avoid having redundant MemoryPHI arguments, but that doesn't quite work for the general case. Discussed with Daniel Berlin, fixes PR33406. llvm-svn: 305409
* Hide dbgs() stream for when built with -fmodules.Frederich Munch2017-06-142-1/+16
| | | | | | | | | | | | | | Summary: Make DebugCounter::print and dump methods to be const correct. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34214 llvm-svn: 305408
* MC, Object: Reserve a section type, SHT_LLVM_ODRTAB, for the ODR table.Peter Collingbourne2017-06-144-0/+6
| | | | | | | | | | | | | | This is part of the ODR checker proposal: http://lists.llvm.org/pipermail/llvm-dev/2017-June/113820.html Per discussion on the gnu-gabi mailing list [1] the section type range 0x6fff4c00..0x6fff4cff is reserved for LLVM. [1] https://sourceware.org/ml/gnu-gabi/2017-q2/msg00030.html Differential Revision: https://reviews.llvm.org/D33978 llvm-svn: 305407
* Specified ReportError as noreturn friendly to old compilers.Galina Kistanova2017-06-141-9/+14
| | | | llvm-svn: 305405
* Test commit - NFC.Lei Huang2017-06-141-1/+1
| | | | | | Modified a comment to confirm commit access functionality. llvm-svn: 305402
* [ValueTracking] Correct early out in computeKnownBitsFromOperator to work ↵Craig Topper2017-06-141-1/+2
| | | | | | | | | | | | with non power of 2 bit widths There's an early out that's trying to detect when we don't know any bits that make up the legal range of a shift. The code subtracts one from BitWidth which creates a mask in the lower bits for power of 2 bit widths. This is then ANDed with the known bits to see if any of those bits are known. If the bit width isn't a power of 2 this creates a non-sensical mask. This patch corrects this by rounding up to a power of 2 before doing the subtract and mask. Differential Revision: https://reviews.llvm.org/D34165 llvm-svn: 305400
* [x86] replace div/rem with shift/mask for better shuffle combining perfSanjay Patel2017-06-141-13/+31
| | | | | | | | | | | | We know that shuffle masks are power-of-2 sizes, but there's no way (?) for LLVM to know that, so hack combineX86ShufflesRecursively() to be much faster by replacing div/rem with shift/mask. This makes the motivating compile-time bug in PR32037 ( https://bugs.llvm.org/show_bug.cgi?id=32037 ) about 9% faster overall. Differential Revision: https://reviews.llvm.org/D34174 llvm-svn: 305398
* [gtest] Create a shared include directory for gtest utilities.Zachary Turner2017-06-147-0/+78
| | | | | | | | | | | | | | | | | | | | | | | Many times unit tests for different libraries would like to use the same helper functions for checking common types of errors. This patch adds a common library with helpers for testing things in Support, and introduces helpers in here for integrating the llvm::Error and llvm::Expected<T> classes with gtest and gmock. Normally, we would just be able to write: EXPECT_THAT(someFunction(), succeeded()); but due to some quirks in llvm::Error's move semantics, gmock doesn't make this easy, so two macros EXPECT_THAT_ERROR() and EXPECT_THAT_EXPECTED() are introduced to gloss over the difficulties. Consider this an exception, and possibly only temporary as we look for ways to improve this. Differential Revision: https://reviews.llvm.org/D33059 llvm-svn: 305395
* Resubmit "[codeview] Make obj2yaml/yaml2obj support .debug$S..."Zachary Turner2017-06-1411-217/+277
| | | | | | | | | This was originally reverted because of some non-deterministic failures on certain buildbots. Luckily ASAN eventually caught this as a stack-use-after-scope, so the fix is included in this patch. llvm-svn: 305393
* Revert "[ARM] Support constant pools in data when generating execute-only code."Alexandros Lamprineas2017-06-143-43/+15
| | | | | | | | | | | This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0. I've upset a buildbot which runs the address sanitizer: ERROR: AddressSanitizer: stack-use-after-scope lib/Target/ARM/ARMISelLowering.cpp:2690 That Twine variable is used illegally. llvm-svn: 305390
* [mips] Fix multiprecision arithmetic.Simon Dardis2017-06-145-230/+204
| | | | | | | | | | | | | | | | | | | | | | | | | | For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This revolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 llvm-svn: 305389
* [ARM] Support constant pools in data when generating execute-only code.Alexandros Lamprineas2017-06-143-15/+43
| | | | | | | | | | | | | | The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correct with the position-independent addressing modes that support execute-only code. Differential Revision: https://reviews.llvm.org/D33773 llvm-svn: 305387
* Align definition of DW_OP_plus with DWARF spec [3/3]Florian Hahn2017-06-149-64/+133
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things. The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack. This is done in three stages: • The first patch (LLVM) adds support for DW_OP_plus_uconst. • The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst. • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions. Patch by Sander de Smalen. Reviewers: echristo, pcc, aprantl Reviewed By: aprantl Subscribers: fhahn, javed.absar, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D33894 llvm-svn: 305386
* [mips] Fix machine verifier errors in the long branch passSimon Dardis2017-06-141-6/+6
| | | | | | | | | | | | | | | This patch fixes two systemic machine verifier errors in the long branch pass. The first is the incorrect basic block successors and the second was the incorrect construction of several jump instructions. This partially resolves PR27458 and the associated PR32146. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33378 llvm-svn: 305382
* Revert r304907 as it is causing some failures that I cannot reproduce.Nemanja Ivanovic2017-06-141-26/+0
| | | | | | Reverting this until a test case can be provided to aid the investigation. llvm-svn: 305372
* Revert "[codeview] Make obj2yaml/yaml2obj support .debug$S..."Zachary Turner2017-06-1410-275/+205
| | | | | | | | This is causing failures on linux bots with an invalid stream read. It doesn't repro in any configuration on Windows, so reverting until I have a chance to investigate on Linux. llvm-svn: 305371
* Use make_shared instead of make_unique.Zachary Turner2017-06-141-2/+2
| | | | llvm-svn: 305369
* Fix some more errors.Zachary Turner2017-06-142-11/+11
| | | | llvm-svn: 305368
* [codeview] Make obj2yaml/yaml2obj support .debug$S/T sections.Zachary Turner2017-06-1410-198/+268
| | | | | | | | | | | | | | | This allows us to use yaml2obj and obj2yaml to round-trip CodeView symbol and type information without having to manually specify the bytes of the section. This makes for much easier to maintain tests. See the tests under lld/COFF in this patch for example. Before they just said SectionData: <blob> whereas now we can use meaningful record descriptions. Note that it still supports the SectionData yaml field, which could be useful for initializing a section to invalid bytes for testing, for example. Differential Revision: https://reviews.llvm.org/D34127 llvm-svn: 305366
* Support: Remove MSVC 2013 workarounds in ThreadPool class.Peter Collingbourne2017-06-141-16/+3
| | | | | | | | I have confirmed that these are no longer needed with MSVC 2015. Differential Revision: https://reviews.llvm.org/D34187 llvm-svn: 305347
* [libFuzzer] really restrict the new test to Linux (fails on Mac/Windows ↵Kostya Serebryany2017-06-141-1/+3
| | | | | | currently) llvm-svn: 305346
* Added partial verification for .apple_names accelerator table in ↵Spyridoula Gravani2017-06-143-0/+46
| | | | | | | | | | | llvm-dwarfdump output. This patch adds code which verifies that each bucket in the .apple_names accelerator table is either empty or has a valid hash index. Differential Revision: https://reviews.llvm.org/D34177 llvm-svn: 305344
* Reverted r305339 as MSVC is not happy with noreturn in lambda.Galina Kistanova2017-06-131-1/+2
| | | | llvm-svn: 305343
* [globalisel][legalizer] G_LOAD/G_STORE NarrowScalar should not emit G_GEP x, 0.Daniel Sanders2017-06-132-12/+33
| | | | | | | | | | | | | | | | | | | Summary: When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting %0 = G_CONSTANT ty 0 %1 = G_GEP %x, %0 since it's cheaper to not emit the redundant instructions than it is to fold them away later. Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls Reviewed By: qcolombet Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D32746 llvm-svn: 305340
* Specified LLVM_ATTRIBUTE_NORETURN for ReportError.Galina Kistanova2017-06-131-2/+1
| | | | llvm-svn: 305339
* [libFuzzer] restrict the new test to Linux (fails on Mac currently)Kostya Serebryany2017-06-131-0/+1
| | | | llvm-svn: 305335
* [libFuzzer] initial support of -fsanitize-coverage=inline-8bit-counters in ↵Kostya Serebryany2017-06-138-9/+83
| | | | | | libFuzzer. This is not fully functional yet, but simple tests work llvm-svn: 305331
* [AMDGPU] Remove now dead defaultOffsetS13(). NFCI.Davide Italiano2017-06-131-5/+0
| | | | | | Fixes the GCC7 build with -Werror. llvm-svn: 305329
* [InstrProf] Don't take the address of alwaysinline available_externally ↵Vedant Kumar2017-06-131-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | functions Doing so breaks compilation of the following C program (under -fprofile-instr-generate): __attribute__((always_inline)) inline int foo() { return 0; } int main() { return foo(); } At link time, we fail because taking the address of an available_externally function creates an undefined external reference, which the TU cannot provide. Emitting the function definition into the object file at all appears to be a violation of the langref: "Globals with 'available_externally' linkage are never emitted into the object file corresponding to the LLVM module." Differential Revision: https://reviews.llvm.org/D34134 llvm-svn: 305327
OpenPOWER on IntegriCloud