summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Set correct insertion point for selects generated while ↵Anna Thomas2017-06-161-0/+29
| | | | | | | | | | | | | | | | | | | | | | folding phis Summary: When we fold vector constants that are operands of phi's that feed into select, we need to set the correct insertion point for the *new* selects that get generated. The correct insertion point is the incoming block for the phi. Such cases can occur with patch r298845, which fixed folding of vector constants, but the new selects could be inserted incorrectly (as the added test case shows). Reviewers: majnemer, spatel, sanjoy Reviewed by: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34162 llvm-svn: 305591
* Change YAML traits for vector<string> to flow_vector.Evgeniy Stepanov2017-06-161-7/+2
| | | | | | | This is a workaround for an ODR conflict with the definition in AMDGPUCodeObjectMetadata.cpp. llvm-svn: 305584
* [GVN] Recommit the patch "Add phi-translate support in scalarpre".Wei Mi2017-06-163-4/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recommit fixes two bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 305578
* Revert "RegScavenging: Add scavengeRegisterBackwards()"Matthias Braun2017-06-1612-181/+117
| | | | | | | | | Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. llvm-svn: 305566
* [InstCombine] Add test cases to show missed opportunities due to overly ↵Craig Topper2017-06-162-0/+77
| | | | | | conservative single use checks. NFC llvm-svn: 305562
* bpf: avoid load from read-only sectionsYonghong Song2017-06-164-0/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | If users tried to have a structure decl/init code like below struct test_t t = { .memeber1 = 45 }; It is very likely that compiler will generate a readonly section to hold up the init values for variable t. Later load of t members, e.g., t.member1 will result in a read from readonly section. BPF program cannot handle relocation. This will force users to write: struct test_t t = {}; t.member1 = 45; This is just inconvenient and unintuitive. This patch addresses this issue by implementing BPF PreprocessISelDAG. For any load from a global constant structure or an global array of constant struct, it attempts to translate it into a constant directly. The traversal of the constant struct and other constant data structures are similar to where the assembler emits read-only sections. Four different unit test cases are also added to cover different scenarios. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305560
* [Atomics] Rename and change prototype for atomic memcpy intrinsicDaniel Neilson2017-06-165-61/+72
| | | | | | | | | | | | | | | | | | Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558
* Revert "[mips][microMIPS] Extending size reduction pass with ADDIUSP and ↵Simon Dardis2017-06-161-17/+0
| | | | | | | | | ADDIUR1SP" This reverts commit r305455. This commit was reported as breaking one of the sanitizer buildbots. Reverting until lab.llvm.org comes back online. llvm-svn: 305557
* [Hexagon] Don't kill live registers when creating mux out of tfrKrzysztof Parzyszek2017-06-161-0/+17
| | | | | | | | | The second part of r305300: when placing the mux at the later location, make sure that it won't use any register that was killed between the two original instructions. Remove any such kills and transfer them to the mux. llvm-svn: 305553
* [InstCombine] Fold (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) ↵Craig Topper2017-06-161-12/+8
| | | | | | | | | | | | | | | | == (K1 | K2) if K1 and K2 are a 1-bit mask Summary: This is the demorganed version of the case we already handle for the OR of iszero. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34244 llvm-svn: 305548
* [cfi] CFI-ICall for ThinLTO.Evgeniy Stepanov2017-06-165-0/+181
| | | | | | | | Implement ControlFlowIntegrity for indirect function calls in ThinLTO. Design follows the RFC in llvm-dev, see https://groups.google.com/d/msg/llvm-dev/MgUlaphu4Qc/kywu0AqjAQAJ llvm-svn: 305533
* [llvm-pdbutil] Add support for dumping lines and inlinee lines.Zachary Turner2017-06-151-0/+11
| | | | llvm-svn: 305529
* [llvm-pdbutil] Add back support for dumping file checksums.Zachary Turner2017-06-151-399/+371
| | | | | | When dumping module source files, also dump checksums. llvm-svn: 305526
* [llvm-pdbutil] Add back the ability to dump hashes and index offsets.Zachary Turner2017-06-151-90/+96
| | | | | | | This was regressed in a previous patch that re-wrote the dumper, and I'm incrementally adding back the pieces that are missing. llvm-svn: 305524
* Resubmit "[llvm-pdbutil] rewrite the "raw" output style."Zachary Turner2017-06-1510-3372/+1274
| | | | | | | | | This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7. It was broken due to some weird template issues, which have since been fixed. llvm-svn: 305517
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2017-06-1512-117/+181
| | | | | | | | | | | | | | | | | | Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305516
* [InstCombine] Add test cases to demonstrate instcombine increasing ↵Craig Topper2017-06-151-0/+237
| | | | | | instruction count when trying to fold (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y) when the pieces have multiple uses. llvm-svn: 305509
* Revert "[llvm-pdbutil] rewrite the "raw" output style."Zachary Turner2017-06-1510-1274/+3372
| | | | | | | | | This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad. This is failing due to some strange template problems, so reverting until it can be straightened out. llvm-svn: 305505
* [llvm-pdbutil] rewrite the "raw" output style.Zachary Turner2017-06-1510-3372/+1274
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After some internal discussions, we agreed that the raw output style had outlived its usefulness. It was originally created before we had even thought of dumping to YAML, and it was intended to give us some insight into the internals of a PDB file. Now we have YAML mode which does almost exactly this but is more powerful in that it can round-trip back to a PDB, which the raw mode could not do. So the raw mode had become purely a maintenance burden. One option was to just delete it. However, its original goal was to be as readable as possible while staying close to the "metal" - i.e. presenting the output in a way that maps directly to the underlying file format. We don't actually need that last requirement anymore since it's covered by the yaml mode, so we could repurpose "raw" mode to actually just be as readable as possible. This patch implements about 80% of the functionality previously in raw mode, but in a completely different style that is more akin to what cvdump outputs. Records are very compressed, often times appearing on just one line. One nice thing about this is that it makes full record matching easier, because you can grep for indices, names, and leaf types on a single line often. See the tests for some examples of what the new output looks like. Note that this patch actually regresses the functionality of raw mode in a few areas, but only because the patch was already unreasonably large and going 100% would have been even worse. Specifically, this patch is missing: The ability to dump module debug subsections (checksums, lines, etc) The ability to dump section headers Aside from that everything is here. While goign through the tests fixing them all up, I found many duplicate tests. They've been deleted. In subsequent patches I will go through and re-add the missing functionality. Differential Revision: https://reviews.llvm.org/D34191 llvm-svn: 305495
* DivergencyAnalysis patch for reviewAlexander Timofeev2017-06-152-0/+54
| | | | llvm-svn: 305494
* [InstCombine] Pre-commit test cases for the transform proposed in D34244.Craig Topper2017-06-151-0/+58
| | | | llvm-svn: 305492
* [MachineLICM] Hoist TOC-based address instructionsLei Huang2017-06-151-0/+110
| | | | | | | | | | | | | | | | | | Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
* [InstCombine] Handle (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != ↵Craig Topper2017-06-151-9/+4
| | | | | | | | | | | | (K1 | K2) when the one of the Ands is commuted relative to the other Currently we expect A to be on the same side in both Ands but nothing guarantees that. While there also switch to using matchers for some of the code. Differential Revision: https://reviews.llvm.org/D34230 llvm-svn: 305487
* ISel: Fix FastISel of swifterror valuesArnold Schwaighofer2017-06-153-0/+163
| | | | | | | | | | | | The code assumed that we process instructions in basic block order. FastISel processes instructions in reverse basic block order. We need to pre-assign virtual registers before selecting otherwise we get def-use relationships wrong. This only affects code with swifterror registers. rdar://32659327 llvm-svn: 305484
* [BasicAA] Add test case that goes with r305481.Craig Topper2017-06-151-0/+53
| | | | | | Forgot to 'git add' the file. llvm-svn: 305483
* Apply summary-based dead stripping to regular LTO modules with summaries.Peter Collingbourne2017-06-152-0/+53
| | | | | | | | | | | | | | | If a regular LTO module has a summary index, then instead of linking it into the combined regular LTO module right away, add it to the combined summary index and associate it with a special module that represents the combined regular LTO module. Any such modules are linked during LTO::run(), at which time we use the results of summary-based dead stripping to control whether to link prevailing symbols. Differential Revision: https://reviews.llvm.org/D33922 llvm-svn: 305482
* [PowerPC] fix potential verification errors on CFENCE8Hiroshi Inoue2017-06-153-12/+12
| | | | | | | | This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs. Differential Revision: https://reviews.llvm.org/D34208 llvm-svn: 305479
* [InstCombine] auto-generate complete checks; NFCSanjay Patel2017-06-151-50/+106
| | | | llvm-svn: 305474
* [X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was ↵Simon Pilgrim2017-06-151-0/+18
| | | | | | | | | | assuming v8i16 vectors We can use this with v16i16/v32i16 as well. Found during fuzz testing. llvm-svn: 305472
* Revert r305465: [X86][AVX512] Improve lowering of AVX512 compare intrinsics ↵Simon Pilgrim2017-06-154-13490/+50
| | | | | | | | (remove redundant shift left+right instructions). This is causing windows buildbot failures llvm-svn: 305470
* [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove ↵Ayman Musa2017-06-154-50/+13490
| | | | | | | | | | | | | | | redundant shift left+right instructions). AVX512 compare instructions return v*i1 types. In cases where the number of elements in the returned value are less than 8, clang adds zeroes to get a mask of v8i1 type. Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes. The fact that AVX512 compare instructions put the result in a k register and zeroes all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class. When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in instructions selection phase and transform it into one avx512 cmp instruction. Differential Revision: https://reviews.llvm.org/D33188 llvm-svn: 305465
* [ScalarEvolution] Apply Depth limit to getMulExprMax Kazantsev2017-06-151-0/+44
| | | | | | | | | | | | | | | | | | | | | This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463
* [ARM] GlobalISel: Add support for i32 moduloDiana Picus2017-06-152-0/+96
| | | | | | | | | | | | | | | | | | Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI. The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls. llvm-svn: 305459
* [ARM] GlobalISel: Lower only homogeneous struct argsDiana Picus2017-06-152-158/+45
| | | | | | | | | | | | | Lowering mixed struct args, params and returns used G_INSERT, which is a bit more convoluted to support through the entire pipeline. Since they don't occur that often in practice, it's probably wiser to leave them out until later. Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which has good support in the legalizer. These occur e.g. as the return of __aeabi_idivmod, so it's nice to be able to support them. llvm-svn: 305458
* [AArch64] Enable FeatureFuseAES for the generic processor model.Florian Hahn2017-06-151-36/+41
| | | | | | | | | | | | | | | | | | | | | | | Summary: Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back gives a double digit speedup on benchmarks using those instructions on Cortex-A processors. In GCC, this optimization is part of the generic processor model as well. This change should not have a major performance impact on processors that do not optimize AES instruction pairs, although I only had access to Cortex-A processors for benchmarking. Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover Reviewed By: evandro Subscribers: sbaranga, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D33836 llvm-svn: 305457
* [mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SPZoran Jovanovic2017-06-151-0/+17
| | | | | | | | | | | | Author: milena.vujosevic.janicic Reviewers: sdardis The patch extends size reduction pass for MicroMIPS. The following instructions are examined and transformed, if possible: ADDIU instruction is transformed into 16-bit instruction ADDIUSP ADDIU instruction is transformed into 16-bit instruction ADDIUR1SP Differential Revision: https://reviews.llvm.org/D33887 llvm-svn: 305455
* [InstCombine] Add a test case to show a case where don't handle a partially ↵Craig Topper2017-06-151-0/+27
| | | | | | commuted IR. NFC llvm-svn: 305438
* Removal of accidental duplication in test assembly file. NFC.Wolfgang Pieb2017-06-141-250/+0
| | | | llvm-svn: 305431
* Fixing section name for Darwin platforms for sanitizer coverageGeorge Karpenkov2017-06-141-1/+1
| | | | | | On Darwin, section names have a 16char length limit. llvm-svn: 305429
* PredicateInfo: Don't insert conditional info when a conditional branch jumps ↵Daniel Berlin2017-06-142-0/+161
| | | | | | to the same target regardless of condition llvm-svn: 305416
* [EarlyCSE] Make PhiToCheck in removeMSSA() a set.Davide Italiano2017-06-141-0/+26
| | | | | | | | | | | This way we end up not looking at PHI args already removed. MemSSA now goes through the updater so we can prune it to avoid having redundant MemoryPHI arguments, but that doesn't quite work for the general case. Discussed with Daniel Berlin, fixes PR33406. llvm-svn: 305409
* MC, Object: Reserve a section type, SHT_LLVM_ODRTAB, for the ODR table.Peter Collingbourne2017-06-141-0/+12
| | | | | | | | | | | | | | This is part of the ODR checker proposal: http://lists.llvm.org/pipermail/llvm-dev/2017-June/113820.html Per discussion on the gnu-gabi mailing list [1] the section type range 0x6fff4c00..0x6fff4cff is reserved for LLVM. [1] https://sourceware.org/ml/gnu-gabi/2017-q2/msg00030.html Differential Revision: https://reviews.llvm.org/D33978 llvm-svn: 305407
* [ValueTracking] Correct early out in computeKnownBitsFromOperator to work ↵Craig Topper2017-06-141-0/+10
| | | | | | | | | | | | with non power of 2 bit widths There's an early out that's trying to detect when we don't know any bits that make up the legal range of a shift. The code subtracts one from BitWidth which creates a mask in the lower bits for power of 2 bit widths. This is then ANDed with the known bits to see if any of those bits are known. If the bit width isn't a power of 2 this creates a non-sensical mask. This patch corrects this by rounding up to a power of 2 before doing the subtract and mask. Differential Revision: https://reviews.llvm.org/D34165 llvm-svn: 305400
* Revert "[ARM] Support constant pools in data when generating execute-only code."Alexandros Lamprineas2017-06-141-50/+0
| | | | | | | | | | | This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0. I've upset a buildbot which runs the address sanitizer: ERROR: AddressSanitizer: stack-use-after-scope lib/Target/ARM/ARMISelLowering.cpp:2690 That Twine variable is used illegally. llvm-svn: 305390
* [mips] Fix multiprecision arithmetic.Simon Dardis2017-06-146-233/+444
| | | | | | | | | | | | | | | | | | | | | | | | | | For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This revolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 llvm-svn: 305389
* [ARM] Support constant pools in data when generating execute-only code.Alexandros Lamprineas2017-06-141-0/+50
| | | | | | | | | | | | | | The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correct with the position-independent addressing modes that support execute-only code. Differential Revision: https://reviews.llvm.org/D33773 llvm-svn: 305387
* Align definition of DW_OP_plus with DWARF spec [3/3]Florian Hahn2017-06-1421-38/+54
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things. The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack. This is done in three stages: • The first patch (LLVM) adds support for DW_OP_plus_uconst. • The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst. • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions. Patch by Sander de Smalen. Reviewers: echristo, pcc, aprantl Reviewed By: aprantl Subscribers: fhahn, javed.absar, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D33894 llvm-svn: 305386
* [mips] Fix machine verifier errors in the long branch passSimon Dardis2017-06-141-20/+20
| | | | | | | | | | | | | | | This patch fixes two systemic machine verifier errors in the long branch pass. The first is the incorrect basic block successors and the second was the incorrect construction of several jump instructions. This partially resolves PR27458 and the associated PR32146. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33378 llvm-svn: 305382
* Revert r304907 as it is causing some failures that I cannot reproduce.Nemanja Ivanovic2017-06-145-567/+6
| | | | | | Reverting this until a test case can be provided to aid the investigation. llvm-svn: 305372
* Re-enable tests on power pc since the bug has been fixed.Eric Beckmann2017-06-142-8/+0
| | | | | | | | | | Summary: just flip them on. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34186 llvm-svn: 305345
OpenPOWER on IntegriCloud