summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [DAGCombine] Move AND nodes to multiple load leavesSam Parker2017-12-141-0/+124
| | | | | | | | | | | | | | | | | | | | | | | | Recommitting rL319773, which was reverted due to a recursive issue causing timeouts. This happened because I failed to check whether the discovered loads could be narrowed further. In the case of a tree with one or more narrow loads, that could not be further narrowed, as well as a node that would need masking, an AND could be introduced which could then be visited and recombined again with the same load. This could again create the masking load, with would be combined again... We now check that the load can be narrowed so that this process stops. Original commit message: Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the loads(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320679
* [X86] Make ANY_EXTEND from vXi1 Custom for more types.Craig Topper2017-12-141-0/+6
| | | | | | We should be able to support ANY_EXTEND for any types we support ZERO_EXTEND for. llvm-svn: 320675
* [SelectionDAG][X86] Improve legalization of v32i1 CONCAT_VECTORS of v16i1 ↵Craig Topper2017-12-141-0/+15
| | | | | | | | | | for AVX512F. A v32i1 CONCAT_VECTORS of v16i1 uses promotion to v32i8 to legalize the v32i1. This results in a bunch of extract_vector_elts and a build_vector that ultimately gets scalarized. This patch checks to see if v16i8 is legal and inserts a any_extend to that so that we can concat v16i8 to v32i8 and avoid creating the extracts. llvm-svn: 320674
* [X86] Remove redundant setOperationAction calls.Craig Topper2017-12-141-2/+0
| | | | | | These calls already exist earlier under AVX2 feature. llvm-svn: 320673
* [LV] Support efficient vectorization of an induction with redundant castsDorit Nuzman2017-12-143-20/+257
| | | | | | | | | | | | | | | | | | | | | | | | | | | | D30041 extended SCEVPredicateRewriter to improve handling of Phi nodes whose update chain involves casts; PSCEV can now build an AddRecurrence for some forms of such phi nodes, under the proper runtime overflow test. This means that we can identify such phi nodes as an induction, and the loop-vectorizer can now vectorize such inductions, however inefficiently. The vectorizer doesn't know that it can ignore the casts, and so it vectorizes them. This patch records the casts in the InductionDescriptor, so that they could be marked to be ignored for cost calculation (we use VecValuesToIgnore for that) and ignored for vectorization/widening/scalarization (i.e. treated as TriviallyDead). In addition to marking all these casts to be ignored, we also need to make sure that each cast is mapped to the right vector value in the vector loop body (be it a widened, vectorized, or scalarized induction). So whenever an induction phi is mapped to a vector value (during vectorization/widening/ scalarization), we also map the respective cast instruction (if exists) to that vector value. (If the phi-update sequence of an induction involves more than one cast, then the above mapping to vector value is relevant only for the last cast of the sequence as we allow only the "last cast" to be used outside the induction update chain itself). This is the last step in addressing PR30654. llvm-svn: 320672
* [SelectionDAG] When legalizing the result type of CONCAT_VECTORS, take into ↵Craig Topper2017-12-141-3/+8
| | | | | | | | | | account whether the input type also needs to be promoted. If so go ahead and get the promoted input vector to extract from. Previously, we would create a bunch of any_extends of extract_vector_elts with illegal input type that needs to be promoted. The legalization of those extract_vector_elts would then potentially introduce a truncate. So now we have a bunch of any_extends of truncates. By legalizing both parts together we avoid creating these extra nodes. The test changes seem to be because we were previously combining the build_vector with the any_extend before the any_extend got combined with the truncate. llvm-svn: 320669
* MC/AsmPrinter: Reduce code duplication.Matthias Braun2017-12-143-42/+33
| | | | | | | | | | Factor out duplicated code emitting mach-o version-min specifiers. This should be NFC but happens to fix a bug where the code in MCMachoStreamer didn't take the version skew between darwin and macos versions into account. llvm-svn: 320666
* MC: Add support for mach-o build_versionMatthias Braun2017-12-145-81/+219
| | | | | | | | LC_BUILD_VERSION is a new load command superseding the previously used LC_XXX_MIN_VERSION commands. This adds an assembler directive along with encoding/streaming support. llvm-svn: 320661
* Recommit r320461 "[X86] Use regular expressions more aggressively to reduce ↵Craig Topper2017-12-134-1032/+48
| | | | | | | | | | | | | | the number of scheduler entries needed for FMA3 instructions." I've hopefully sidestepped the MSVC issue that caused it to be reverted. We no longer include the Sched enum from X86GenInstrInfo.inc on the X86 target. So hopefully MSVC's preprocessor will skip over it and nothing will notice the 11000 character enum name. Original commit message: When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320655
* [EarlyCSE] recognize swapped variants of abs/nabs as equivalentSanjay Patel2017-12-131-9/+12
| | | | | | | | Extends https://reviews.llvm.org/rL320640 Differential Revision: https://reviews.llvm.org/D41136 llvm-svn: 320653
* CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.valueYaxun Liu2017-12-133-11/+19
| | | | | | | | | | | | | | | | | Two issues were found about machine inst scheduler when compiling ProRender with -g for amdgcn target: GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it should not since DBG_VALUE is not mapped in LiveIntervals. when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D41132 llvm-svn: 320650
* [CodeView] Teach clang to emit the .debug$H COFF section.Zachary Turner2017-12-136-13/+186
| | | | | | | | | | | | | | | Currently this is an LLVM extension to the COFF spec which is experimental and intended to speed up linking. For now it is behind a hidden cl::opt flag, but in the future we can move it to a "real" cc1 flag and have the driver pass it through whenever it is appropriate. The patch to actually make use of this section in lld will come in a followup. Differential Revision: https://reviews.llvm.org/D40917 llvm-svn: 320649
* Recover some overzealously removed includes.Michael Zolotukhin2017-12-134-0/+4
| | | | llvm-svn: 320648
* Speculative build fix for lld on Linux after Michael's #include removalsHans Wennborg2017-12-131-0/+1
| | | | llvm-svn: 320645
* [WebAssembly] Use bitfield types in wasm YAML representationSam Clegg2017-12-131-0/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D41202 llvm-svn: 320642
* Reverting [JumpThreading] Preservation of DT and LVI across the passBrian M. Rzycki2017-12-135-365/+89
| | | | | | | Stage 2 bootstrap failed: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/14434 llvm-svn: 320641
* [EarlyCSE] recognize commuted and swapped variants of min/max as equivalent ↵Sanjay Patel2017-12-131-0/+27
| | | | | | | | | | | | | (PR35642) As shown in: https://bugs.llvm.org/show_bug.cgi?id=35642 ...we can have different forms of min/max, so we should recognize those here in EarlyCSE similar to how we already handle binops and compares that can commute. Differential Revision: https://reviews.llvm.org/D41136 llvm-svn: 320640
* Remove redundant includes from lib/Target/X86.Michael Zolotukhin2017-12-1312-31/+0
| | | | llvm-svn: 320636
* Remove redundant includes from lib/Target/ARM.Michael Zolotukhin2017-12-138-17/+0
| | | | llvm-svn: 320635
* Remove redundant includes from lib/Target/AArch64.Michael Zolotukhin2017-12-137-13/+0
| | | | llvm-svn: 320634
* Remove redundant includes from lib/Target/*.cpp.Michael Zolotukhin2017-12-133-5/+0
| | | | llvm-svn: 320633
* Remove redundant includes from various places.Michael Zolotukhin2017-12-134-7/+0
| | | | llvm-svn: 320629
* Remove redundant includes from lib/Transforms.Michael Zolotukhin2017-12-1331-48/+0
| | | | llvm-svn: 320628
* Remove redundant includes from lib/Support.Michael Zolotukhin2017-12-136-6/+0
| | | | llvm-svn: 320627
* Remove redundant includes from lib/ProfileData.Michael Zolotukhin2017-12-132-9/+0
| | | | llvm-svn: 320626
* Remove redundant includes from lib/Object.Michael Zolotukhin2017-12-133-13/+0
| | | | llvm-svn: 320625
* Remove redundant includes from lib/MC.Michael Zolotukhin2017-12-1310-19/+0
| | | | llvm-svn: 320624
* Remove redundant includes from lib/LTO.Michael Zolotukhin2017-12-133-9/+0
| | | | llvm-svn: 320623
* Remove redundant includes from lib/IR.Michael Zolotukhin2017-12-139-12/+0
| | | | llvm-svn: 320622
* Remove redundant includes from lib/ExecutionEngine.Michael Zolotukhin2017-12-133-4/+0
| | | | llvm-svn: 320621
* Remove redundant includes from lib/DebugInfo.Michael Zolotukhin2017-12-1324-33/+0
| | | | llvm-svn: 320620
* Remove redundant includes from lib/CodeGen.Michael Zolotukhin2017-12-1332-56/+0
| | | | llvm-svn: 320619
* Remove redundant includes from lib/Bitcode.Michael Zolotukhin2017-12-132-5/+0
| | | | llvm-svn: 320618
* Remove redundant includes from lib/Analysis.Michael Zolotukhin2017-12-1312-14/+0
| | | | llvm-svn: 320617
* AMDGPU: Partially fix disassembly of MIMG instructionsMatt Arsenault2017-12-138-78/+128
| | | | | | | | | | | | | | | | | | | | | Stores failed to decode at all since they didn't have a DecoderNamespace set. Loads worked, but did not change the register width displayed to match the numbmer of enabled channels. The number of printed registers for vaddr is still wrong, but I don't think that's encoded in the instruction so there's not much we can do about that. Image atomics are still broken. MIMG is the same encoding for SI/VI, but the image atomic classes are split up into encoding specific versions unlike every other MIMG instruction. They have isAsmParserOnly set on them for some reason. dmask is also special for these, so we probably should not have it as an explicit operand as it is now. llvm-svn: 320614
* [JumpThreading] Preservation of DT and LVI across the passBrian M. Rzycki2017-12-135-89/+365
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perfom the preversation was minimally altered and was simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements. One example is loop boundary threading. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 llvm-svn: 320612
* [GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases loadAditya Kumar2017-12-131-2/+2
| | | | | | | | | | | | | | w.r.t. the paper "A Practical Improvement to the Partial Redundancy Elimination in SSA Form" (https://sites.google.com/site/jongsoopark/home/ssapre.pdf) Proper dominance check was missing here, so having a loopinfo should not be required. Committing this diff as this fixes the bug, if there are further concerns, I'll be happy to work on them. Differential Revision: https://reviews.llvm.org/D39781 llvm-svn: 320607
* Ignore metainstructions during the shrink wrap analysisAdrian Prantl2017-12-131-0/+4
| | | | | | | | | | Shrink wrapping should ignore DBG_VALUEs referring to frame indices, since the presence of debug information must not affect code generation. Differential Revision: https://reviews.llvm.org/D41187 llvm-svn: 320606
* Fix link failure on one build bot introduced by r320584.Nemanja Ivanovic2017-12-131-1/+3
| | | | llvm-svn: 320589
* Reverted r320229. It broke tests on builder ↵Galina Kistanova2017-12-131-118/+4
| | | | | | llvm-clang-x86_64-expensive-checks-win. llvm-svn: 320588
* [PowerPC] MachineSSA pass to reduce the number of CR-logical operationsNemanja Ivanovic2017-12-135-0/+740
| | | | | | | | | | | | | | The initial implementation of an MI SSA pass to reduce cr-logical operations. Currently, the only operations handled by the pass are binary operations where both CR-inputs come from the same block and the single use is a conditional branch (also in the same block). Committing this off by default to allow for a period of field testing. Will enable it by default in a follow-up patch soon. Differential Revision: https://reviews.llvm.org/D30431 llvm-svn: 320584
* [X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule testsSimon Pilgrim2017-12-133-1/+4
| | | | | | Add missing RDTSCP itinerary llvm-svn: 320581
* [RISCV] Define sfence.vma InstAliases to match the GNU RISC-V toolsAlex Bradbury2017-12-131-0/+3
| | | | | | | | Unfortunately these aren't defined explicitly in the privileged spec, but the GNU assembler does accept `sfence.vma` and `sfence.vma rs` as well as the usual `sfence.vma rs, rt`. llvm-svn: 320575
* [FuzzMutate] Only generate loads and stores to the first class sized typesIgor Laevsky2017-12-131-1/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D41109 llvm-svn: 320573
* [FuzzMutate] Correctly split landingpad blocksIgor Laevsky2017-12-131-2/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D41112 llvm-svn: 320571
* [X86][SSE] MOVMSK only uses the sign bit from each vector elementSimon Pilgrim2017-12-131-0/+22
| | | | | | | | | | Pass the input vector through SimplifyDemandedBits as we only need the sign bit from each vector element of MOVMSK We'd probably get more hits if SimplifyDemandedBits was better at handling vectors... Differential Revision: https://reviews.llvm.org/D41119 llvm-svn: 320570
* [RISCV] Implement floating point assembler pseudo instructionsAlex Bradbury2017-12-132-0/+46
| | | | | | | | | | | | | | | Adds the assembler aliases for the floating point instructions which can be mapped to a single canonical instruction. The missing pseudo instructions (flw, fld, fsw, fsd) are marked as TODO. Other things, like for example PCREL_LO, have to be implemented first. This patch builds upon D40902. Differential Revision: https://reviews.llvm.org/D41071 Patch by Mario Werner. llvm-svn: 320569
* Reintroduce r320049, r320014 and r319894.Igor Laevsky2017-12-132-0/+32
| | | | | | OpenGL issues should be fixed by now. llvm-svn: 320568
* [DAG] Promote ADDCARRY / SUBCARRYRoger Ferrer Ibanez2017-12-131-1/+24
| | | | | | | | Add missing case that was not implemented yet. Differential Revision: https://reviews.llvm.org/D38942 llvm-svn: 320567
* [CodeGen] Print jump-table index operands as %jump-table.0 in both MIR and ↵Francis Visoiu Mistrih2017-12-133-6/+9
| | | | | | | | | | debug output Work towards the unification of MIR and debug output by printing `%jump-table.0` instead of `<jt#0>`. Only debug syntax is affected. llvm-svn: 320566
OpenPOWER on IntegriCloud