bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	SafepointIRVerifier port to new Pass Manager	Fedor Sergeev	2019-03-31	3	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Straightforward port of SafepointIRVerifier pass to new Pass Manager framework. Fix By: skatkov Reviewed By: fedor.sergeev Differential Revision: https://reviews.llvm.org/D59825 This is a re-land of r357147/r357148 with LLVM_ENABLE_MODULES build fixed. Adding IR/SafepointIRVerifier.h into its own module. llvm-svn: 357361
*	[X86] Teach isel for RMW binops to handle negate	Craig Topper	2019-03-30	1	-2/+15
\| \| \| \| \| \| \| \|	Negate updates flags like a subtract. We should be able to use the flags from the RMW form of negate when we have (store (X86ISD::SUB 0, load A), A) Differential Revision: https://reviews.llvm.org/D60007 llvm-svn: 357353
*	[RISCV] Add codegen support for ilp32f, ilp32d, lp64f, and lp64d ("hard ↵	Alex Bradbury	2019-03-30	3	-17/+128
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	float") ABIs This patch adds support for the RISC-V hard float ABIs, building on top of rL355771, which added basic target-abi parsing and MC layer support. It also builds on some re-organisations and expansion of the upstream ABI and calling convention tests which were recently committed directly upstream. A number of aspects of the RISC-V hard float ABIs require frontend support (e.g. flattening of structs and passing int+fp for fp+fp structs in a pair of registers), and will be addressed in a Clang patch. As can be seen from the tests, it would be worthwhile extending RISCVMergeBaseOffsets to handle constant pool as well as global accesses. Differential Revision: https://reviews.llvm.org/D59357 llvm-svn: 357352
*	[X86][SSE] detectAVGPattern - Match zext(or(x,y)) 'add like' patterns (PR41316)	Simon Pilgrim	2019-03-30	1	-10/+21
\| \| \| \| \| \|	Fixes PR41316 where the expanded PAVG intrinsic had had one of its ADDs turned into an OR due to its operands having no conflicting bits. llvm-svn: 357351
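The fix above relies on an ADD being interchangeable with an OR when its operands share no set bits. A minimal Python sketch of that identity, checked over all 8-bit operand pairs; the helper names are illustrative assumptions and do not come from the LLVM patch:

```python
def is_add_like_or(x, y):
    """An OR behaves like an ADD when no bit is set in both operands."""
    return (x & y) == 0


def count_mismatches(bits=8):
    """Count operand pairs with disjoint bits where x | y differs from x + y."""
    limit = 1 << bits
    return sum(
        1
        for x in range(limit)
        for y in range(limit)
        if is_add_like_or(x, y) and (x | y) != (x + y)
    )
```

No carries can occur when the operands are bit-disjoint, which is why a pattern matcher that only looks for ADD nodes can miss an expanded average that was canonicalized to use OR.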
*	[X86][SSE] detectAVGPattern - begin generalizing ADD matches	Simon Pilgrim	2019-03-30	1	-4/+15
\| \| \| \| \| \|	Move the ADD matching into a helper - first NFC stage towards supporting 'ADD like' cases such as in PR41316 llvm-svn: 357349
*	[WebAssembly] Fix unwind destination mismatches in CFG stackify	Heejin Ahn	2019-03-30	2	-19/+516
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Linearizing the control flow by placing `try`/`end_try` markers can create mismatches in unwind destinations. This patch resolves these mismatches by wrapping those instructions with an incorrect unwind destination with a nested `try`/`catch`/`end_try` and branching to the right destination within the new catch block. Reviewers: dschuff Subscribers: sunfish, sbc100, jgravelle-google, chrib, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D48345 llvm-svn: 357343
*	[WebAssembly] Run ExplicitLocals pass after CFGStackify	Heejin Ahn	2019-03-30	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: While this does not change any final output, this will greatly simplify fixing unwind destination mismatches in CFGStackify (D48345), because we have to create some new registers there. Reviewers: dschuff Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59652 llvm-svn: 357342
*	[RISCV] Add DAGCombine for (SplitF64 (ConstantFP x))	Alex Bradbury	2019-03-30	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \|	The SplitF64 node is used on RV32D to convert an f64 directly to a pair of i32 (necessary as bitcasting to i64 isn't legal). When performed on a ConstantFP, this will result in a FP load from the constant pool followed by a store to the stack and two integer loads from the stack (necessary as there is no way to directly move between f64 FPRs and i32 GPRs on RV32D). It's always cheaper to just materialise integers for the lo and hi parts of the FP constant, so do that instead. llvm-svn: 357341
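To illustrate what "materialise integers for the lo and hi parts of the FP constant" means, here is a small Python sketch that splits the IEEE-754 bit pattern of a double into two 32-bit halves. It only models the arithmetic; the function name is an assumption and it is not the DAGCombine code itself:

```python
import struct


def split_f64(value):
    """Return the (lo, hi) 32-bit halves of a double's IEEE-754 bit pattern."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    lo = bits & 0xFFFFFFFF
    hi = bits >> 32
    return lo, hi
```

For example, `split_f64(1.0)` yields `(0x00000000, 0x3FF00000)`, two integer constants that can be built directly in GPRs without a constant-pool load and a round trip through the stack.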
*	Adds `-ftime-trace` option to clang that produces Chrome `chrome://tracing` ↵	Anton Afanasyev	2019-03-30	3	-0/+197
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	compatible JSON profiling output dumps. This change adds hierarchical "time trace" profiling blocks that can be visualized in Chrome, in a "flame chart" style. Each profiling block can have a "detail" string that for example indicates the file being processed, template name being instantiated, function being optimized etc. This is taken from GitHub PR: https://github.com/aras-p/llvm-project-20170507/pull/2 Patch by Aras Pranckevičius. Differential Revision: https://reviews.llvm.org/D58675 llvm-svn: 357340
*	[WebAssembly] Optimize the number of routing blocks in FixIrreducibleCFG	Heejin Ahn	2019-03-30	1	-17/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently we create a routing block to the dispatch block for every predecessor of every entry. So the total number of routing blocks created will be (# of preds) * (# of entries). But we don't need to do this: we need at most 2 routing blocks per loop entry, one for when the predecessor is inside the loop and one for when it is outside the loop. (We can't merge these into one because this would create another loop cycle between blocks inside and blocks outside.) This patch fixes this and creates at most 2 routing blocks per entry. This also renames variable `Split` to `Routing`, which I think is a bit clearer. Reviewers: kripken Subscribers: sunfish, dschuff, sbc100, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59462 llvm-svn: 357337
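The counting argument above can be sketched in Python. The data layout (a dict from entry name to a list of `(predecessor, is_inside_loop)` pairs) and both function names are assumptions made for illustration, not structures from FixIrreducibleCFG:

```python
def routing_blocks_before(preds_by_entry):
    """Old scheme: one routing block per predecessor of every entry."""
    return sum(len(preds) for preds in preds_by_entry.values())


def routing_blocks_after(preds_by_entry):
    """New scheme: per entry, at most one block for inside-loop
    predecessors and one for outside-loop predecessors."""
    return sum(
        len({inside for _, inside in preds})
        for preds in preds_by_entry.values()
    )


# Hypothetical loop with two entries and three predecessors each.
example = {
    "A": [("p1", True), ("p2", True), ("p3", False)],
    "B": [("p1", True), ("p4", False), ("p5", False)],
}
```

With this example the old scheme needs 6 routing blocks and the new one needs 4, and the new count can never exceed twice the number of entries.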
*	[Support] Implement is_local_impl with AIX mntctl	Hubert Tong	2019-03-29	1	-3/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: On AIX, we can determine whether a filesystem is remote using `mntctl`. If the information is not found, then claim that the file is remote (since that is the more restrictive case). Testing for the associated interface is restored with a modified version of the unit test from rL295768. Reviewers: jasonliu, xingxue Reviewed By: xingxue Subscribers: jsji, apaprocki, Hahnfeld, zturner, krytarowski, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58801 llvm-svn: 357333
*	[LoopPredication] Remove stale TODO	Philip Reames	2019-03-29	1	-2/+0
\| \| \| \|	llvm-svn: 357331
*	[LoopPredication] Use the builder's insertion point everywhere [NFC]	Philip Reames	2019-03-29	1	-11/+11
\| \| \| \|	llvm-svn: 357330
*	[MemorySSA] Temporary fix assert when reaching 0 limit.	Alina Sbirlea	2019-03-29	1	-2/+5
\| \| \| \|	llvm-svn: 357327
*	Try to fix buildbot error	Sanjoy Das	2019-03-29	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Error is: llvm/lib/Analysis/ScalarEvolution.cpp:3534:10: error: chosen constructor is explicit in copy-initialization return {UniqueSCEVs.FindNodeOrInsertPos(ID, IP), std::move(ID), IP}; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/bin/../lib/gcc/aarch64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here constexpr tuple(_UElements&&... __elements) ^ 1 error generated. llvm-svn: 357324
*	[WebAssembly] Add mutable globals feature	Thomas Lively	2019-03-29	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This feature is not actually used for anything in the WebAssembly backend, but adding it allows users to get it into the target features sections of their objects, which makes these objects future-compatible. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60013 llvm-svn: 357321
*	[SCEV] Check the cache in get{S\|U}MaxExpr before doing any work	Sanjoy Das	2019-03-29	1	-12/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This lets us avoid e.g. checking if A >=s B in getSMaxExpr(A, B) if we've already established that (A smax B) is the best we can do. Fixes PR41225. Reviewers: asbirlea Subscribers: mcrosier, jlebar, bixia, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60010 llvm-svn: 357320
*	[MemorySSA] Limit clobber walks.	Alina Sbirlea	2019-03-29	1	-21/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch limits all getClobberingMemoryAccess() walks to MaxCheckLimit. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59569 llvm-svn: 357319
*	[GlobalISel][AArch64] Add isel support for G_INSERT_VECTOR_ELT on v2s32s	Jessica Paquette	2019-03-29	2	-8/+46
\| \| \| \| \| \| \| \| \|	This adds support for v2s32 vector inserts, and updates the selection + regbankselect tests for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59910 llvm-svn: 357318
*	[X86] When using Win64 ABI, exit with error if SSE is disabled for varargs	Amara Emerson	2019-03-29	1	-0/+3
\| \| \| \| \| \| \| \|	We need XMM registers to handle varargs with the Win64 ABI. Before we would silently generate bad code resulting in an assertion failure elsewhere in the backend. llvm-svn: 357317
*	[MemorySSA] Don't optimize incomplete phis.	Alina Sbirlea	2019-03-29	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: MemoryPhis cannot be optimized out until they are complete. Resolves PR41254. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59966 llvm-svn: 357315
*	[DAGCombiner] Rewrite ImproveLifetimeNodeChain to avoid DAG loop.	Nirav Dave	2019-03-29	1	-8/+9
\| \| \| \| \| \|	Avoid EXPENSIVE_CHECK failure. NFCI. llvm-svn: 357309
*	[WebAssembly] Handle END_LOOP in unreachable BB in CFGStackify	Heejin Ahn	2019-03-29	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes crashes when a BB in which an END_LOOP is to be placed is unreachable and does not have any predecessors. Fixes PR41307. Reviewers: dschuff Subscribers: yurydelendik, sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60004 llvm-svn: 357303
*	AMDGPU: Remove dx10-clamp from subtarget features	Matt Arsenault	2019-03-29	11	-30/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without defining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. llvm-svn: 357302
*	[DAG] Avoid redundancy in StoreMerge TokenFactor generation.	Nirav Dave	2019-03-29	1	-2/+2
\| \| \| \| \| \| \|	Avoid generating redundant TokenFactor when all merged stores have the same chain. llvm-svn: 357299
*	[X86] Use cached OptForSize in X86ISelDAGToDAG.cpp instead of pulling it ↵	Craig Topper	2019-03-29	1	-2/+1
\| \| \| \| \| \|	from the function attribute. NFCI llvm-svn: 357297
*	[llvm][NFC] Factor out logic for getting incoming & back Loop edges	Mircea Trofin	2019-03-29	1	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: davidxl Reviewed By: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59967 llvm-svn: 357284
*	[DAGCombine] Prune unused nodes.	Nirav Dave	2019-03-29	1	-15/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Nodes that have no uses are eventually pruned when they are selected from the worklist. Record nodes newly added to the worklist or DAG and perform pruning after every combine attempt. Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight Reviewed By: jyknight Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58070 llvm-svn: 357283
*	[CodeGen] Refactor the option for the maximum jump table size	Evandro Menezes	2019-03-29	2	-4/+4
\| \| \| \| \| \| \|	Refactor the option `max-jump-table-size` to default to the maximum representable number. Essentially, NFC. llvm-svn: 357280
*	[DAG] Set up infrastructure to avoid smart constructor-based dangling nodes	Nirav Dave	2019-03-29	2	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations without fully pruning unused result values. This results in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter for the combiner to make sure such nodes have the chance of being pruned. This allows a number of additional peephole optimizations. Reviewers: efriedma, RKSimon, craig.topper, jyknight Reviewed By: jyknight Subscribers: msearles, jyknight, sdardis, nemanjai, javed.absar, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58068 llvm-svn: 357279
*	[InstCombine] move shuffle canonicalizations before other transforms	Sanjay Patel	2019-03-29	1	-30/+27
\| \| \| \| \| \| \| \| \|	This may not be NFC, but I'm not sure how to expose any diffs in tests. In theory, it should be slightly more efficient and possibly more profitable to do the canonicalizations (which can increase the undef elements in the mask) ahead of SimplifyDemandedVectorElts(). llvm-svn: 357272
*	[SLP] Add support for commutative icmp/fcmp predicates	Simon Pilgrim	2019-03-29	1	-14/+28
\| \| \| \| \| \| \| \| \| \|	For the cases where the icmp/fcmp predicate is commutative, use reorderInputsAccordingToOpcode to collect and commute the operands. This requires a helper to recognise commutativity in both general Instruction and CmpInst types - the CmpInst::isCommutative doesn't overload the Instruction::isCommutative method for reasons I'm not clear on (maybe because it's based on predicate not opcode?!?). Differential Revision: https://reviews.llvm.org/D59992 llvm-svn: 357266
*	[mips] Fix lowering a signed immediate for *.d MSA instructions	Simon Atanasyan	2019-03-29	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `lowerMSASplatImm` function zero-extends `i32` immediates while building the constant. If the target type is `i64`, a negative immediate loses its sign. As a result, for example, `__builtin_msa_ldi_d(-1)` is lowered to a series of instructions that loads the incorrect value 0xffffffff into the `$w0` register instead of a single `ldi.d $w0, -1` instruction. The fix zero-extends unsigned immediates and sign-extends signed immediates. Differential Revision: http://reviews.llvm.org/D59884 llvm-svn: 357264
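The difference between the two extensions is easy to see on the `-1` immediate from the example. The following Python sketch models `i32` to `i64` extension with plain integers; the helper names are assumptions and this is not the MIPS lowering code:

```python
def zero_extend(value, from_bits):
    """Keep the low from_bits bits and fill the upper bits with zeros."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits):
    """Replicate the sign bit of a from_bits-wide value up to to_bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)
```

Here `zero_extend(-1, 32)` gives `0xffffffff`, the incorrect value described above, while `sign_extend(-1, 32, 64)` gives `0xffffffffffffffff`, the 64-bit encoding of -1.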
*	[AMDGPU][MC] Corrected conversion rules for inlinable constants to match ↵	Dmitry Preobrazhensky	2019-03-29	1	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \|	rules for literals See bug 40806: https://bugs.llvm.org/show_bug.cgi?id=40806 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D59786 llvm-svn: 357262
*	[DAGCombiner] simplify shuffle of shuffle	Sanjay Patel	2019-03-29	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After investigating the examples from D59777 targeting an SSE4.1 machine, it looks like a very different problem due to how we map illegal types (256-bit in these cases). We're missing a shuffle simplification that maps elements of a vector back to a shuffled operand. We have a more general version of this transform in DAGCombiner::visitVECTOR_SHUFFLE(), but that generality means it is limited to patterns with a one-use constraint, and the examples here have 2 uses. We don't need any uses or legality limitations for a simplification (no new value is created). It looks like we miss this pattern in IR too. In one of the zext examples here, we have shuffle masks like this: Shuf0 = vector_shuffle<0,u,3,7,0,u,3,7> Shuf = vector_shuffle<4,u,6,7,u,u,u,u> ...so that's moving the high half of the 1st vector into the low half. But the high half of the 1st vector is already identical to the low half. Differential Revision: https://reviews.llvm.org/D59961 llvm-svn: 357258
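The simplification can be sketched by composing the two masks from the example and checking that every defined lane of the outer shuffle already holds the same source element in the inner shuffle. This Python model uses `None` for undef lanes and assumes the outer shuffle only reads from its first operand; the function names are illustrative and not from DAGCombiner:

```python
U = None  # undef lane


def compose(outer, inner):
    """Map each outer lane back to the source element the inner shuffle picked."""
    return [U if idx is U else inner[idx] for idx in outer]


def simplifies_to_operand(outer, inner):
    """True when the outer shuffle can be replaced by its shuffled operand:
    every defined outer lane selects the element the operand already has there."""
    composed = compose(outer, inner)
    return all(
        elt is U or elt == inner[lane]
        for lane, elt in enumerate(composed)
    )


shuf0 = [0, U, 3, 7, 0, U, 3, 7]
shuf = [4, U, 6, 7, U, U, U, U]
```

For the masks in the message, `compose(shuf, shuf0)` is `[0, U, 3, 7, U, U, U, U]`, which agrees with `shuf0` in every defined lane, so the outer shuffle is redundant and no new value has to be created.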
*	Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."	Florian Hahn	2019-03-29	2	-39/+54
\| \| \| \| \| \| \| \| \|	Updated to use DenseMap::insert instead of [] operator for insertion, to avoid a crash caused by epoch checks. This reverts commit 2b85de438326f9d27bc96dc934ec98b98abdb337. llvm-svn: 357257
*	[DAGCombine] Improve Lifetime node chains.	Nirav Dave	2019-03-29	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve both start and end lifetime nodes chain dependencies. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59795 llvm-svn: 357256
*	[DAGCombiner] fold sext into decrement	Sanjay Patel	2019-03-29	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a sibling to rL357178 that I noticed we'd hit if we chose an alternate transform in D59818. %z = zext i8 %x to i32 %dec = add i32 %z, -1 %r = sext i32 %dec to i64 => %z2 = zext i8 %x to i64 %r = add i64 %z2, -1 https://rise4fun.com/Alive/kPP The x86 vector diffs show a slight regression, so there's a chance that we should limit this and the previous transform to scalars. But given that we allowed vectors before, I'm matching that behavior here. We should change both transforms together if that's the right thing to do. llvm-svn: 357254
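Because the input is only an `i8`, the equivalence of the two IR sequences above can be checked exhaustively. The Python sketch below models the fixed-width operations with masks; the names `before` and `after` are assumptions used for illustration:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext32_to_64(value):
    """Sign-extend a 32-bit pattern to a 64-bit pattern."""
    value &= MASK32
    if value & (1 << 31):
        value -= 1 << 32
    return value & MASK64


def before(x):
    """zext i8 -> i32, add -1 in i32, sext i32 -> i64."""
    z = x & 0xFF
    dec = (z - 1) & MASK32
    return sext32_to_64(dec)


def after(x):
    """zext i8 -> i64, add -1 in i64."""
    z2 = x & 0xFF
    return (z2 - 1) & MASK64
```

The only interesting input is `x = 0`: the 32-bit decrement wraps to all ones, and sign extension turns that into the same all-ones 64-bit value that the widened decrement produces directly.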
*	Switch lowering: exploit unreachable fall-through when lowering case range ↵	Hans Wennborg	2019-03-29	2	-3/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cluster In the example below, we would previously emit two range checks, one for cases 1--3 and one for 4--6. This patch makes us exploit the fact that the fall-through is unreachable and only one range check is necessary. switch i32 %i, label %default [ i32 1, label %bb1 i32 2, label %bb1 i32 3, label %bb1 i32 4, label %bb2 i32 5, label %bb2 i32 6, label %bb2 ] default: unreachable llvm-svn: 357252
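A small Python model of the example shows why one range check is enough once the default destination is unreachable. Both functions are illustrative sketches of the emitted control flow, not the switch-lowering code:

```python
def lower_with_two_checks(i):
    """Previous lowering: one range check per case cluster."""
    if 1 <= i <= 3:
        return "bb1"
    if 4 <= i <= 6:
        return "bb2"
    return "default"


def lower_with_one_check(i):
    """New lowering: valid only because the default is unreachable,
    so i is assumed to be one of the listed cases 1 through 6."""
    return "bb1" if 1 <= i <= 3 else "bb2"
```

The two functions agree for every value the switch can legally see (1 through 6); they differ only for values that would reach the unreachable default.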
*	[AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodes	Dmitry Preobrazhensky	2019-03-29	1	-7/+7
\| \| \| \| \| \| \| \| \| \|	See bug 40917: https://bugs.llvm.org/show_bug.cgi?id=40917 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D59878 llvm-svn: 357249
*	[MCA] Add an experimental MicroOpQueue stage.	Andrea Di Biagio	2019-03-29	3	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds an experimental stage named MicroOpQueueStage. MicroOpQueueStage can be used to simulate a hardware micro-op queue (basically, a decoupling queue between 'decode' and 'dispatch'). Users can specify a queue size, as well as an optional MaxIPC (which - in the absence of a "Decoders" stage - can be used to simulate a different throughput from the decoders). This stage is added to the default pipeline between the EntryStage and the DispatchStage only if PipelineOption::MicroOpQueue is different than zero. By default, llvm-mca sets PipelineOption::MicroOpQueue to the value of hidden flag -micro-op-queue-size. Throughput from the decoder can be simulated via another hidden flag named -decoder-throughput. That flag allows us to quickly experiment with different frontend throughputs. For targets that declare a loop buffer, flag -decoder-throughput allows users to do multiple runs, each time simulating a different throughput from the decoders. This stage can/will be extended in future. For example, we could add a "buffer full" event to notify bottlenecks caused by backpressure. Flag -decoder-throughput would probably go away if in future we delegate to another stage (DecoderStage?) the simulation of a (potentially variable) throughput from the decoders. For now, flag -decoder-throughput is "good enough" to run some simple experiments. Differential Revision: https://reviews.llvm.org/D59928 llvm-svn: 357248
*	AMDGPU: Make sram-ecc off by default for Vega20	Konstantin Zhuravlyov	2019-03-29	1	-1/+0
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D59718 llvm-svn: 357247
*	[X86] Add X86TargetLowering::isCommutativeBinOp override.	Simon Pilgrim	2019-03-29	2	-0/+13
\| \| \| \| \| \|	We currently just have test coverage for PMULUDQ - will add more in the future. llvm-svn: 357244
*	[SLP] Add support for swapping icmp/fcmp predicates to permit vectorization	Simon Pilgrim	2019-03-29	1	-9/+17
\| \| \| \| \| \| \| \|	We should be able to match elements with the swapped predicate as well - as long as we commute the source operands. Differential Revision: https://reviews.llvm.org/D59956 llvm-svn: 357243
*	[PowerPC] Add the support for __builtin_setrnd()	Kang Zhang	2019-03-29	2	-0/+140
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode. double __builtin_setrnd(int mode); The effective values for mode are: 0 - round to nearest 1 - round to zero 2 - round to +infinity 3 - round to -infinity Note that the mode argument is taken modulo 4, so if the int argument is greater than 3, only the least significant two bits of the mode are used. For example, __builtin_setrnd(102) is equivalent to __builtin_setrnd(2). Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D59405 llvm-svn: 357241
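The rule that only the two least significant bits of the argument matter can be modelled in a few lines of Python. This is just the arithmetic described in the message; `effective_mode` and `ROUNDING_MODES` are hypothetical names and nothing here touches the real builtin:

```python
ROUNDING_MODES = {
    0: "round to nearest",
    1: "round to zero",
    2: "round to +infinity",
    3: "round to -infinity",
}


def effective_mode(mode):
    """Return the rounding mode selected by the two low bits of mode."""
    return mode & 3
```

For instance, 102 is `0b1100110`, whose two low bits are `10`, so `effective_mode(102)` is 2 and selects "round to +infinity".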
*	[ScheduleDAG] Move `Topo` and `addEdge` to base class.	Clement Courbet	2019-03-29	4	-37/+29
\| \| \| \| \| \| \| \| \|	Some DAG mutations can only be applied to `ScheduleDAGMI`, and have to internally cast a `ScheduleDAGInstrs` to `ScheduleDAGMI`. There is nothing actually specific to `ScheduleDAGMI` in `Topo`. llvm-svn: 357239
*	AMDGPU/GlobalISel: Insert waterfall loop for vector indexing	Matt Arsenault	2019-03-29	2	-0/+174
\| \| \| \| \| \| \| \|	The register index can only really be an SGPR. Lie that a VGPR index is legal, and then rewrite the instruction in a waterfall loop to handle the index. llvm-svn: 357235
*	[PowerPC] Strength reduction of multiply by a constant by shift and add/sub ↵	Zi Xuan Wu	2019-03-29	2	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in place A shift and add/sub sequence combination is faster in place of a multiply by constant. Because the cycle or latency of multiply is not huge, we only consider such following worthy patterns. ``` (mul x, 2^N + 1) => (add (shl x, N), x) (mul x, -(2^N + 1)) => -(add (shl x, N), x) (mul x, 2^N - 1) => (sub (shl x, N), x) (mul x, -(2^N - 1)) => (sub x, (shl x, N)) ``` And the cycles or latency is subtarget-dependent so that we need consider the subtarget to determine to do or not do such transformation. Also data type is considered for different cycles or latency to do multiply. Differential Revision: https://reviews.llvm.org/D58950 llvm-svn: 357233
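The four rewrite patterns listed above are plain algebraic identities, which the following Python sketch spells out. It uses unbounded Python integers, so it ignores fixed-width wraparound and any subtarget cost model; the function names are assumptions for illustration:

```python
def mul_by_pow2_plus_1(x, n):
    """(mul x, 2^N + 1) => (add (shl x, N), x)"""
    return (x << n) + x


def mul_by_neg_pow2_plus_1(x, n):
    """(mul x, -(2^N + 1)) => -(add (shl x, N), x)"""
    return -((x << n) + x)


def mul_by_pow2_minus_1(x, n):
    """(mul x, 2^N - 1) => (sub (shl x, N), x)"""
    return (x << n) - x


def mul_by_neg_pow2_minus_1(x, n):
    """(mul x, -(2^N - 1)) => (sub x, (shl x, N))"""
    return x - (x << n)
```

As an example, multiplying by 9 becomes `(x << 3) + x` and multiplying by -7 becomes `x - (x << 3)`.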
*	Revert Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."	Florian Hahn	2019-03-29	2	-54/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Another buildbot failure http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20402 clang-9: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/include/llvm/ADT/DenseMap.h:1228: llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type* llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::operator->() const [with KeyT = const llvm::Instruction; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<const llvm::Instruction>; Bucket = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; bool IsConst = false; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::pointer = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>]: Assertion `isHandleInSync() && "invalid iterator access!"' failed. 0. 
Program arguments: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name ArchiveCommandLine.cpp -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu skylake-avx512 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -coverage-notes-file /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip/Output/ArchiveCommandLine.llvm.gcno -resource-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0 -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/include -I ../../../include -D _GNU_SOURCE -D __STDC_LIMIT_MACROS -D NDEBUG -D BREAK_HANDLER -D UNICODE -D _UNICODE -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/C -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/myWindows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/include_windows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP -I . 
-D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D NDEBUG -D _REENTRANT -D ENV_UNIX -D _7ZIP_LARGE_PAGES -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward -internal-isystem /usr/local/include -internal-isystem /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=gnu++98 -fdeprecated-macro -fdebug-compilation-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -ferror-limit 19 -fmessage-length 0 -pthread -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o Output/ArchiveCommandLine.llvm.o -x c++ /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/7zip/UI/Common/ArchiveCommandLine.cpp -faddrsig This reverts r357222 (git commit 64cccfcc72c44ea62f441b782d2177a90912769a) llvm-svn: 357227
*	[WebAssembly] Merge used feature sets, update atomics linkage policy	Thomas Lively	2019-03-29	7	-62/+156
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It does not currently make sense to use WebAssembly features in some functions but not others, so this CL adds an IR pass that takes the union of all used feature sets and applies it to each function in the module. This allows us to prevent atomics from being lowered away if some function has opted in to using them. When atomics is not enabled anywhere, we detect whether there exists any atomic operations or thread local storage that would be stripped and disallow linking with objects that contain atomics if and only if atomics or tls are stripped. When atomics is enabled, mark it as used but do not require it of other objects in the link. These changes allow libraries that do not use atomics to be built once and linked into both single-threaded and multithreaded binaries. Reviewers: aheejin, sbc100, dschuff Subscribers: jgravelle-google, hiraditya, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59625 llvm-svn: 357226