bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[Power9] Add remaining __float128 builtin support for FMA round to odd	Stefan Pintilie	2018-07-11	1	-43/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement this as it is done on GCC:
__float128 a, b, c, d;
a = __builtin_fmaf128_round_to_odd (b, c, d); // generates xsmaddqpo
a = __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsmsubqpo
a = - __builtin_fmaf128_round_to_odd (b, c, d); // generates xsnmaddqpo
a = - __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsnmsubqpo
Differential Revision: https://reviews.llvm.org/D48218 llvm-svn: 336754
*	[test cases] add test cases for find more abs pattern	Chen Zheng	2018-07-11	1	-0/+87
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D49123 llvm-svn: 336752
*	[ARM] Treat cmn immediates as legal in isLegalICmpImmediate.	Eli Friedman	2018-07-10	3	-8/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	The original code attempted to do this, but the std::abs() call didn't actually do anything due to implicit type conversions. Fix the type conversions, and perform the correct check for negative immediates. This probably has very little practical impact, but it's worth fixing just to avoid confusion in the future, I think. Differential Revision: https://reviews.llvm.org/D48907 llvm-svn: 336742
*	[X86] Teach X86InstrInfo::commuteInstructionImpl to use MOVSD/MOVSS for ↵	Craig Topper	2018-07-10	2	-8/+8
\| \| \| \| \| \| \| \| \| \|	BLEND under optsize when the immediate allows it. Isel currently emits movss/movsd a lot of the time and an accidental double commute turns it into a blend. Ideally we'd select blend directly in isel under optspeed and not rely on the double commute to create blend. llvm-svn: 336731
*	[ThinLTO] Use std::map to get deterministic imports files	Teresa Johnson	2018-07-10	2	-6/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I noticed that the .imports files emitted for distributed ThinLTO backends do not have consistent ordering. This is because StringMap iteration order is not guaranteed to be deterministic. Since we already have a std::map with this information, used when emitting the individual index files (ModuleToSummariesForIndex), use it for the imports files as well. This issue is likely causing some unnecessary rebuilds of the ThinLTO backends in our distributed build system as the imports files are inputs to those backends. Reviewers: pcc, steven_wu, mehdi_amini Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D48783 llvm-svn: 336721
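Note on the ordering issue described above: a minimal sketch in standard C++ only. `std::unordered_map` stands in for `llvm::StringMap` (an assumption made for illustration; both leave iteration order unspecified), and the type names, function names, and module paths are hypothetical rather than taken from the patch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an import list: module path -> import count.
// std::unordered_map plays the role of llvm::StringMap in this sketch.
using HashedImports = std::unordered_map<std::string, int>;
using OrderedImports = std::map<std::string, int>;

// Collect keys in the container's iteration order. For a hash map this
// order is unspecified; for std::map it is always sorted by key.
template <typename MapT>
std::vector<std::string> keysInIterationOrder(const MapT &M) {
  std::vector<std::string> Keys;
  for (const auto &Entry : M)
    Keys.push_back(Entry.first);
  return Keys;
}

// Copy the entries into a std::map so that any file written from them
// lists the modules in the same order on every run.
OrderedImports toOrdered(const HashedImports &H) {
  return OrderedImports(H.begin(), H.end());
}
```

Calling `keysInIterationOrder(toOrdered(H))` yields the keys sorted lexicographically regardless of how `H` was populated, which is the property a build system needs when the emitted file is itself a build input.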
*	[GlobalISel][X86_64] Support for G_SITOFP	Alexander Ivchenko	2018-07-10	3	-0/+625
\| \| \| \| \| \|	The instruction selection is automatically handled by tablegen llvm-svn: 336703
*	[Evaluator] Examine alias when evaluating function call	Eugene Leviant	2018-07-10	1	-1/+3
\| \| \| \| \| \|	This fixes PR38120 llvm-svn: 336702
*	[DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1	Simon Pilgrim	2018-07-10	1	-35/+12
\| \| \| \| \| \|	udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts. llvm-svn: 336701
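Note on the two special cases: both follow directly from unsigned arithmetic. Dividing by 1 returns the operand, and dividing by the all-ones value (the unsigned reading of -1) gives 1 only when the operand is itself all-ones, otherwise 0. The sketch below checks this on 32-bit values; the function names are hypothetical and this is not the DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// Reference result: plain unsigned division.
uint32_t udivReference(uint32_t X, uint32_t D) { return X / D; }

// udiv x, 1  ->  x
uint32_t udivByOne(uint32_t X) { return X; }

// udiv x, -1  ->  (x == -1) ? 1 : 0
// Only the all-ones value is large enough to contain the divisor once.
uint32_t udivByAllOnes(uint32_t X) { return X == UINT32_MAX ? 1u : 0u; }
```

Neither fast path needs a multiply or a shift, which is the point of handling them before the general constant-division expansion.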
*	AMDGPU: Make hidden argument metadata consistent with	Konstantin Zhuravlyov	2018-07-10	3	-46/+262
\| \| \| \| \| \| \| \|	amdgpu-implicitarg-num-bytes attribute Differential Revision: https://reviews.llvm.org/D49096 llvm-svn: 336697
*	[InstCombine] allow flag propagation when using safe constant	Sanjay Patel	2018-07-10	1	-7/+4
\| \| \| \| \| \| \|	This corresponds with the code for the single binop pattern added in rL336684. llvm-svn: 336696
*	[X86] Add srem/udiv/urem by constant tests	Simon Pilgrim	2018-07-10	3	-0/+239
\| \| \| \| \| \|	Match the tests in combine-sdiv.ll llvm-svn: 336694
*	[WebAssembly] Add a few missing {{$}}s to a test	Heejin Ahn	2018-07-10	1	-5/+5
\| \| \| \|	llvm-svn: 336691
*	AMDGPU/NFC: Fix typo in test name	Konstantin Zhuravlyov	2018-07-10	1	-0/+0
\| \| \| \| \| \| \|	hsa-metadata-enqueu-kernel.ll -> hsa-metadata-enqueue-kernel.ll llvm-svn: 336689
*	Update test to work on Windows	Paul Robinson	2018-07-10	1	-1/+1
\| \| \| \|	llvm-svn: 336687
*	[InstCombine] safely allow non-commutative binop identity constant folds	Sanjay Patel	2018-07-10	1	-28/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally intended with D48893, but as discussed there, we have to make the folds safe from producing extra poison. This should give the single binop folds the same capabilities as the existing folds for 2-binops+shuffle. LLVM binary opcode review: there are a total of 18 binops. There are 7 commutative binops (add, mul, and, or, xor, fadd, fmul) which we already fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr, fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't bother with sub/fsub with constant operand 1 because those are canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18. llvm-svn: 336684
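Note on the six non-commutative opcodes: each has an identity constant only on the right-hand side (0 for the shifts, 1 for the divisions). The sketch below illustrates that on scalar values; the function names are hypothetical, and the specific constants are my reading of the message, not a quote from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Right-hand identity constants: applying the op leaves X unchanged.
uint32_t shlByZero(uint32_t X) { return X << 0; }
uint32_t lshrByZero(uint32_t X) { return X >> 0; }
int32_t ashrByZero(int32_t X) { return X >> 0; }
uint32_t udivRhsOne(uint32_t X) { return X / 1u; }
int32_t sdivRhsOne(int32_t X) { return X / 1; }
double fdivRhsOne(double X) { return X / 1.0; }

// The same constant on the left is not an identity, which is why these
// opcodes cannot reuse the commutative-binop logic. X must be non-zero.
uint32_t oneUdivBy(uint32_t X) { return 1u / X; }
```

For example, `udivRhsOne(7)` is 7 while `oneUdivBy(7)` is 0, so the fold must know which operand holds the constant.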
*	[Hexagon] Change .mir testcase to make sure function is not in SSA form	Krzysztof Parzyszek	2018-07-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	If a machine function satisfies SSA, the IsSSA property is assumed even if the pass to be executed runs after exiting from SSA. If the pass output then does not conform to SSA, a verifier error will be flagged (with expensive checks enabled). llvm-svn: 336682
*	Support -fdebug-prefix-map in llvm-mc. This is useful to omit the	Paul Robinson	2018-07-10	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \|	debug compilation dir when compiling assembly files with -g. Part of PR38050. Patch by Siddhartha Bagaria! Differential Revision: https://reviews.llvm.org/D48988 llvm-svn: 336680
*	[InstCombine] drop poison flags when shuffle mask undef propagates to constant	Sanjay Patel	2018-07-10	1	-4/+3
\| \| \| \|	llvm-svn: 336679
*	[AArch64][SVE] Asm: Support for predicated unary operations.	Sander de Smalen	2018-07-10	14	-0/+340
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for the following instructions:
CLS (Count Leading Sign bits)
CLZ (Count Leading Zeros)
CNT (Count non-zero bits)
CNOT (Logically invert boolean condition in vector)
NOT (Bitwise invert vector)
FABS (Floating-point absolute value)
FNEG (Floating-point negate)
All operations are predicated and unary, e.g. clz z0.s, p0/m, z1.s
- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32 and 64 bit elements.
- FABS and FNEG have variants for 16, 32 and 64 bit elements.
llvm-svn: 336677
*	Reapply "AMDGPU: Force inlining if LDS global address is used"	Matt Arsenault	2018-07-10	3	-2/+87
\| \| \| \| \| \|	This reverts commit r336623 llvm-svn: 336675
*	[InstCombine] allow more shuffle-binop folds with safe constants	Sanjay Patel	2018-07-10	1	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \|	The case with 2 variables is more complicated than the case where we eliminate the shuffle entirely because a shuffle with an undef mask element creates an undef result. I'm not aware of any current analysis/transform that recognizes that undef propagating to a div/rem/shift, but we have to guard against the possibility. llvm-svn: 336668
*	[DebugInfo][LoopVectorize] Preserve DL in induction PHI and Add	Anastasis Grammenos	2018-07-10	1	-0/+10
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D48968 llvm-svn: 336667
*	[Hexagon] Add implicit uses even when untied explicit uses are present	Krzysztof Parzyszek	2018-07-10	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An explicit untied use is not sufficient to maintain liveness of a register redefined in a predicated instruction. For example
%1 = COPY %0
...
%1 = A2_paddif %2, %1, 1
could become
$r1 = COPY $r0
...
$r1 = A2_paddif $p0, $r1, 1
and later
$r1 = COPY $r0 ;; this is not really dead!
...
$r1 = A2_paddif $p0, $r0, 1
llvm-svn: 336662
*	[LowerSwitch] Fixed faulty PHI nodes	Karl-Johan Karlsson	2018-07-10	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fixed two cases where PHI nodes need to be updated by lowerswitch. When lowerswitch finds out that the switch default branch is not reachable, it removes the old default and replaces it with the most popular block from the cases, but it forgets to update the PHI nodes in the default block. The PHI nodes also need to be updated when the switch is replaced with a single branch. Reviewers: hans, reames, arsenm Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D47203 llvm-svn: 336659
*	[PM/Unswitch] Fix a collection of closely related issues with trivial	Chandler Carruth	2018-07-10	3	-8/+280
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	switch unswitching. The core problem was the way we handled unswitching trivial exit edges through the default successor of a switch. For some reason I thought the right way to do this was to add a block containing unreachable and point the default successor at this block. In retrospect, this has an amazing number of problems. The first issue is the one that this pass has always worked around -- we have to detect such edges and avoid unswitching them again. This seemed pretty easy really. You just look for an edge to a block containing unreachable. However, this pattern is woefully unsound. So many things can break it. The amazing thing is that I found a test case where simple-loop-unswitch itself breaks this! When we do a non-trivial unswitch of a switch we will end up splitting this exit edge. The result will be a default successor that is an exit and terminates in ... a perfectly normal branch. So the first test case that I started trying to fix is added to the nontrivial test cases. This is a ridiculous example that did just amazing things previously. With just unswitch, it would create 10+ copies of this stuff stamped out. But if you combine it just right with a bunch of other passes (like simplify-cfg, loop rotate, and some LICM) you can get it to do this infinitely. Or at least, I never got it to finish. =[ This, in turn, uncovered another related issue. When we are manipulating these switches after doing a trivial unswitch we never correctly updated PHI nodes to reflect our edits. As soon as I started changing how these edges were managed, it became obvious there were more issues that I couldn't realistically leave unaddressed, so I wrote more test cases around PHI updates here and ensured all of that works now.
And this, in turn, required some adjustment to how we collect and manage the exit successor when it is the default successor. That showed a clear bug where we failed to include it in our search for the outer-most loop reached by an unswitched exit edge. This was actually already tested and the test case didn't work. I (wrongly) thought that was due to SCEV failing to analyze the switch. In fact, it was just a simple bug in the code that skipped the default successor. While changing this, I handled it correctly and have updated the test to reflect that we now get precise SCEV analysis of trip counts for the outer loop in one of these cases. llvm-svn: 336646
*	[X86] Fast-isel tests for lowered truncation intrinsics	Mikhail Dvoretckii	2018-07-10	2	-0/+115
\| \| \| \| \| \| \| \| \|	This patch adds fast-isel tests for the IR patterns produced for truncation intrinsics in rC336643. Differential Revision: https://reviews.llvm.org/D48822 llvm-svn: 336645
*	[X86][SSE] Prefer BLEND(SHL(v,c1),SHL(v,c2)) over MUL(v, c3)	Simon Pilgrim	2018-07-10	4	-40/+48
\| \| \| \| \| \| \| \| \| \|	Now that rL336250 has landed, we should prefer 2 immediate shifts + a shuffle blend over performing a multiply. Despite the increase in instructions, this is quicker (especially for slow v4i32 multiplies) and avoids loads and constant pool usage. It does mean however that we increase register pressure. The code size will go up a little but by less than what we save on the constant pool data. This patch also adds support for v16i16 to the BLEND(SHIFT(v,c1),SHIFT(v,c2)) combine, and also prevents blending on pre-SSE41 shifts if it would introduce extra blend masks/constant pool usage. Differential Revision: https://reviews.llvm.org/D48936 llvm-svn: 336642
*	[X86] Regenerate vector-shuffle-512-v8.ll so the script will merge the 32 ↵	Craig Topper	2018-07-10	1	-861/+387
\| \| \| \| \| \|	and 64 bit checks together. NFC llvm-svn: 336641
*	[X86] Correct vfixupimm load patterns to look for an integer load, not a ↵	Craig Topper	2018-07-10	1	-4/+2
\| \| \| \| \| \| \| \|	floating point load bitcasted to integer. DAG combine wouldn't let a floating point load bitcasted to integer exist. It would just be an integer load. llvm-svn: 336626
*	[X86] Add test cases that show failure to fold load into vfixupimm ↵	Craig Topper	2018-07-10	1	-1/+23
\| \| \| \| \| \|	instructions due to bad isel pattern. llvm-svn: 336625
*	Revert "AMDGPU: Force inlining if LDS global address is used"	Vlad Tsyrklevich	2018-07-10	3	-87/+2
\| \| \| \| \| \| \|	This reverts commit r336587, it was causing test failures on the sanitizer bots. llvm-svn: 336623
*	[InstCombine] allow more shuffle folds using safe constants	Sanjay Patel	2018-07-09	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming that the constant we wanted was operand 1 (RHS). That's wrong, but I don't think we could expose a bug or even a suboptimal fold from that because the callers have other guards for any binop that would have been affected. llvm-svn: 336617
*	[WebAssembly] Support for binary atomic RMW instructions	Heejin Ahn	2018-07-09	2	-0/+1274
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds support for binary atomic read-modify-write instructions: add, sub, and, or, xor, and xchg. This does not yet support translations of some of LLVM IR atomicrmw instructions (nand, max, min, umax, and umin) that do not have a direct counterpart in wasm instructions. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49088 llvm-svn: 336615
*	llvm: Add support for "-fno-delete-null-pointer-checks"	Manoj Gupta	2018-07-09	44	-2/+1485
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Support for this option is needed for building the Linux kernel. This is a very frequently requested feature by kernel developers. More details: https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this, implying that null pointer dereferencing is not undefined. This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" (under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined so that they do not optimize when the "null-pointer-is-valid"="true" attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: efriedma, george.burgess.iv Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47895 llvm-svn: 336613
*	Make llvm.objectsize more conservative with null	George Burgess IV	2018-07-09	2	-5/+23
\| \| \| \| \| \| \| \| \| \|	In non-zero address spaces, we were reporting that an object at `null` always occupies zero bytes. This is incorrect in many cases, so just return `unknown` in those cases for now. Differential Revision: https://reviews.llvm.org/D48860 llvm-svn: 336611
*	Fix line endings. NFCI.	Simon Pilgrim	2018-07-09	3	-177/+177
\| \| \| \|	llvm-svn: 336602
*	[Power9] Add __float128 builtins for Rounding Operations	Stefan Pintilie	2018-07-09	1	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added __float128 support for a number of rounding operations: trunc rint nearbyint round floor ceil Differential Revision: https://reviews.llvm.org/D48415 llvm-svn: 336601
*	[WebAssembly] Improve readability of load/stores and tests. NFC.	Heejin Ahn	2018-07-09	4	-348/+659
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Changed variable/function names to be more consistent - Improved comments in test files - Added more tests - Fixed a few typos - Misc. cosmetic changes Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49087 llvm-svn: 336598
*	[Power9] [LLVM] Add __float128 support for trunc to double round to odd	Stefan Pintilie	2018-07-09	1	-0/+10
\| \| \| \| \| \| \| \| \|	Add support for this builtin: double builtin_truncf128_round_to_odd(float128) Differential Revision: https://reviews.llvm.org/D48483 llvm-svn: 336595
*	RenameIndependentSubregs: Fix handling of undef tied operands	Mark Searles	2018-07-09	1	-0/+18
\| \| \| \| \| \| \| \| \|	Ensure that, when updating a tied operand pair, only that pair is updated. Differential Revision: https://reviews.llvm.org/D49052 llvm-svn: 336593
*	[globalisel][irtranslator] Add support for atomicrmw and (strong) cmpxchg	Daniel Sanders	2018-07-09	1	-0/+265
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds support for the atomicrmw instructions and the strong cmpxchg instruction to the IRTranslator. I've left out weak cmpxchg because LangRef.rst isn't entirely clear on what difference it makes to the backend. As far as I can tell from the code, it only matters to AtomicExpandPass which is run at the LLVM-IR level. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, javed.absar Reviewed By: qcolombet Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D40092 llvm-svn: 336589
*	AMDGPU: Force inlining if LDS global address is used	Matt Arsenault	2018-07-09	3	-2/+87
\| \| \| \| \| \| \| \| \| \|	These won't work for the foreseeable future. These aren't allowed from OpenCL, but IPO optimizations can make them appear. Also directly set the attributes on functions, regardless of the linkage rather than cloning functions like before. llvm-svn: 336587
*	[X86][TLI] DAGCombine: Unfold variable bit-clearing mask to two shifts.	Roman Lebedev	2018-07-09	3	-485/+487
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds a reverse transform for the instcombine canonicalizations that were added in D47980, D47981. As discussed later, that was worse at least for the code size, and potentially for the performance, too. https://rise4fun.com/Alive/Zmpl Reviewers: craig.topper, RKSimon, spatel Reviewed By: spatel Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D48768 llvm-svn: 336585
*	[Power9] Add __float128 builtins for Round To Odd	Stefan Pintilie	2018-07-09	1	-0/+82
\| \| \| \| \| \| \| \| \| \| \| \|	GCC has builtins for these round to odd instructions: __float128 __builtin_sqrtf128_round_to_odd (__float128) __float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128) __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128) Differential Revision: https://reviews.llvm.org/D47550 llvm-svn: 336578
*	Add bitcode compatibility test for 6.0	Steven Wu	2018-07-09	2	-0/+1716
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add bitcode compatibility test for 6.0. On top of the normal disassemble test, also runs the verifier to make sure simple 6.0 bitcode can pass the current IR verifier. Reviewers: vsk Reviewed By: vsk Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49086 llvm-svn: 336574
*	[InstCombine] correct test comments; NFC	Sanjay Patel	2018-07-09	1	-2/+2
\| \| \| \|	llvm-svn: 336570
*	[X86] In combineFMA, make sure we bitcast the result of isFNEG back the ↵	Craig Topper	2018-07-09	1	-0/+28
\| \| \| \| \| \| \| \|	expected type before creating the new FMA node. Previously, we were creating malformed SDNodes, but nothing noticed because the type constraints prevented isel from noticing. llvm-svn: 336566
*	[X86][AVX] Regenerate AVX1 fast-isel tests.	Simon Pilgrim	2018-07-09	1	-1900/+1181
\| \| \| \| \| \|	Let the update script merge 32/64 tests where possible llvm-svn: 336565
*	[InstCombine] avoid extra poison when moving shift above shuffle	Sanjay Patel	2018-07-09	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	As discussed in D49047 / D48987, shift-by-undef produces poison, so we can't use undef vector elements in that case. Note that we need to extend this for poison-generating flags, and there's a proposal to create poison from FMF in D47963. llvm-svn: 336562
*	[dsymutil] Add support for outputting assembly	Jonas Devlieghere	2018-07-09	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	When implementing the DWARF accelerator tables in dsymutil I ran into an assertion in the assembler. Debugging these kinds of issues is a lot easier when looking at the assembly instead of debugging the assembler itself. Since it's only a matter of creating an AsmStreamer instead of a MCObjectStreamer it made sense to turn this into a (hidden) dsymutil feature. Differential revision: https://reviews.llvm.org/D49079 llvm-svn: 336561