bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86] Add test cases that could use PMADDUBSW.	Craig Topper	2018-07-31	1	-0/+2233
\| \| \| \|	llvm-svn: 338401
*	[X86] Preserve more liveness information in emitStackProbeInline	Francis Visoiu Mistrih	2018-07-31	2	-7/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit fixes two issues with the liveness information after the call: 1) The code always spills RCX and RDX if InProlog == true, which results in a use of an undefined phys reg. 2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to the basic blocks that use them, so they are seen as undefined. https://llvm.org/PR38376 Differential Revision: https://reviews.llvm.org/D50020 llvm-svn: 338400
*	DAG: Fix PromoteFloatResult for fcanonicalize	Matt Arsenault	2018-07-31	1	-83/+101
\| \| \| \|	llvm-svn: 338382
*	AMDGPU: Fold undef fcanonicalize to qNaN	Matt Arsenault	2018-07-31	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	We could choose a free 0 for this, but this matches the behavior for fmul undef, 1.0. Also, the NaN use is more useful for folding use operations although if it's not eliminated it is more expensive in terms of code size. llvm-svn: 338376
*	AMDGPU: Fix test check line bugs	Matt Arsenault	2018-07-31	3	-23/+32
\| \| \| \|	llvm-svn: 338374
*	[SystemZ] Improve decoding in case of instructions with four register operands.	Jonas Paulsson	2018-07-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since z13, the max group size will be 2 if any μop has more than 3 register sources. This has been ignored so far in the SystemZHazardRecognizer, but is now handled by recognizing those instructions and adjusting the tracking of decoding and the cost heuristic for grouping. Review: Ulrich Weigand https://reviews.llvm.org/D49847 llvm-svn: 338368
*	[ARM] Revert r337821	Sam Parker	2018-07-31	3	-11/+11
\| \| \| \| \| \| \|	Re-enabling ARMCodeGenPrepare by default after failing to reproduce the bootstrap issues that I was concerned it was causing. llvm-svn: 338354
*	[X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont.	Craig Topper	2018-07-31	1	-1/+0
\| \| \| \| \| \|	In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should do, we only check isSLM. This resulted in Goldmont going down the Bonnell path. llvm-svn: 338342
*	[RISCV] Fixed test case failure due to r338047	Ana Pazos	2018-07-31	1	-1/+1
\| \| \| \|	llvm-svn: 338341
*	[AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR.	Amara Emerson	2018-07-31	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \|	Also refactors some existing code to materialize addresses for the large code model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR. This implements PR36390. Differential Revision: https://reviews.llvm.org/D49903 llvm-svn: 338337
*	[AArch64][GlobalISel] Make G_BLOCK_ADDR legal.	Amara Emerson	2018-07-31	1	-0/+45
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D49902 llvm-svn: 338336
*	[GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.	Amara Emerson	2018-07-31	1	-0/+12
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D49900 llvm-svn: 338335
*	[DAGCombiner] transform sub-of-shifted-signbit to add	Sanjay Patel	2018-07-30	3	-36/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317
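The "sub-of-1 with add-of-minus-1" exchange can be checked with a small model of 32-bit arithmetic. This is only a sketch: the pattern shape used here (subtracting a logically shifted sign bit versus adding an arithmetically shifted one) and all helper names are assumptions made for illustration, not code from the patch.

```python
MASK = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def lshr31(y):
    """Logical shift right by 31: yields 0 or 1 (the sign bit)."""
    return (y & MASK) >> 31


def ashr31(y):
    """Arithmetic shift right by 31: yields 0 or 0xFFFFFFFF (that is, -1)."""
    return MASK if (y & MASK) >> 31 else 0


def sub_of_lshr(x, y):
    """x - (y >>u 31), wrapped to 32 bits."""
    return (x - lshr31(y)) & MASK


def add_of_ashr(x, y):
    """x + (y >>s 31), wrapped to 32 bits."""
    return (x + ashr31(y)) & MASK


SAMPLES = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0x12345678]
```

Subtracting 1 and adding 0xFFFFFFFF agree modulo 2^32, which is why both forms produce the same result for every pair in `SAMPLES`.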
*	[MachineOutliner][AArch64] Add support for saving LR to a register	Jessica Paquette	2018-07-30	3	-9/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This teaches the outliner to save LR to a register rather than the stack when possible. This allows us to avoid bumping the stack in outlined functions in some cases. By doing this, in a later patch, we can teach the outliner to do something like this: f1: ... bl OUTLINED_FUNCTION ... f2: ... move LR's contents to a register bl OUTLINED_FUNCTION move the register's contents back instead of falling back to saving LR in both cases. llvm-svn: 338278
*	Add machine verifier to arm64-opt-remarks-lazy-bfi	Jessica Paquette	2018-07-30	1	-5/+8
\| \| \| \| \| \| \| \|	Previously, I thought this was a Windows failure. Then I realized it failed on every bot that used the verifier. This makes it use the verifier always, and adds that pass to the pipeline checks so that it's consistent across all bots. llvm-svn: 338272
*	[DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a rotate can be formed	David Bolvansky	2018-07-30	3	-123/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op. Patch by: sameconrad (Sam Conrad) Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar Reviewed By: lebedev.ri Subscribers: efriedma, kparzysz, llvm-commits Differential Revision: https://reviews.llvm.org/D47681 llvm-svn: 338270
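A small 32-bit model may make the mul case concrete. It is a sketch under assumptions: the constants 9 and 18 and the function names are hypothetical, chosen to show a rotate-by-1 whose left shift was already merged into the multiply.

```python
MASK = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def rotl32(v, r):
    """Rotate a 32-bit value left by r bits (0 < r < 32)."""
    v &= MASK
    return ((v << r) | (v >> (32 - r))) & MASK


def merged_form(x):
    """The shape after the shl was merged into the mul: (x*18) | ((x*9) >> 31)."""
    return ((x * 18) & MASK) | (((x * 9) & MASK) >> 31)


def rotate_form(x):
    """The shape after extracting the shift again: rotl(x*9, 1)."""
    return rotl32((x * 9) & MASK, 1)


SAMPLES = [0, 1, 7, 0x0FFFFFFF, 0x80000000, 0xDEADBEEF, 0xFFFFFFFF]
```

Because `(x*9) << 1` and `x*18` are equal modulo 2^32, pulling the shift back out of the multiply exposes the two halves of a rotate.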
*	Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"	Thomas Preud'homme	2018-07-30	1	-0/+80
\| \| \| \| \| \| \| \| \| \| \| \|	This reapplies commit r338206 reverted by r338214 since the bug that r338206 uncovered has been fixed in r338268. Add support for inline assembly with matching input operands that do not naturally go in the register class they are constrained to (e.g. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338269
*	Fix uninitialized read in ARM's PrintAsmOperand	Thomas Preud'homme	2018-07-30	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fix read of uninitialized RC variable in ARM's PrintAsmOperand when hasRegClassConstraint returns false. This was causing inline-asm-operand-implicit-cast test to fail in r338206. Reviewers: t.p.northover, weimingz, javed.absar, chill Reviewed By: chill Subscribers: chill, eraman, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49984 llvm-svn: 338268
*	Attempt to fix Windows test failure caused by r338133	Jessica Paquette	2018-07-30	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It seems like the pass pipeline on Windows is slightly different than on Linux and macOS. As a result, the arm64-opt-remarks-lazy-bfi test has been failing. This switches a CHECK-NEXT to a CHECK-DAG to try and get this running properly again. It'd be nice to switch it back to a CHECK-NEXT if possible, but the CHECK-NEXT lines following the line we care about (the optimization remark emitter) do a pretty good job of enforcing the ordering we want. Hopefully this works, since I don't have a Windows machine. ;) Example failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11295 llvm-svn: 338267
*	[X86] Regenerate NOBMI/BMI combine-select tests.	Simon Pilgrim	2018-07-30	1	-34/+38
\| \| \| \| \| \|	Test cleanup for D38128 llvm-svn: 338265
*	[X86] Regenerate PKU test to merge 32/64-bit rdpkru checks	Simon Pilgrim	2018-07-30	1	-11/+5
\| \| \| \| \| \|	Test cleanup for D38128 llvm-svn: 338264
*	[X86] Regenerate fast-isel tests.	Simon Pilgrim	2018-07-30	3	-48/+20
\| \| \| \| \| \|	Test cleanup for D38128 llvm-svn: 338262
*	[Hexagon] Simplify A4_rcmp[n]eqi R, 0	Krzysztof Parzyszek	2018-07-30	1	-0/+154
\| \| \| \| \| \| \|	Consider cases when register R is known to be zero/non-zero, or when it is defined by a C2_muxii instruction. llvm-svn: 338251
*	AMDGPU: Reduce code size with fcanonicalize (fneg x)	Matt Arsenault	2018-07-30	4	-48/+71
\| \| \| \| \| \| \| \|	When fcanonicalize is lowered to a mul, we can use -1.0 for free and avoid the cost of the bigger encoding for source modifiers. llvm-svn: 338244
*	AMDGPU: Make fneg combine handle fcanonicalize	Matt Arsenault	2018-07-30	1	-0/+21
\| \| \| \|	llvm-svn: 338243
*	[MachineOutliner][X86] Use TAILJMPd64 instead of JMP_1 for TailCall construction	Francis Visoiu Mistrih	2018-07-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The machine verifier asserts with: Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542. It calls analyzeBranch which tries to call getMBB if the opcode is JMP_1, but in this case we do: JMP_1 @OUTLINED_FUNCTION I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used with brtarget8. Differential Revision: https://reviews.llvm.org/D49299 llvm-svn: 338237
*	AMDGPU: Force skip over s_sendmsg and exp instructions	Nicolai Haehnle	2018-07-30	3	-1/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These instructions interact with hardware blocks outside the shader core, and they can have "scalar" side effects even when EXEC = 0. We don't want these scalar side effects to occur when all lanes want to skip these instructions, so always add the execz skip branch instruction for basic blocks that contain them. Also ensure that we skip scalar stores / atomics, though we don't code-gen those yet. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48431 Change-Id: Ieaeb58352e2789ffd64745603c14970c60819d44 llvm-svn: 338235
*	[ARM] Fix over-alignment in arguments that are HA of 128-bit vectors	Petr Pavlu	2018-07-30	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Code in `CC_ARM_AAPCS_Custom_Aggregate()` is responsible for handling homogeneous aggregates for `CC_ARM_AAPCS_VFP`. When an aggregate ends up fully on stack, the function tries to pack all resulting items of the aggregate as tightly as possible according to AAPCS. Once the first item was laid out, the alignment used for consecutive items was the size of one item. This logic went wrong for 128-bit vectors because their alignment is normally only 64 bits, and so could result in inserting unexpected padding between the first and second element. The patch fixes the problem by updating the alignment with the item size only if this results in reducing it. Differential Revision: https://reviews.llvm.org/D49720 llvm-svn: 338233
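The padding problem and the fix can be illustrated with a toy stack-layout model. This is a sketch, not the code in `CC_ARM_AAPCS_Custom_Aggregate()`: the function names, the byte units, and the `clamp` switch are assumptions made for illustration.

```python
def align_up(offset, align):
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align


def layout(start, count, size, align, clamp):
    """Return stack offsets for `count` items of `size` bytes.

    The first item uses the natural alignment `align`. Later items use
    the item size as alignment (old behaviour, clamp=False) or the item
    size only if that reduces the alignment (fixed behaviour, clamp=True).
    """
    offsets = []
    offset = start
    for _ in range(count):
        offset = align_up(offset, align)
        offsets.append(offset)
        offset += size
        align = min(align, size) if clamp else size
    return offsets


# Two 128-bit (16-byte) vectors with 8-byte alignment, starting at offset 8.
old_offsets = layout(8, 2, 16, 8, clamp=False)
new_offsets = layout(8, 2, 16, 8, clamp=True)
```

In this model the old rule places the second vector at offset 32, leaving 8 bytes of padding, while the clamped rule places it at offset 24, directly after the first.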
*	revert r338206 because the test does not pass	Sanjay Patel	2018-07-29	1	-80/+0
\| \| \| \| \| \| \|	Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214
*	Fix crash on inline asm with 64bit matching input in 32bit GPR	Thomas Preud'homme	2018-07-28	1	-0/+80
\| \| \| \| \| \| \| \| \|	Add support for inline assembly with matching input operands that do not naturally go in the register class they are constrained to (e.g. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206
*	AMDGPU: Stop wasting argument registers with v3i32/v3f32	Matt Arsenault	2018-07-28	5	-5/+170
\| \| \| \| \| \| \| \| \| \|	SelectionDAGBuilder widens v3i32/v3f32 arguments to v4i32/v4f32, which consume an additional register. In addition to wasting argument space, this produces extra instructions since now it appears the 4th vector component has a meaningful value to most combines. llvm-svn: 338197
*	AMDGPU: Stop trying to extend arguments for clover	Matt Arsenault	2018-07-28	6	-265/+539
\| \| \| \| \| \| \|	This was trying to replace i8/i16 arguments with i32, which was broken and no longer necessary. llvm-svn: 338193
*	[DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)	Craig Topper	2018-07-28	10	-163/+163
\| \| \| \| \| \| \| \|	This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
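The fold is an identity of wraparound arithmetic, which a short check over 32-bit values can confirm. The helper names below are hypothetical and exist only for this sketch.

```python
MASK = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def sub_of_sub(a, b, c):
    """A - (B - C), wrapped to 32 bits."""
    return (a - ((b - c) & MASK)) & MASK


def add_of_sub(a, b, c):
    """A + (C - B), wrapped to 32 bits."""
    return (a + ((c - b) & MASK)) & MASK


SAMPLES = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
```

The second form has an `add` at the root, so its operands can be swapped freely by later folds.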
*	Revert "[WebAssembly] Added default stack-only instruction mode for MC."	Wouter van Oortmerssen	2018-07-27	79	-299/+103
\| \| \| \| \| \| \|	This reverts commit d3c9af4179eae7793d1487d652e2d4e23844555f. (SVN revision 338164) llvm-svn: 338176
*	[X86] Add support for expanding multiplies by constant where the constant is -3/-5/-9 multiplied by a power of 2.	Craig Topper	2018-07-27	3	-0/+294
\| \| \| \| \| \| \| \|	These can be replaced with an LEA, a shift, and a negate. This seems to match what gcc and icc would do. llvm-svn: 338174
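The three-step expansion can be modelled as follows. This is a sketch under assumptions: the order of the steps and the helper names are illustrative, and the LEA is modelled as `x + x*scale` with a scale of 2, 4, or 8.

```python
MASK = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def lea_scale(x, scale):
    """Model an LEA computing x + x*scale, i.e. x*3, x*5 or x*9."""
    if scale not in (2, 4, 8):
        raise ValueError("LEA scale must be 2, 4 or 8")
    return (x + x * scale) & MASK


def mul_by_negative(x, scale, shift):
    """Compute x * -((scale + 1) << shift) using LEA, shift, negate."""
    t = lea_scale(x, scale)   # x * 3, 5 or 9
    t = (t << shift) & MASK   # times a power of 2
    return (-t) & MASK        # negate


SAMPLES = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 12345]
```

For example, `mul_by_negative(x, 2, 2)` models a multiply by -12 as `-((x*3) << 2)`.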
*	[WebAssembly] Added default stack-only instruction mode for MC.	Wouter van Oortmerssen	2018-07-27	79	-103/+299
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Moved Explicit Locals pass to last. Made that pass obligatory. Made it convert from register to stack based instructions, and removed the registers. Fixes to related code that was expecting register based instructions. Added the correct testing flag to all tests, depending on what the format they were expecting so far. Translated one test to stack format as example: reg-stackify-stack.ll tested: llvm-lit -v `find test -name WebAssembly` unittests/MC/* Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D49160 llvm-svn: 338164
*	Recommit "Enable MachineOutliner by default under -Oz for AArch64"	Jessica Paquette	2018-07-27	7	-12/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed the ASAN failure from before in r338148, so recommitting. This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338160
*	[AArch64, PowerPC, x86] add more signbit math tests; NFC	Sanjay Patel	2018-07-27	3	-17/+87
\| \| \| \| \| \| \| \|	The tests with a constant sub operand were added with rL338143, but the potential transform doesn't have that requirement, so adding more tests with variable operands. llvm-svn: 338150
*	[ARM] Add new target feature to fuse literal generation	Evandro Menezes	2018-07-27	1	-0/+39
\| \| \| \| \| \| \| \| \| \|	This feature enables the fusion of such operations on Cortex A57 and Cortex A72, as recommended in their Software Optimisation Guides, sections 4.14 and 4.11, respectively. Differential revision: https://reviews.llvm.org/D49563 llvm-svn: 338147
*	[AArch64, PowerPC, x86] add more signbit math tests; NFC	Sanjay Patel	2018-07-27	3	-0/+76
\| \| \| \|	llvm-svn: 338143
*	Revert "Enable MachineOutliner by default under -Oz for AArch64"	Jessica Paquette	2018-07-27	7	-88/+12
\| \| \| \| \| \| \| \| \| \|	It failed an Asan test on a bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio Fixing that before recommitting. llvm-svn: 338136
*	bpf: add missing RegState to notify MachineInstr verifier necessary register usage	Yonghong Song	2018-07-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Errors like the following are reported by: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11261 * Bad machine code: Explicit definition marked as use * - function: cal_align1 - basic block: %bb.0 entry (0x47edd98) - instruction: LDB $r3, $r2, 0 - operand 0: $r3 This is because RegState info was missing for ScratchReg inside expandMEMCPY. This gave the MachineInstr verifier incomplete register usage information, and the verifier complains because that can cause code-gen issues wherever register usage information matters, even though the memcpy expansion is not such a case, as it happens at the last stage of the IR optimization pipeline. We should always specify register usage information that the compiler cannot deduce automatically whenever we add a hardware register manually. Reported-by: Builder llvm-clang-x86_64-expensive-checks-win Build #11261 Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 338134
*	Enable MachineOutliner by default under -Oz for AArch64	Jessica Paquette	2018-07-27	7	-12/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338133
*	[DAGCombiner] fold 'not' with signbit math	Sanjay Patel	2018-07-27	3	-59/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132
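The commit text does not spell out the folded patterns, so the two identities below are an assumption about their shape: a `not` feeding a sign-bit shift is removed by switching the shift kind and adjusting the constant by one. The sketch models 32-bit values, and every name in it is hypothetical.

```python
MASK = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def not32(v):
    return ~v & MASK


def lshr31(v):
    """Logical shift right by 31: 0 or 1."""
    return (v & MASK) >> 31


def ashr31(v):
    """Arithmetic shift right by 31: 0 or 0xFFFFFFFF (that is, -1)."""
    return MASK if lshr31(v) else 0


def add_before(x, c):
    return (lshr31(not32(x)) + c) & MASK   # add (lshr (not x), 31), c


def add_after(x, c):
    return (ashr31(x) + c + 1) & MASK      # add (ashr x, 31), c + 1


def sub_before(x, c):
    return (c - lshr31(not32(x))) & MASK   # sub c, (lshr (not x), 31)


def sub_after(x, c):
    return (lshr31(x) + c - 1) & MASK      # add (lshr x, 31), c - 1


SAMPLES = [0, 1, 42, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
```

Each rewritten form uses one instruction fewer than the original because the `not` disappears and the constant adjustment folds into the existing constant.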
*	[x86] add more tests for signbit math; NFC	Sanjay Patel	2018-07-27	1	-0/+84
\| \| \| \|	llvm-svn: 338131
*	[PowerPC] add more tests for signbit math; NFC	Sanjay Patel	2018-07-27	1	-0/+96
\| \| \| \|	llvm-svn: 338130
*	[AArch64] add more tests for signbit math; NFC	Sanjay Patel	2018-07-27	1	-0/+81
\| \| \| \|	llvm-svn: 338129
*	AMDGPU/R600: Add MOV instructions to BFE patterns	Jan Vesely	2018-07-27	1	-0/+175
\| \| \| \| \| \| \| \| \|	R600 can't handle immediates for BFE; these will be eliminated later. Fixes powr/pow regressions on r600 since r334817 Differential Revision: https://reviews.llvm.org/D49641 llvm-svn: 338127
*	AMDGPU: Fix code size for return_to_epilog pseudo	Matt Arsenault	2018-07-27	1	-0/+6
\| \| \| \|	llvm-svn: 338113
*	AMDGPU/GlobalISel: Fix crash in regbankselect on non-power-of-2 types	Tom Stellard	2018-07-27	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49624 llvm-svn: 338102