bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[ModuloSchedule] removeBranch() before creating the trip count condition	James Molloy	2019-10-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Hexagon code assumes there's no existing terminator when inserting its trip count condition check. This causes swp-stages5.ll to break. The generated code looks good to me, it is likely a permutation. I have disabled the new codegen path to keep everything green and will investigate along with the other 3-4 tests that have different codegen. Fixes expensive-checks build. llvm-svn: 373629
*	[ModuloSchedule] Peel out prologs and epilogs, generate actual code	James Molloy	2019-10-02	55	-61/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This extends the PeelingModuloScheduleExpander to generate prolog and epilog code, and correctly stitch uses through the prolog, kernel, epilog DAG. The key concept in this patch is to ensure that all transforms are local; only a function of a block and its immediate predecessor and successor. By defining the problem in this way we can inductively rewrite the entire DAG using only local knowledge that is easy to reason about. For example, we assume that all prologs and epilogs are near-perfect clones of the steady-state kernel. This means that if a block has an instruction that is predicated out, we can redirect all users of that instruction to that equivalent instruction in our immediate predecessor. As all blocks are clones, every instruction must have an equivalent in every other block. Similarly we can make the assumption by construction that if a value defined in a block is used outside that block, the only possible user is its immediate successors. We maintain this even for values that are used outside the loop by creating a limited form of LCSSA. This code isn't small, but it isn't complex. Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet; I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns that we don't produce. Those still need a bit more investigation. In the meantime we (Google) are happy with the code produced by this on our downstream SMS implementation, and believe it generates correct code. Subscribers: mgorny, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68205 llvm-svn: 373462
*	[Hexagon] Bitcast v4i16 to v8i8, unify no-op casts between scalar and HVX	Krzysztof Parzyszek	2019-09-23	1	-0/+13
\| \| \| \|	llvm-svn: 372616
*	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount	James Molloy	2019-09-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recommit: fix asan errors. The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372463
*	Revert "[MachinePipeliner] Improve the TargetInstrInfo API ↵	Mitch Phillips	2019-09-20	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	analyzeLoop/reduceLoopCount" This commit broke the ASan buildbot. See comments in rL372376 for more information. This reverts commit 15e27b0b6d9d51362fad85dbe95ac5b3fadf0a06. llvm-svn: 372425
*	[MVT] Add v256i1 to MachineValueType	Krzysztof Parzyszek	2019-09-20	1	-0/+15
\| \| \| \| \| \|	This type can show up when lowering some HVX vector code on Hexagon. llvm-svn: 372403
*	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount	James Molloy	2019-09-20	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372376
*	[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes ↵	Guillaume Chatelet	2019-09-11	6	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference, This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608
*	[DFAPacketizer] Reapply: Track resources for packetized instructions	James Molloy	2019-09-09	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reapply with fix to reduce resources required by the compiler - use unsigned[2] instead of std::pair. This causes clang and gcc to compile the generated file multiple times faster, and hopefully will reduce the resource requirements on Visual Studio also. This fix is a little ugly but it's clearly the same issue the previous author of DFAPacketizer faced (the previous tables use unsigned[2] rather uglily too). This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 llvm-svn: 371399
*	Revert rL371198 from llvm/trunk: [DFAPacketizer] Track resources for ↵	Simon Pilgrim	2019-09-09	1	-29/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	packetized instructions This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 ........ Reverted as this is causing "compiler out of heap space" errors on MSVC 2017/19 NDEBUG builds llvm-svn: 371393
*	[DFAPacketizer] Track resources for packetized instructions	James Molloy	2019-09-06	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 llvm-svn: 371198
*	[Hexagon] Fix type in HexagonTargetLowering::ReplaceNodeResults	Krzysztof Parzyszek	2019-09-05	1	-0/+18
\| \| \| \|	llvm-svn: 371083
*	[Hexagon] Improve generated code for test-if-bit-clear, one more time	Krzysztof Parzyszek	2019-09-04	2	-36/+11
\| \| \| \| \| \|	Adjust isel patterns after recent commit. Fixes https://llvm.org/PR43194. llvm-svn: 370913
*	[MachinePipeliner] Add a way to unit-test the schedule emitter	James Molloy	2019-09-03	1	-0/+151
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself. One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway. This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at. We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with: llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir And run the emission in isolation with: llc < test.mir -run-pass=modulo-schedule-test llvm-svn: 370705
*	[DAGCombiner] try to form test+set out of shift+mask patterns	Sanjay Patel	2019-09-02	1	-11/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The motivating bugs are: https://bugs.llvm.org/show_bug.cgi?id=41340 https://bugs.llvm.org/show_bug.cgi?id=42697 As discussed there, we could view this as a failure of IR canonicalization, but then we would need to implement a backend fixup with target overrides to get this right in all cases. Instead, we can just view this as a codegen opportunity. It's not even clear for x86 exactly when we should favor test+set; some CPUs have better theoretical throughput for the ALU ops than bt/test. This patch is made more complicated than I expected because there's an early DAGCombine for 'and' that can change types of the intermediate ops via trunc+anyext. Differential Revision: https://reviews.llvm.org/D66687 llvm-svn: 370668
*	Revert [MBP] Disable aggressive loop rotate in plain mode	Jordan Rupprecht	2019-08-29	4	-4/+6
\| \| \| \| \| \| \| \|	This reverts r369664 (git commit 51f48295cbe8fa3a44db263b528dd9f7bae7bf9a) It causes many benchmark regressions, internally and in llvm's benchmark suite. llvm-svn: 370398
*	[Hexagon] Improve generated code for test-if-bit-clear	Krzysztof Parzyszek	2019-08-26	1	-21/+12
\| \| \| \|	llvm-svn: 369947
*	[Hexagon] remove noise from tests; NFC	Sanjay Patel	2019-08-25	1	-15/+10
\| \| \| \|	llvm-svn: 369875
*	[Hexagon][x86] add tests for bit-test; NFC	Sanjay Patel	2019-08-25	1	-1/+98
\| \| \| \| \| \| \|	More coverage for D66687 (assuming we make this a generic combine with TLI hook). llvm-svn: 369874
*	[MBP] Disable aggressive loop rotate in plain mode	Guozhi Wei	2019-08-22	4	-6/+4
\| \| \| \| \| \| \| \| \| \|	Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664
*	[test] Fix tests when run on windows after SVN r369426. NFC.	Martin Storsjo	2019-08-20	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When running tests on windows, invoking "llc -march=<arch>" will implicitly use windows as the target os, making these tests misbehave after this change. Fix the issue by using more specific -mtriple values instead of plain -march in these tests. This should hopefully fix buildbot failures like http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/9816. llvm-svn: 369443
*	[CodeGen] Add EarlyIfConvert test missed in previous commit	Thomas Raoux	2019-08-20	1	-0/+81
\| \| \| \|	llvm-svn: 369405
*	[Hexagon] Generate min/max instructions for 64-bit vectors	Krzysztof Parzyszek	2019-08-16	2	-1/+203
\| \| \| \|	llvm-svn: 369124
*	[Hexagon] Fix instruction selection for vselect v4i8	Krzysztof Parzyszek	2019-08-15	1	-0/+9
\| \| \| \|	llvm-svn: 369040
*	[Hexagon] Generate vector min/max for HVX	Krzysztof Parzyszek	2019-08-15	4	-180/+868
\| \| \| \|	llvm-svn: 369014
*	Revert r368339 "[MBP] Disable aggressive loop rotate in plain mode"	Hans Wennborg	2019-08-12	4	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It caused assertions to fire when building Chromium: lib/CodeGen/LiveDebugValues.cpp:331: bool {anonymous}::LiveDebugValues::OpenRangesSet::empty() const: Assertion `Vars.empty() == VarLocs.empty() && "open ranges are inconsistent"' failed. See https://crbug.com/992871#c3 for how to reproduce. > Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. > > To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. > > Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368579
*	[MBP] Disable aggressive loop rotate in plain mode	Guozhi Wei	2019-08-08	4	-6/+4
\| \| \| \| \| \| \| \| \| \|	Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368339
*	[Hexagon] Custom-lower UADDO(x, 1) and USUBO(x, 1)	Krzysztof Parzyszek	2019-07-01	1	-0/+37
\| \| \| \|	llvm-svn: 364790
*	[Hexagon] Rework VLCR algorithm	Krzysztof Parzyszek	2019-07-01	1	-0/+82
\| \| \| \| \| \| \| \|	Add code to catch pattern for commutative instructions for VLCR. Patch by Suyog Sarda. llvm-svn: 364770
*	Rename ExpandISelPseudo->FinalizeISel, delay register reservation	Matt Arsenault	2019-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
*	[SCEV] Use NoWrapFlags when expanding a simple mul	Sam Parker	2019-06-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Second functional change following on from rL362687. Pass the NoWrapFlags from the MulExpr to InsertBinop when we're generating a shl or mul. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363540
*	[lit] Delete empty lines at the end of lit.local.cfg NFC	Fangrui Song	2019-06-17	1	-1/+0
\| \| \| \|	llvm-svn: 363538
*	[MBP] Move a latch block with conditional exit and multi predecessors to top ↵	Guozhi Wei	2019-06-14	4	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of loop Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another common case is: * a latch block * it has two successors, one is loop header, another is exit * it has more than one predecessors If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch. Differential Revision: https://reviews.llvm.org/D43256 llvm-svn: 363471
*	Revert "[SCEV] Use wrap flags in InsertBinop"	Benjamin Kramer	2019-06-06	1	-1/+1
\| \| \| \| \| \|	This reverts commit r362687. Miscompiles llvm-profdata during selfhost. llvm-svn: 362699
*	[SCEV] Use wrap flags in InsertBinop	Sam Parker	2019-06-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	If the given SCEVExpr has no (un)signed flags attached to it, transfer these to the resulting instruction or use them to find an existing instruction. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 362687
*	UpdateTestChecks: hexagon support	Roman Lebedev	2019-06-05	3	-14/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These tests are being affected by an upcoming patch, so having an understandable (autogenerated) diff is helpful. This target, again, prefers `-march`: ``` llvm/test/CodeGen/Hexagon$ grep -r triple \| wc -l 467 llvm/test/CodeGen/Hexagon$ grep -r march \| wc -l 1167 ``` Reviewers: RKSimon, kparzysz Reviewed By: kparzysz Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62867 llvm-svn: 362605
*	[NFC] Make tests more robust for new optimizations	David Bolvansky	2019-05-25	4	-8/+8
\| \| \| \|	llvm-svn: 361697
*	[llvm-readobj] Change -long-option to --long-option in tests. NFC	Fangrui Song	2019-05-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	We use both -long-option and --long-option in tests. Switch to --long-option for consistency. In the "llvm-readelf" mode, -long-option is discouraged as it conflicts with grouped short options and it is not accepted by GNU readelf. While updating the tests, change llvm-readobj -s to llvm-readobj -S to reduce confusion ("s" is --section-headers in llvm-readobj but --symbols in llvm-readelf). llvm-svn: 359649
*	[AsmPrinter] refactor to support %c w/ GlobalAddress'	Nick Desaulniers	2019-04-26	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overriden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 llvm-svn: 359337
*	Enable LoopVectorization by default.	Alina Sbirlea	2019-04-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167
*	[DAGCombiner] Combine OR as ADD when no common bits are set	Bjorn Pettersson	2019-04-23	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The DAGCombiner is rewriting (canonicalizing) an ISD::ADD with no common bits set in the operands as an ISD::OR node. This could sometimes result in "missing out" on some combines that normally are performed for ADD. To be more specific this could happen if we already have rewritten an ADD into OR, and later (after legalizations or combines) we expose patterns that could have been optimized if we had seen the OR as an ADD (e.g. reassociations based on ADD). To make the DAG combiner less sensitive to if ADD or OR is used for these "no common bits set" ADD/OR operations we now apply most of the ADD combines also to an OR operation, when value tracking indicates that the operands have no common bits set. Reviewers: spatel, RKSimon, craig.topper, kparzysz Reviewed By: spatel Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59758 llvm-svn: 358965
*	[LSR] Limit the recursion for setup cost	David Green	2019-04-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958
*	[AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC	Nick Desaulniers	2019-04-17	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine. Reviewers: peter.smith, echristo Reviewed By: echristo Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60803 llvm-svn: 358603
*	[Hexagon] Fix reuse bug in Vector Loop Carried Reuse pass	Brendon Cahoon	2019-04-12	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Hexagon Vector Loop Carried Reuse pass was allowing reuse between two shufflevectors with different masks. The reason is that the masks are not instruction objects, so the code that checks each operand just skipped over the operands. This patch fixes the bug by checking if the operands are the same when they are not instruction objects. If the objects are not the same, then the code assumes that reuse cannot occur. Differential Revision: https://reviews.llvm.org/D60019 llvm-svn: 358292
*	[Pipeliner] Fix incorrect loop carried dependence calculation	Brendon Cahoon	2019-04-11	3	-0/+174
\| \| \| \| \| \| \| \| \| \| \| \|	The isLoopCarriedDep function does not correctly compute loop carried dependences when the array index offset is negative or the stride is smallar than the access size. Patch by Denis Antrushin. Differential Revision: https://reviews.llvm.org/D60135 llvm-svn: 358233
*	[Hexagon] Remove fcmp undef from reduced tests	Simon Pilgrim	2019-03-29	1	-2/+2
\| \| \| \| \| \| \| \|	Pre-commit for D60006 (Add fcmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @kparzysz (Krzysztof Parzyszek) llvm-svn: 357301
*	Add more rotate tests, including ORs of rotates	Krzysztof Parzyszek	2019-03-21	2	-0/+114
\| \| \| \| \| \|	This is a part of https://reviews.llvm.org/D47735. llvm-svn: 356683
*	[Hexagon] Remove icmp undef from reduced tests	Simon Pilgrim	2019-03-15	10	-18/+18
\| \| \| \| \| \| \| \|	Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @kparzysz (Krzysztof Parzyszek) llvm-svn: 356267
*	[LSR] Attempt to increase the accuracy of LSR's setup cost	David Green	2019-03-07	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some loops, we end up generating loop induction variables that look like: {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1} As opposed to the simpler: {(zext i16 (%i0 * %i1) to i32),+,-1} i.e we count up from -limit to 0, not the simpler counting down from limit to 0. This is because the scores, as LSR calculates them, are the same and the second is filtered in place of the first. We end up with a redundant SUB from 0 in the code. This patch tries to make the calculation of the setup cost a little more thoroughly, recursing into the scev members to better approximate the setup required. The cost function for comparing LSR costs is: return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds, C1.ScaleCost, C1.ImmCost, C1.SetupCost) < std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds, C2.ScaleCost, C2.ImmCost, C2.SetupCost); So this will only alter results if none of the other variables turn out to be different. Differential Revision: https://reviews.llvm.org/D58770 llvm-svn: 355597
*	[Hexagon] Avoid creating 5-instruction packets with vgather pseudos	Krzysztof Parzyszek	2019-03-06	1	-0/+22
\| \| \| \| \| \| \|	Change the resource usage of the vgather pseudos from SLOT0+LD to SLOT0+SLOT1. llvm-svn: 355524