path: root/llvm/test/CodeGen/ARM
Commit message | Author | Age | Files | Lines
...
* [ARM GlobalISel] Support floating point for Thumb2 (Diana Picus, 2019-02-22, 3 files, -968/+996)
  This is exactly the same as arm mode, so for the instruction selector tests we just extract them to a new file and run with the same checks for both arm and thumb mode. For the legalizer we need to update the tests for soft float a bit, but only because BL and tBL are slightly different. We could be pedantic and check that we get a well-formed BL for arm mode and a tBL for thumb, but for the purposes of the legalizer test it's sufficient to just skip over the predicate operands in the checks. Also note that we have the pedantic checks in the divmod test, so we're covered.
  llvm-svn: 354665
* [ARM GlobalISel] Support G_FRAME_INDEX for Thumb2 (Diana Picus, 2019-02-21, 3 files, -31/+80)
  Same as arm mode.
  llvm-svn: 354579
* Revert 354564: [ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs (David Green, 2019-02-21, 1 file, -1/+2)
  I believe it's causing bootstrap failures for A32 code. I'll take a look at what's wrong.
  llvm-svn: 354569
* [ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs (David Green, 2019-02-21, 1 file, -2/+1)
  This adds a number of missing Thumb1 opcodes so that the peephole optimiser can remove redundant CMP instructions.
  Differential Revision: https://reviews.llvm.org/D57833
  llvm-svn: 354564
* [ARM] Negative constants mishandled in ARM CGP (Sam Parker, 2019-02-21, 1 file, -0/+17)
  During type promotion, sometimes we convert an add with a negative constant into a sub with a positive constant. The loop that performs this transformation has two issues:
  - it iterates over a set, causing non-determinism.
  - it breaks, instead of continuing, when it finds the first non-negative operand.
  Differential Revision: https://reviews.llvm.org/D58452
  llvm-svn: 354557
* [ARM GlobalISel] Support G_PHI for Thumb2 (Diana Picus, 2019-02-19, 3 files, -131/+181)
  Same as arm mode.
  llvm-svn: 354310
* [ARM GlobalISel] Support branches for Thumb2 (Diana Picus, 2019-02-15, 3 files, -35/+83)
  Just like arm mode, but with different opcodes.
  llvm-svn: 354113
* [ARM CGP] Fix ConvertTruncs (Sam Parker, 2019-02-15, 2 files, -95/+106)
  ConvertTruncs is used to replace a trunc for an AND mask, however this function wasn't working as expected. By performing the change later, we can create a wide type integer mask instead of a narrow -1 value, which could then be simply removed (incorrectly). Because we now perform this action later, it's necessary to cache the trunc type before we perform the promotion.
  Differential Revision: https://reviews.llvm.org/D57686
  llvm-svn: 354108
* [ARM GlobalISel] Support G_SELECT for Thumb2 (Diana Picus, 2019-02-13, 3 files, -55/+137)
  Same as arm mode, but slightly different opcodes.
  llvm-svn: 353938
* [LegalizeTypes] Expand FNEG to bitwise op for IEEE FP types (Ana Pazos, 2019-02-11, 1 file, -0/+61)
  Summary: Except for the custom floating point types x86_fp80 and ppc_fp128, expand Y = FNEG(X) to Y = X ^ sign mask to avoid a library call. Using a bitwise operation can improve code size and performance.
  Reviewers: efriedma
  Reviewed By: efriedma
  Subscribers: efriedma, kpn, arsenm, eli.friedman, javed.absar, rbar, johnrusso, simoncook, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, asb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D57875
  llvm-svn: 353757
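  A minimal C-level sketch of the bit trick behind this expansion (hypothetical helper, not the in-tree code): negating an IEEE-754 float is just flipping its sign bit.

      #include <stdint.h>
      #include <string.h>

      /* fneg(x) for an IEEE-754 float: XOR the sign bit, i.e. Y = X ^ SignMask. */
      float fneg_via_xor(float x) {
          uint32_t bits;
          memcpy(&bits, &x, sizeof bits);  /* reinterpret the float as raw bits */
          bits ^= 0x80000000u;             /* flip the sign bit */
          memcpy(&x, &bits, sizeof bits);
          return x;
      }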
* [ARM] Add v8m.base pattern for add negative imm (Sam Parker, 2019-02-11, 1 file, -13/+41)
  The v8m.base ISA contains movw, which can operate on an unsigned 16-bit value. Add the pattern that converts an add with a negative value, that could fit into 16 bits when negated, into a sub with that positive value.
  Differential Revision: https://reviews.llvm.org/D57942
  llvm-svn: 353692
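  A rough C-level illustration of the case this targets (constant chosen arbitrarily for the sketch):

      /* DAG canonicalisation rewrites the subtraction as x + (-43981). On
       * v8-M Baseline the positive value 43981 fits a single MOVW, so keeping
       * it positive and emitting a SUB is cheaper than materialising the
       * negative constant. Illustrative only. */
      unsigned sub_imm16(unsigned x) {
          return x - 43981u;
      }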
* [NFC][ARM] Simplify loop-indexing codegen test (Sam Parker, 2019-02-11, 1 file, -107/+34)
  Remove unnecessary offset checks and CHECK-BASE checks, and add some extra -NOT checks and TODO comments.
  llvm-svn: 353689
* [ARM] LoadStoreOptimizer: reorder limit (Sjoerd Meijer, 2019-02-11, 1 file, -1/+9)
  The whole design of generating LDMs/STMs is fragile and unreliable: it depends on rescheduling here in the LoadStoreOptimizer that isn't register pressure aware, and on regalloc that isn't aware of generating LDMs/STMs. This patch adds a (hidden) option to control the total number of instructions that can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded constant, but at least it allows easier experimentation with different values for now. Ideally we calculate this reorder limit based on some heuristics, and take register pressure into account. I might be looking into that next.
  Differential Revision: https://reviews.llvm.org/D57954
  llvm-svn: 353678
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X)) (Nemanja Ivanovic, 2019-02-08, 1 file, -0/+70)
  The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case, which is also not sensitive to signed zeros.
  Patch by Whitney Tsang (Whitney)
  Differential revision: https://reviews.llvm.org/D57434
  llvm-svn: 353557
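  The identity behind the combine, written out as a small C sketch (function name is made up):

      #include <math.h>

      /* x^0.75 = x^(1/2) * x^(1/4), so one pow call becomes two square roots. */
      double pow_three_quarters(double x) {
          double r = sqrt(x);   /* x^(1/2) */
          return r * sqrt(r);   /* x^(1/2) * x^(1/4) = x^(3/4) */
      }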
* [LSR] Generate cross iteration indexes (Sam Parker, 2019-02-07, 3 files, -2/+1502)
  Modify GenerateConstantOffsetsImpl to create offsets that can be used by indexed addressing modes. If formulae can be generated which result in the constant offset being the same size as the recurrence, we can generate a pre-indexed access. This allows the pointer to be updated via the single pre-indexed access so that (hopefully) no add/subs are required to update it for the next iteration. For small cores, this can significantly improve the performance of DSP-like loops.
  Differential Revision: https://reviews.llvm.org/D55373
  llvm-svn: 353403
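  A sketch of the kind of loop this targets (hypothetical example): when the constant offset matches the recurrence step, the pointer update can be folded into a pre-indexed access such as "ldrh r1, [r0, #2]!".

      /* Each iteration advances the pointer by the access size, so the update
       * can be folded into the memory access itself (pre-indexed addressing). */
      void double_samples(short *p, int n) {
          for (int i = 0; i < n; ++i)
              p[i] = (short)(p[i] * 2);
      }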
* [ARM GlobalISel] Support G_ICMP for Thumb2 (Diana Picus, 2019-02-07, 3 files, -92/+436)
  Mark as legal and use the t2* equivalents of the arm mode instructions, e.g. t2CMPrr instead of plain CMPrr.
  llvm-svn: 353392
* [ARM GlobalISel] Support G_GEP for Thumb2 (Diana Picus, 2019-02-05, 3 files, -27/+59)
  Same as ARM, but use a different opcode in the instruction selection.
  llvm-svn: 353151
* GlobalISel: Enforce operand types for constants (Matt Arsenault, 2019-02-04, 4 files, -10/+10)
  A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types.
  llvm-svn: 353113
* [CodeGen] Don't scavenge non-saved regs in exception throwing functions (Oliver Stannard, 2019-02-01, 1 file, -0/+47)
  Previously, LiveRegUnits was assuming that if a block has no successors and does not return, then no registers are live at the end of it (because the end of the block is unreachable). This was causing the register scavenger to use callee-saved registers to materialise stack frame addresses without saving them in the prologue. This would normally be fine, because the end of the block is unreachable, but this is not legal if the block ends by throwing a C++ exception. If this happens, the scratch register will be modified, but its previous value won't be preserved, so it doesn't get restored by the exception unwinder.
  Differential revision: https://reviews.llvm.org/D57381
  llvm-svn: 352844
* [ARM] Thumb2: ConstantMaterializationCost (Sjoerd Meijer, 2019-01-31, 1 file, -67/+65)
  Constants can also be materialised using the negated value and an MVN, and this case seems to have been missed for Thumb2. To check the constant materialisation costs, we now call getT2SOImmVal twice, once for the original constant and then also for its negated value, and this function checks if the constant can be either splatted or rotated. This was revealed by a test that optimises for minsize: instead of an LDR literal-pool load and a literal pool entry, just an MVN with an immediate is smaller (and also faster).
  Differential Revision: https://reviews.llvm.org/D57327
  llvm-svn: 352737
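  For illustration (constant picked by hand, not taken from the patch), the kind of value this helps with:

      /* 0xFFFFFF00 is not a valid Thumb2 modified immediate, but its bitwise
       * negation 0xFF is, so "MVN r0, #0xFF" can replace a literal-pool load. */
      unsigned negated_constant(void) {
          return 0xFFFFFF00u;
      }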
* [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS (Sjoerd Meijer, 2019-01-31, 1 file, -0/+32)
  And instead just generate a libcall. My motivating example on ARM was a simple "shl i64 %A, %B", for which the code bloat is quite significant. For other targets that also accept __int128/i128, such as AArch64 and X86, it is also beneficial for these cases to generate a libcall when optimising for minsize. On these 64-bit targets, the 64-bit shifts are of course unaffected because the SHIFT/SHIFT_PARTS lowering operation action is not set to custom/expand.
  Differential Revision: https://reviews.llvm.org/D57386
  llvm-svn: 352736
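  The C equivalent of the motivating example; at minsize on a 32-bit ARM target this can now become a single runtime call (presumably the ARM RT ABI helper __aeabi_llsl) rather than the SHIFT_PARTS expansion:

      /* Variable 64-bit shift on a 32-bit target. */
      unsigned long long shl64(unsigned long long a, int b) {
          return a << b;
      }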
* GlobalISel: Fix creating MMOs with align 0 (Matt Arsenault, 2019-01-31, 1 file, -3/+3)
  llvm-svn: 352712
* MIR: Reject non-power-of-4 alignments in MMO parsing (Matt Arsenault, 2019-01-30, 2 files, -5/+5)
  llvm-svn: 352686
* [ARM] Use sub for negative offset load/store in thumb1 (David Green, 2019-01-29, 1 file, -36/+24)
  This attempts to optimise negative values used in load/store operands a little. We currently try to select them as rr, materialising the negative constant using a MOV/MVN pair. This instead selects ri with an immediate of 0, forcing the add node to become a simpler sub.
  Differential Revision: https://reviews.llvm.org/D57121
  llvm-svn: 352475
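  A C-level sketch of the situation (offset chosen only for illustration):

      /* The -120-byte offset used to be built with a MOV/MVN pair and an
       * rr-form load; selecting an ri load with offset 0 lets the address
       * computation become a plain subtract of 120. */
      int load_below(int *p) {
          return p[-30];
      }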
* [ARM] Add extra testcases for D57121. NFC (David Green, 2019-01-29, 1 file, -0/+482)
  llvm-svn: 352472
* [ARM GlobalISel] Support integer division for Thumb2 (Diana Picus, 2019-01-28, 2 files, -39/+124)
  Support G_SDIV, G_UDIV, G_SREM and G_UREM. The only significant difference between arm and thumb mode is that we need to check a different subtarget feature.
  llvm-svn: 352346
* [ARM GlobalISel] Support shifts for Thumb2 (Diana Picus, 2019-01-25, 6 files, -518/+596)
  Same as ARM. On this occasion we split some of the instruction select tests for more complicated instructions into their own files, so we can reuse them for ARM and Thumb mode. Likewise for the legalizer tests.
  llvm-svn: 352188
* [GISel]: Change how CSE is enabled by default for each pass (Aditya Nandakumar, 2019-01-24, 4 files, -12/+12)
  https://reviews.llvm.org/D57178
  Now add a hook in TargetPassConfig to query if CSE needs to be enabled. By default this hook returns false only for the O0 opt level, but this can be overridden by the target. As a consequence of the default of enabled for non-O0, a few tests needed to be updated to not use CSE (by passing -O0 on the run line).
  reviewed by: arsenm
  llvm-svn: 352126
* [ARM][CGP] Check trunc type before replacing (Sam Parker, 2019-01-23, 1 file, -0/+44)
  In the last stage of type promotion, we replace any zext that uses a new trunc with the operand of the trunc. This is okay when we only allowed one type to be optimised, but now it's the case that the trunc may be needed to produce a more narrow type than the one we were optimising for. So we need to check this before doing the replacement.
  Differential Revision: https://reviews.llvm.org/D57041
  llvm-svn: 351935
* [DAGCombine] Enable more pre-indexed stores (Sam Parker, 2019-01-23, 1 file, -0/+186)
  The current check in CombineToPreIndexedLoadStore is too conservative, preventing a pre-indexed store when the base pointer is a predecessor of the value being stored. Instead, we should check the pointer operand of the store.
  Differential Revision: https://reviews.llvm.org/D56719
  llvm-svn: 351933
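  A rough sketch of the previously rejected shape, purely illustrative and with made-up names:

      /* The value being stored is derived from the base pointer, which the
       * old check conservatively treated as blocking the pre-indexed store. */
      void tag_slots(int **p, int n) {
          for (int i = 0; i < n; ++i) {
              p += 2;
              *p = (int *)p;   /* store a value computed from the pointer itself */
          }
      }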
* [ARM GlobalISel] Allow calls to varargs functions (Diana Picus, 2019-01-17, 2 files, -7/+86)
  Allow varargs functions to be called, both in arm and thumb mode. This boils down to choosing the correct calling convention, which we can easily test by making sure arm_aapcscc is used instead of arm_aapcs_vfpcc when the callee is variadic.
  llvm-svn: 351424
* [DAGCombine] Fix ReduceLoadWidth for shifted offsets (Sam Parker, 2019-01-16, 1 file, -0/+36)
  ReduceLoadWidth can trigger when a shifted mask is used, and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses.
  Differential Revision: https://reviews.llvm.org/D50432
  llvm-svn: 351310
* Remove irrelevant references to legacy git repositories from compiler identification lines in test-cases (James Y Knight, 2019-01-15, 2 files, -3/+3)
  (Doing so only because it's then easier to search for references which are actually important and need fixing.)
  llvm-svn: 351200
* [ARM GlobalISel] Import MOVi32imm into GlobalISel (Diana Picus, 2019-01-14, 1 file, -0/+21)
  Make it possible for TableGen to produce code for selecting MOVi32imm. This allows reasonably recent ARM targets to select a lot more constants than before. We achieve this by adding GISelPredicateCode to arm_i32imm. It's impossible to use the exact same code for both DAGISel and GlobalISel, since one uses "Subtarget->" and the other "STI." to refer to the subtarget. Moreover, in GlobalISel we don't have ready access to the MachineFunction, so we need to add a bit of code for obtaining it from the instruction that we're selecting. This is also the reason why it needs to remain a PatLeaf instead of the more specific IntImmLeaf.
  llvm-svn: 351056
* Replace "no-frame-pointer-*" function attributes with "frame-pointer" (Francis Visoiu Mistrih, 2019-01-14, 28 files, -108/+108)
  Part of the effort to refactor frame pointer code generation. We used to use two function attributes, "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf", to represent three kinds of frame pointer usage: (all) frames use frame pointer, (non-leaf) frames use frame pointer, (none) frames use frame pointer. This CL makes the idea explicit by using only one enum function attribute, "frame-pointer". Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as llc. "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still supported for easy migration to "frame-pointer".
  Tests are mostly updated with:
  // replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’
  grep -iIrnl '\-disable-fp-elim=false' * | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g"
  // replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’
  grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g"
  Patch by Yuanfang Chen (tabloid.adroit)!
  Differential Revision: https://reviews.llvm.org/D56351
  llvm-svn: 351049
* [AArch64] Create feature set for Exynos M4 (Evandro Menezes, 2019-01-11, 1 file, -1/+1)
  Complete the feature set for Exynos M4 and update test cases.
  llvm-svn: 350953
* [ARM] Add missing patterns for DSP muls (Sam Parker, 2019-01-08, 1 file, -32/+149)
  Using a PatLeaf for sext_16_node allowed matching smulbb and smlabb instructions once the operands had been sign extended. But we also need to use sext_inreg operands along with sext_16_node to catch a few more cases that enable us to remove the unnecessary sxth.
  Differential Revision: https://reviews.llvm.org/D55992
  llvm-svn: 350613
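  The source-level shape being matched, as a rough sketch (function name is made up):

      /* Both operands are sign-extended 16-bit values, so the multiply can be
       * selected as SMULBB without separate SXTH instructions. */
      int mul16x16(int a, int b) {
          return (short)a * (short)b;
      }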
* [ARM] ComputeKnownBits to handle extract vectors (Diogo N. Sampaio, 2019-01-07, 1 file, -23/+101)
  This patch adds the sign/zero extension done by vgetlane to ARM computeKnownBitsForTargetNode.
  Differential revision: https://reviews.llvm.org/D56098
  llvm-svn: 350553
* Regenerate test. (Simon Pilgrim, 2019-01-07, 1 file, -13/+44)
  Prep work towards enabling SimplifyDemandedBits vector support for TRUNCATE as discussed on D56118.
  llvm-svn: 350514
* [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR. (Florian Hahn, 2018-12-21, 1 file, -0/+61)
  Fixes PR35023.
  Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D55909
  llvm-svn: 349935
* [ARM GlobalISel] Support G_CONSTANT for Thumb2 (Diana Picus, 2018-12-19, 6 files, -179/+603)
  All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones.
  llvm-svn: 349610
* ARM: use acquire/release instruction variants when available. (Tim Northover, 2018-12-17, 2 files, -8/+153)
  These instructions (fairly) recently got split out into their own feature, so we should make CodeGen use them when available. The main change here is that the check used to be based on the triple, but now it's based on CPU features.
  llvm-svn: 349355
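  For context, a minimal C11 sketch of the kind of code affected, assuming an AArch32 target where the acquire/release instructions (LDA/STL) are available:

      #include <stdatomic.h>

      /* An acquire load can be a single LDA instead of a plain load plus barrier. */
      int load_acquire(_Atomic int *p) {
          return atomic_load_explicit(p, memory_order_acquire);
      }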
* [DAGCombiner] allow hoisting vector bitwise logic ahead of truncates (Sanjay Patel, 2018-12-16, 1 file, -2/+1)
  The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type: logic_op (truncate x), (truncate y) --> truncate (logic_op x, y). There are a bunch of other checks that should prevent doing this when it might be harmful. We already do this transform for scalars in this spot. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch.
  Differential Revision: https://reviews.llvm.org/D55448
  llvm-svn: 349303
* [ARM] make test immune to scalarization improvements; NFC (Sanjay Patel, 2018-12-14, 1 file, -2/+2)
  llvm-svn: 349177
* [ARM GlobalISel] Thumb2: casts between int and ptr (Diana Picus, 2018-12-14, 3 files, -47/+101)
  Mark as legal and add tests. Nothing special to do.
  llvm-svn: 349147
* [ARM GlobalISel] Remove duplicate test. NFCI (Diana Picus, 2018-12-14, 1 file, -51/+0)
  Fixup for r349026. I forgot to delete these test functions from the original file when I moved them to arm-legalize-exts.mir.
  llvm-svn: 349146
* [ARM GlobalISel] Allow simple binary ops in Thumb2 (Diana Picus, 2018-12-14, 3 files, -558/+696)
  Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM and Thumb2. Extract the legalizer tests for these opcodes into another file. Add tests for the instruction selector.
  llvm-svn: 349142
* [ARM GlobalISel] Support exts and truncs for Thumb2 (Diana Picus, 2018-12-13, 2 files, -0/+367)
  Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for them in the instruction selector. This uses handwritten code again because the patterns that are generated with TableGen are tuned for what the DAG combiner would produce and not for simple sext/zext nodes. Luckily, we only need to update the opcodes to use the Thumb2 variants, everything else can be reused from ARM.
  llvm-svn: 349026
* [CodeGen] Allow memcpy/memset to generate small overlapping stores. (Clement Courbet, 2018-12-13, 2 files, -10/+11)
  Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here.
  Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits
  Differential Revision: https://reviews.llvm.org/D55365
  llvm-svn: 349016
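  As an illustrative sketch (sizes chosen only for the example), the kind of lowering this unblocks:

      #include <string.h>

      /* A 15-byte memset can be emitted as two overlapping 8-byte stores
       * (at p and p+7) instead of an 8+4+2+1 sequence of stores. */
      void clear15(char *p) {
          memset(p, 0, 15);
      }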
* [ARM GlobalISel] Select load/store for Thumb2 (Diana Picus, 2018-12-12, 3 files, -24/+137)
  Unfortunately we can't use TableGen for this because it doesn't yet support predicates on the source pattern root. Therefore, add a bit of handwritten code to the instruction selector to handle the most basic cases. Also mark them as legal and extract their legalizer test cases to a new test file.
  llvm-svn: 348920