bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[ARM] Use preferred alignment for constants in promoteToConstantPool.	Eli Friedman	2018-09-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This mostly affects IR generated by non-clang frontends because clang generally sets the alignment of globals explicitly. Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 . (-arm-promote-constant is currently off by default, and it stays off with this patch. I'll look into turning it on again when all the known issues are fixed.) Differential Revision: https://reviews.llvm.org/D51469 llvm-svn: 343359
*	[ARM] Share predecessor bookkeeping in CombineBaseUpdate. NFCI.	Nirav Dave	2018-09-25	1	-2/+9
\| \| \| \|	llvm-svn: 342987
*	[AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR	Alex Bradbury	2018-09-19	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550
*	ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.	Tim Northover	2018-09-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \|	The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127
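The "one byte per loop" estimate in the message above can be checked with a little arithmetic. This is a minimal sketch, assuming Thumb code whose instructions start on 2-byte boundaries, so a loop header sits at an address that is 0 or 2 modulo 4 with equal likelihood. `paddingFor` and `averagePadding` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): bytes of padding needed to move
// `addr` up to the next multiple of `align`, where `align` is a power of two.
constexpr std::uint32_t paddingFor(std::uint32_t addr, std::uint32_t align) {
  return (align - (addr & (align - 1))) & (align - 1);
}

// Assumption: a loop header is equally likely to land at 0 or 2 modulo 4,
// so the expected cost of 4-byte alignment is the mean of the two cases.
constexpr double averagePadding =
    (paddingFor(0, 4) + paddingFor(2, 4)) / 2.0;

static_assert(paddingFor(0x1000, 4) == 0, "already 4-byte aligned");
static_assert(paddingFor(0x1002, 4) == 2, "one halfword of padding");
```

Under that assumption the padding is either 0 or 2 bytes, which averages to the single byte per loop quoted in the message.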
*	[MinGW] Move code for indicating "potentially not DSO local" into shouldAssumeDSOLocal. NFC.	Martin Storsjo	2018-09-04	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Windows, if shouldAssumeDSOLocal returns false, it's either a dllimport reference, or a reference that we should treat as non-local and create a stub for. Clean up AArch64Subtarget::ClassifyGlobalReference a little while touching the flag handling relating to dllimport. Differential Revision: https://reviews.llvm.org/D51590 llvm-svn: 341402
*	[MinGW] [ARM] Add stubs for potential automatic dllimported variables	Martin Storsjo	2018-08-31	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \|	The runtime pseudo relocations can't handle the ARM format embedded addresses in movw/movt pairs. By using stubs, the potentially dllimported addresses can be touched up by the runtime pseudo relocation framework. Differential Revision: https://reviews.llvm.org/D51450 llvm-svn: 341176
*	[ARM] Lower llvm.ctlz.i32 to a libcall when clz is not available.	Eli Friedman	2018-08-22	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	The inline sequence is very long (about 70 bytes on Thumb1), so it's not really a good idea to inline it, especially when optimizing for size. Differential Revision: https://reviews.llvm.org/D47917 llvm-svn: 340458
*	[ARM] Handle all-ones mask explicitly in targetShrinkDemandedConstant.	Eli Friedman	2018-08-22	1	-4/+11
\| \| \| \| \| \| \| \| \| \| \|	This avoids a potential infinite loop setting and unsetting bits in the mask. Reduced from a failure on the polly-aosp bot. Differential Revision: https://reviews.llvm.org/D51066 llvm-svn: 340446
*	[AArch64] Add Tiny Code Model for AArch64	David Green	2018-08-22	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the plumbing for the Tiny code model for the AArch64 backend. This, instead of loading addresses through the normal ADRP;ADD pair used in the Small model, uses a single ADR. The 21 bit range of an ADR means that the code and its statically defined symbols need to be within 1MB of each other. This makes it mostly interesting for embedded applications where we want to fit as much as we can in as small a space as possible. Differential Revision: https://reviews.llvm.org/D49673 llvm-svn: 340397
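The 1MB limit mentioned above follows from ADR's signed 21-bit byte offset. The sketch below works through that range. `reachableByAdr` and the constants are hypothetical names for illustration, not part of the AArch64 backend.

```cpp
#include <cstdint>

// ADR encodes a signed 21-bit byte offset relative to the instruction address.
constexpr int kAdrImmBits = 21;
constexpr std::int64_t kAdrMin = -(std::int64_t{1} << (kAdrImmBits - 1));
constexpr std::int64_t kAdrMax = (std::int64_t{1} << (kAdrImmBits - 1)) - 1;

// Hypothetical helper (not an LLVM API): can a single ADR located at `pc`
// produce the address `target`?
constexpr bool reachableByAdr(std::uint64_t pc, std::uint64_t target) {
  return static_cast<std::int64_t>(target - pc) >= kAdrMin &&
         static_cast<std::int64_t>(target - pc) <= kAdrMax;
}

static_assert(kAdrMax + 1 == 1024 * 1024, "forward reach is just under 1 MiB");
static_assert(kAdrMin == -1024 * 1024, "backward reach is exactly 1 MiB");
```

A symbol exactly 1 MiB ahead of the instruction is already out of range, which is why the code and its statically defined symbols have to sit close together under this model.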
*	[SDAG] Remove the reliance on MI's allocation strategy for	Chandler Carruth	2018-08-14	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	`MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be surprised at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740
*	[ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly.	Eli Friedman	2018-08-14	1	-3/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Intentionally excluding nodes from the DAGCombine worklist is likely to lead to weird optimizations and infinite loops, so it's generally a bad idea. To avoid the infinite loops, fix DAGCombine to use the isDesirableToCommuteWithShift target hook before performing the transforms in question, and implement the target hook in the ARM backend to disable the transforms in question. Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a reduced testcase for that bug. But we should have sufficient test coverage for PerformSHLSimplify given that we're not playing weird tricks with the worklist. I can try to bugpoint it if necessary, though.) Differential Revision: https://reviews.llvm.org/D50667 llvm-svn: 339734
*	Fix unused lambda capture warning from r339472.	Eli Friedman	2018-08-10	1	-1/+1
\| \| \| \|	llvm-svn: 339479
*	[ARM] Adjust AND immediates to make them cheaper to select.	Eli Friedman	2018-08-10	1	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LLVM normally prefers to minimize the number of bits set in an AND immediate, but that doesn't always match the available ARM instructions. In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer a two-instruction sequence movs+ands or movs+bics. Some potential improvements outlined in ARMTargetLowering::targetShrinkDemandedConstant, but seems to work pretty well already. The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX instruction due to a larger-than-expected mask. (It's orthogonal, in some sense, but as far as I can tell it's either impossible or nearly impossible to reproduce the bug without this change.) According to my testing, this seems to consistently improve codesize by a small amount by forming bic more often for ISD::AND with an immediate. Differential Revision: https://reviews.llvm.org/D50030 llvm-svn: 339472
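The idea of swapping an AND immediate for a cheaper one can be sketched as follows. A replacement mask is safe whenever it agrees with the original on every bit of the result that is actually observed. This is an illustration only, not the code from D50030; `equivalentMask` and `pickCheaperMask` are hypothetical helpers, and the sketch only considers the two masks that correspond to uxtb and uxth.

```cpp
#include <cstdint>

// Hypothetical helper: `candidate` can replace `mask` if the two differ only
// in bits outside `demanded` (the bits of the AND result that are observed).
constexpr bool equivalentMask(std::uint32_t mask, std::uint32_t demanded,
                              std::uint32_t candidate) {
  return ((mask ^ candidate) & demanded) == 0;
}

// Prefer masks that map to a single Thumb1 instruction: 0xFF corresponds to
// uxtb and 0xFFFF to uxth. Falls back to the original mask otherwise.
constexpr std::uint32_t pickCheaperMask(std::uint32_t mask,
                                        std::uint32_t demanded) {
  return equivalentMask(mask, demanded, 0xFFu)     ? 0xFFu
         : equivalentMask(mask, demanded, 0xFFFFu) ? 0xFFFFu
                                                   : mask;
}
```

For example, with a mask of `0x7F` and only the low four bits demanded, `0xFF` is an acceptable substitute even though it sets more bits than the original.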
*	[ARM] FP16: support vector INT_TO_FP and FP_TO_INT	Sjoerd Meijer	2018-08-08	1	-7/+35
\| \| \| \| \| \| \| \|	This adds codegen support for the different vcvt_f16 variants. Differential Revision: https://reviews.llvm.org/D50393 llvm-svn: 339227
*	[ARM] FP16: support the vector vmin and vmax variants	Sjoerd Meijer	2018-08-08	1	-0/+12
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D50238 llvm-svn: 339221
*	Remove trailing space	Fangrui Song	2018-07-30	1	-5/+5
\| \| \| \| \| \|	sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
*	[ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1.	Eli Friedman	2018-07-25	1	-0/+81
\| \| \| \| \| \| \| \| \| \| \| \| \|	Saves materializing the immediate for the "ands". Corresponding patterns exist for lsrs+lsls, but that seems less common in practice. Now implemented as a DAGCombine. Differential Revision: https://reviews.llvm.org/D49585 llvm-svn: 337945
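The equivalence behind this combine can be sketched in plain C++: extracting a bit field with a shift plus a mask gives the same result as a left shift followed by a logical right shift, and the second form needs no mask constant. The function names and the `(c, n)` parameterisation are assumptions made for illustration; the sketch requires `n >= 1`, `c >= 1` and `n + c <= 32`.

```cpp
#include <cstdint>

// Shift-then-mask form: take n bits of x starting at bit c.
// On Thumb1 the mask may have to be materialized in a register first.
constexpr std::uint32_t fieldWithAnd(std::uint32_t x, unsigned c, unsigned n) {
  return (x >> c) & ((std::uint32_t{1} << n) - 1);
}

// Two-shift form (lsls then lsrs): push the field to the top of the register,
// then bring it back down, clearing everything else along the way.
constexpr std::uint32_t fieldWithShifts(std::uint32_t x, unsigned c, unsigned n) {
  return (x << (32 - n - c)) >> (32 - n);
}

static_assert(fieldWithAnd(0xDEADBEEFu, 8, 8) ==
                  fieldWithShifts(0xDEADBEEFu, 8, 8),
              "both forms extract the same field");
```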
*	ARM: stop explicitly marking armv7k libcalls as hard-float. NFC.	Tim Northover	2018-07-18	1	-7/+0
\| \| \| \| \| \| \|	Since the triple's default is hard float, the libcalls will already use VFP registers. llvm-svn: 337386
*	[ARM] Treat cmn immediates as legal in isLegalICmpImmediate.	Eli Friedman	2018-07-10	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	The original code attempted to do this, but the std::abs() call didn't actually do anything due to implicit type conversions. Fix the type conversions, and perform the correct check for negative immediates. This probably has very little practical impact, but it's worth fixing just to avoid confusion in the future, I think. Differential Revision: https://reviews.llvm.org/D48907 llvm-svn: 336742
*	[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses	Ivan A. Kosarev	2018-07-05	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
*	[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m.	Vadzim Dambrouski	2018-07-02	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: efriedma, rogfer01, javed.absar Reviewed By: efriedma, rogfer01 Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D48846 llvm-svn: 336144
*	[NEON] Support vldNq intrinsics in AArch32 (LLVM part)	Ivan A. Kosarev	2018-06-27	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for the q versions of the dup (load-to-all-lanes) NEON intrinsics, such as vld2q_dup_f16() for example. Currently, non-q versions of the dup intrinsics are implemented in clang by generating IR that first loads the elements of the structure into the first lane with the lane (to-single-lane) intrinsics, and then propagating it to other lanes. There are at least two problems with this approach. First, there are no double-spaced to-single-lane byte-element instructions. For example, there is no such instruction as 'vld2.8 { d0[0], d2[0] }, [r0]'. That means we cannot rely on the to-single-lane intrinsics and instructions to implement the q versions of the dup intrinsics. Note that to-all-lanes instructions do support all sizes of data items, including bytes. The second problem with the current approach is that we need a separate vdup instruction to propagate the structure to each lane. So for vld4q_dup_f16() we would need four vdup instructions in addition to the initial vld instruction. This patch introduces dup LLVM intrinsics and reworks handling of the currently supported (non-q) NEON dup intrinsics to expand them into those LLVM intrinsics, thus eliminating the need for using to-single-lane intrinsics and instructions. Additionally, this patch adds support for u64 and s64 dup NEON intrinsics. These are marked as AArch64-only in the ARM NEON Reference, but it seems there are no reasons to not support them in AArch32 mode. Please correct, if that is wrong.
That's what we generate with this patch applied:
vld2q_dup_f16:
    vld2.16 {d0[], d2[]}, [r0]
    vld2.16 {d1[], d3[]}, [r0]
vld3q_dup_f16:
    vld3.16 {d0[], d2[], d4[]}, [r0]
    vld3.16 {d1[], d3[], d5[]}, [r0]
vld4q_dup_f16:
    vld4.16 {d0[], d2[], d4[], d6[]}, [r0]
    vld4.16 {d1[], d3[], d5[], d7[]}, [r0]
Differential Revision: https://reviews.llvm.org/D48439 llvm-svn: 335733
*	[NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part)	Ivan A. Kosarev	2018-06-10	1	-0/+24
\| \| \| \| \| \| \| \| \|	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47447 llvm-svn: 334361
*	[NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part)	Ivan A. Kosarev	2018-06-02	1	-0/+18
\| \| \| \| \| \| \| \| \|	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47120 llvm-svn: 333825
*	Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)"	Ivan A. Kosarev	2018-06-02	1	-18/+0
\| \| \| \| \| \| \| \|	The LLVM part was committed instead of the Clang part. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333824
*	[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)	Ivan A. Kosarev	2018-06-02	1	-0/+18
\| \| \| \| \| \| \| \| \|	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333819
*	[ARM] Remove code handling ADDC/ADDE/SUBC/SUBE	Amaury Sechet	2018-05-30	1	-30/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This code is now dead as the ARM backend uses ADDCARRY/SUBCARRY/SETCCCARRY . Reviewers: rogfer01, efriedma, rengolin, javed.absar Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D47413 llvm-svn: 333544
*	[ARM] Enable SETCCCARRY lowering for Thumb1.	Eli Friedman	2018-05-29	1	-3/+1
\| \| \| \| \| \| \| \| \|	We've had Thumb1 support for ARMISD::SUBE for a while now, so this just works. Reduces codesize a bit for 64-bit integer comparisons. Differential Revision: https://reviews.llvm.org/D47387 llvm-svn: 333445
*	ARM: be conservative when asked load/store alignment of weird type.	Tim Northover	2018-05-21	1	-0/+4
\| \| \| \| \| \| \|	Chances are we'll be asked again after type legalization, but before that point it's better to claim misaligned accesses aren't allowed than to assert. llvm-svn: 332840
*	Rename DEBUG macro to LLVM_DEBUG.	Nicola Zaghen	2018-05-14	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
*	[ARM] Add support for SETCCCARRY instead of SETCCE	Amaury Sechet	2018-05-09	1	-5/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: As per title. SETCCE is deprecated and will eventually be removed. Reviewers: rogfer01, efriedma, rengolin, javed.absar Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D46512 llvm-svn: 331929
*	[ARM] Select result 1 from ConvertBooleanCarryToCarryFlag's result automatically. NFC	Amaury Sechet	2018-05-07	1	-4/+6
\| \| \| \| \| \| \| \|	The old behavior returned the value 0, which is error prone. llvm-svn: 331614
*	ARM: don't try to over-align large vectors as arguments.	Tim Northover	2018-05-03	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \|	By default LLVM thinks very large vectors get aligned to their size when passed across functions. Unfortunately no-one told the ARM backend so it doesn't trigger stack realignment and so accesses can cause the usual misalignment issues (e.g. a data abort). This changes the ABI alignment to the stack alignment, which in practice (and as a bonus) also coincides with the alignment "natural" vectors get. llvm-svn: 331451
*	Remove \brief commands from doxygen comments.	Adrian Prantl	2018-05-01	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
*	[ARM] FP16 vmaxnm/vminnm scalar instructions	Sjoerd Meijer	2018-04-13	1	-0/+5
\| \| \| \| \| \| \| \| \|	This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions. Differential Revision: https://reviews.llvm.org/D44675 llvm-svn: 330034
*	[ARM] FP16 VSEL codegen	Sjoerd Meijer	2018-04-11	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP needs to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788
*	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.	Craig Topper	2018-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there are only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806
*	[ARM] Support float literals under XO	Christof Douma	2018-03-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691
*	Fix layering by moving ValueTypes.h from CodeGen to IR	David Blaikie	2018-03-23	1	-1/+1
\| \| \| \| \| \|	ValueTypes.h is implemented in IR already. llvm-svn: 328397
*	Fix layering of MachineValueType.h by moving it from CodeGen to Support	David Blaikie	2018-03-23	1	-1/+1
\| \| \| \| \| \| \| \| \|	This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395
*	[ARM] Support float literals under XO	Christof Douma	2018-03-23	1	-12/+25
\| \| \| \| \| \| \| \| \| \| \| \| \|	When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into an integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313
*	[ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes	Martin Storsjo	2018-03-19	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the use of this attribute on ARM and AArch64 from SVN r325900 (where it was only checked for fixed stack allocations on ARM/AArch64, but for all stack allocations on X86). This also adds a testcase for the existing use of disabling the fixed stack probe with the attribute on ARM and AArch64. Differential Revision: https://reviews.llvm.org/D44291 llvm-svn: 327897
*	[ARM] Support for v4f16 and v8f16 vectors	Sjoerd Meijer	2018-03-19	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \|	This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which use v4f16 and v8f16 vector operands and return values. All the moving parts are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16 intrinsic. In a follow-up patch the rest of the intrinsics and tests will be added. Differential Revision: https://reviews.llvm.org/D44538 llvm-svn: 327839
*	[ARM] FP16 codegen support for VSEL	Sjoerd Meijer	2018-03-16	1	-0/+1
\| \| \| \| \| \| \| \| \|	This implements lowering of SELECT_CC for f16s, which enables codegen of VSEL with f16 types. Differential Revision: https://reviews.llvm.org/D44518 llvm-svn: 327695
*	[ARM] Fix for PR36577	Sjoerd Meijer	2018-03-07	1	-8/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Don't PerformSHLSimplify if the given node is used by a node that also uses a constant because we may get stuck in an infinite combine loop. bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36577 Patch by Sam Parker. Differential Revision: https://reviews.llvm.org/D44097 llvm-svn: 326882
*	[TLS] use emulated TLS if the target supports only this mode	Chih-Hung Hsieh	2018-02-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, a missing -emulated-tls flag would generate wrong TLS code for targets that support only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341
*	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations	Pablo Barrio	2018-02-28	1	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves. In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or orr, while in ARM/thumb2 mode, which has a flexible second operand, the shift can be folded into a single bic/orr instruction. In most cases this results in smaller code and possibly fewer branches, and in no case larger than before. Patch by Martin Svanfeldt Reviewers: fhahn, pbarrio, rogfer01 Reviewed By: pbarrio, rogfer01 Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 326333
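The bit-operation forms described above can be written out in C++ as a quick check of the identities. This is a sketch at the source level, not the DAG lowering itself; the function names are made up for illustration. It assumes `>>` on a negative signed value is an arithmetic shift, which is guaranteed from C++20 and holds on mainstream compilers before that.

```cpp
#include <cstdint>

// Assumption: arithmetic right shift, so the result is 0 for non-negative x
// and -1 (all ones) for negative x.
constexpr std::int32_t signMask(std::int32_t x) { return x >> 31; }

// x < 0 ? 0 : x  -- clear every bit when x is negative (shift + bic).
constexpr std::int32_t saturateLowTo0(std::int32_t x) {
  return x & ~signMask(x);
}

// x < -1 ? -1 : x  -- set every bit when x is negative (shift + orr).
// x == -1 is already all ones, so it maps to itself.
constexpr std::int32_t saturateLowToMinus1(std::int32_t x) {
  return x | signMask(x);
}
```

In ARM and Thumb2 mode the shift can ride along as the flexible second operand, which is where the single-instruction form in the message comes from.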
*	[MachineOperand][Target] MachineOperand::isRenamable semantics changes	Geoff Berry	2018-02-23	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were affected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931
*	[ARM] Lower BR_CC for f16	Sjoerd Meijer	2018-02-20	1	-2/+1
\| \| \| \| \| \| \| \|	This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616
*	[ARM] Materialise some boolean values to avoid a branch	Roger Ferrer Ibanez	2018-02-16	1	-10/+89
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form a != b ? x : 0 a == b ? 0 : x and that currently (e.g. in Thumb1) are emitted as branches. Differential Revision: https://reviews.llvm.org/D34515 llvm-svn: 325323
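One generic way to evaluate `a != b ? x : 0` without a branch is to turn the comparison into a 0/1 value arithmetically and then use it as a mask. The sketch below shows that idea in C++. It is not the DAG combine from D34515, and the function names are hypothetical.

```cpp
#include <cstdint>

// 1 when a != b, else 0. For d = a - b, the value d | -d has its top bit set
// exactly when d is non-zero, so shifting it down yields the boolean.
constexpr std::uint32_t notEqualAsBit(std::uint32_t a, std::uint32_t b) {
  return ((a - b) | (b - a)) >> 31;
}

// a != b ? x : 0, computed by widening the boolean to an all-ones or
// all-zeros mask and ANDing it with x.
constexpr std::uint32_t selectIfNotEqual(std::uint32_t a, std::uint32_t b,
                                         std::uint32_t x) {
  return x & (0u - notEqualAsBit(a, b));
}
```

The `a == b ? 0 : x` form from the message is the same selection with the arms written in the opposite order, so the same helper covers it.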