path: root/llvm/lib/Target
Commit message / Author / Age / Files / Lines
...
* R600/SI: Use unordered not equal instructionsMatt Arsenault2014-12-114-10/+19
| | | | llvm-svn: 224065
* [CodeGen] Add print and verify pass after each MachineFunctionPass by defaultMatthias Braun2014-12-1112-164/+109
| | | | | | | | | | | | | | | | | | | Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging, as you miss intermediate steps, and lets bugs stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option in all places where no verification was performed previously (because a lot of places don't actually pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass. llvm-svn: 224059
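As a rough sketch of the idea (the helper below is illustrative and assumed, not code from the patch; createMachineVerifierPass and PassManagerBase::add are the only pre-existing APIs used), a machine pass can be scheduled with an optional verifier run right behind it:

    #include "llvm/CodeGen/Passes.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/Pass.h"
    using namespace llvm;

    // Hypothetical helper: schedule a machine pass and, unless the caller opts
    // out, run the MachineVerifier immediately afterwards so breakage is
    // caught right where it is introduced.
    static void addMachinePassWithVerify(legacy::PassManagerBase &PM, Pass *P,
                                         bool VerifyAfter = true) {
      PM.add(P);
      if (VerifyAfter)
        PM.add(createMachineVerifierPass("After target pass"));
    }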
* This reverts commits r224043 and r224042.Rafael Espindola2014-12-1112-89/+144
| | | | | | check-llvm was failing. llvm-svn: 224045
* Enable machineverifier in debug mode for X86, ARM, AArch64, MipsMatthias Braun2014-12-114-20/+20
| | | | llvm-svn: 224043
* [CodeGen] Add print and verify pass after each MachineFunctionPass by defaultMatthias Braun2014-12-1112-164/+109
| | | | | | | | | | | | | | | | Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging, as you miss intermediate steps, and lets bugs stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option in all places where no verification was performed previously (because a lot of places don't actually pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. llvm-svn: 224042
* [Hexagon] Renaming classes in preparation for replacement.Colin LeMahieu2014-12-111-13/+13
| | | | llvm-svn: 224036
* ARM: convert isTargetIOS checks to isTargetDarwin.Tim Northover2014-12-114-12/+8
| | | | | | | | | | | The distinction is mostly useful in the front-end. By the time we get here, there are very few situations where we actually want different behaviour for Darwin and IOS (in fact Darwin mostly just exists in a few tests). So this should reduce any surprising weirdness for anyone using it. No functional change on anything anyone actually cares about. llvm-svn: 224035
* [PowerPC] Implement BuildSDIVPow2, lower i64 pow2 sdiv using sradiHal Finkel2014-12-113-30/+58
| | | | | | | | | | | | | | | | PPCISelDAGToDAG contained existing code to lower i32 sdiv by a power-of-2 using srawi/addze, but did not implement the i64 case. DAGCombine now contains a callback specifically designed for this purpose (BuildSDIVPow2), and part of the logic has been moved to an implementation of that callback. Doing this lowering using BuildSDIVPow2 likely does not matter, compared to handling everything in PPCISelDAGToDAG, for the positive divisor case, but the negative divisor case, which generates an additional negation, can potentially benefit from additional folding from DAGCombine. Now, both the i32 and the i64 cases have been implemented. Fixes PR20732. llvm-svn: 224033
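As a scalar illustration (a minimal sketch; the function below is not from the patch): signed division by 2^k truncates toward zero once a negative dividend is biased by the bits the shift would discard. The srawi/addze and sradi/addze sequences achieve the same rounding by adding the carried-out indication after the shift.

    #include <cstdint>

    // Signed divide by 2^k with C-style truncation (round toward zero).
    int64_t sdiv_pow2(int64_t x, unsigned k) {
      // For negative x, add the k low bits that the shift would discard; the
      // arithmetic shift then rounds toward zero instead of toward -infinity.
      int64_t bias = (x >> 63) & ((int64_t(1) << k) - 1);
      return (x + bias) >> k;
    }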
* [AVX512] Add support for 512b variable bit shift intrinsics.Cameron McInally2014-12-113-39/+43
| | | | llvm-svn: 224028
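A scalar model of what a per-element (variable) shift does, shown on a small array rather than a 512-bit register (illustrative only; nothing below is taken from the patch):

    #include <array>
    #include <cstddef>
    #include <cstdint>

    // vpsllvd-style semantics: each element is shifted by its own count, and
    // counts of 32 or more produce zero.
    std::array<uint32_t, 4> variable_shift_left(std::array<uint32_t, 4> v,
                                                const std::array<uint32_t, 4> &counts) {
      for (size_t i = 0; i < 4; ++i)
        v[i] = counts[i] < 32 ? v[i] << counts[i] : 0;
      return v;
    }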
* [Hexagon] Adding i64 <- i32, i32 sextw pattern.Colin LeMahieu2014-12-111-0/+2
| | | | llvm-svn: 224027
* [Hexagon] Adding encoding information for sign extend word instruction.Colin LeMahieu2014-12-114-27/+48
| | | | llvm-svn: 224026
* AVX-512: Added all forms of COMPRESS instructionElena Demikhovsky2014-12-115-6/+160
| | | | | | + intrinsics + tests llvm-svn: 224019
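For reference, the compress operation modeled in scalar C++ (a sketch only, not tied to the patch): elements selected by the mask are packed contiguously into the low positions of the result, and the remaining lanes come from the pass-through operand.

    #include <array>
    #include <cstddef>
    #include <cstdint>

    std::array<int32_t, 4> compress_model(const std::array<int32_t, 4> &src,
                                          uint8_t mask,
                                          const std::array<int32_t, 4> &passthru) {
      std::array<int32_t, 4> dst = passthru;
      size_t out = 0;
      for (size_t i = 0; i < 4; ++i)
        if (mask & (1u << i))
          dst[out++] = src[i]; // masked-in elements land contiguously from lane 0
      return dst;
    }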
* [mips][microMIPS] Implement CodeGen support for LI16 instruction.Jozef Kolek2014-12-112-4/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D5840 llvm-svn: 224017
* [X86] When converting movs to pushes, don't assume MOVmi operand is an actual immediateMichael Kuperstein2014-12-111-11/+11
| | | | | | | | This should fix PR21878. llvm-svn: 224010
* AVX-512: Fixed a bug in lowering setcc for MVT::i1 typeElena Demikhovsky2014-12-111-1/+4
| | | | llvm-svn: 224008
* test commit (spelling correction)Kumar Sukhani2014-12-111-1/+1
| | | | llvm-svn: 224007
* [X86] Add back AVX2 VR256 PMOVX patterns.Ahmed Bougacha2014-12-111-0/+16
| | | | | | | | | We can't reach those from zext, but other parts of the backend (the shuffle lowering) generate 256-bit VZEXT nodes. Fixes PR21876. llvm-svn: 223996
* ARM: correctly expand LDR-lit based globals.Tim Northover2014-12-102-1/+2
| | | | | | | | Quite a major error here: the expansions for the Pseudos with and without folded load were mixed up. Fortunately it only affects ARM-mode, when not using movw/movt, on Darwin. I'm guessing no-one actually uses that combination. llvm-svn: 223986
* [Hexagon] Adding combine ri/ir instructions.Colin LeMahieu2014-12-101-0/+26
| | | | llvm-svn: 223971
* [Hexagon] Adding encodings for JR class instructions. Updating compiler usages.Colin LeMahieu2014-12-109-185/+168
| | | | llvm-svn: 223967
* [AArch64] MachO large code-model: Materialize FP constants in code.Juergen Ributzka2014-12-103-0/+43
| | | | | | | | | | | | | | | In the large code model we have to first get the address of the GOT entry, load the address of the constant, and then load the constant itself. To avoid these loads and the GOT entry altogether, this commit changes the way FP constants are materialized in the large code model. The constants are now materialized in a GPR and then bitconverted/moved into the FPR. Reviewed by Tim Northover. Fixes rdar://problem/16572564. llvm-svn: 223941
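Conceptually (a sketch, not the backend code; the constant value is just an example), the new expansion builds the FP constant's bit pattern in an integer register and then moves it to the FP register, instead of loading it through a GOT-addressed constant pool:

    #include <cstdint>
    #include <cstring>

    double materialize_fp_constant() {
      // Bit pattern of 3.141592653589793, built as an integer immediate
      // (models materializing the value in a GPR, e.g. via movz/movk).
      uint64_t bits = 0x400921FB54442D18ULL;
      double d;
      std::memcpy(&d, &bits, sizeof d); // models the GPR -> FPR move
      return d;
    }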
* R600/SI: Use getTargetConstant in AdjustRegClassMarek Olsak2014-12-101-2/+2
| | | | llvm-svn: 223940
* [Hexagon] Adding JR class predicated call reg instructions.Colin LeMahieu2014-12-102-0/+36
| | | | llvm-svn: 223933
* Match new shuffle codegen for MOVHPD patternsSanjay Patel2014-12-101-0/+14
| | | | | | | | | | | | Add patterns to match SSE (shufpd) and AVX (vpermilpd) shuffle codegen when storing the high element of a v2f64. The existing patterns were only checking for an unpckh type of shuffle. http://llvm.org/bugs/show_bug.cgi?id=21791 Differential Revision: http://reviews.llvm.org/D6586 llvm-svn: 223929
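In source terms, the newly matched pattern corresponds to storing the high lane of a two-double vector (illustrative only; the vector typedef uses the Clang/GCC vector extension and is not from the patch):

    typedef double v2f64 __attribute__((vector_size(16)));

    void store_high_element(const v2f64 &v, double *out) {
      *out = v[1]; // extract lane 1 + store; lowered via shufpd/vpermilpd (or movhpd)
    }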
* [X86] Make a code path in EltsFromConsecutiveLoads work only on vectors it expectsMichael Kuperstein2014-12-101-1/+4
| | | | | | | | EltsFromConsecutiveLoads was apparently only ever called for 128-bit vectors, and assumed this implicitly. r223518 started calling it for AVX-sized vectors, causing the code path that had this assumption to crash. This adds a check to make this path fire only for 128-bit vectors. Differential Revision: http://reviews.llvm.org/D6579 llvm-svn: 223922
* [ARM] Combine base-updating/post-incrementing vector load/stores.Ahmed Bougacha2014-12-101-6/+38
| | | | | | | | | | | | | | | | | | We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 223862
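The kind of generic (non-intrinsic) code this now covers looks roughly like the following sketch (the vector typedef uses the Clang/GCC vector extension; nothing here is taken from the patch): a plain vector load followed by a base-pointer increment, which can fold into a single post-incremented VLD1.

    typedef float v4f32 __attribute__((vector_size(16)));

    v4f32 load_and_advance(const v4f32 *&ptr) {
      v4f32 value = *ptr; // generic 128-bit vector load
      ++ptr;              // base update: candidate for vld1.32 {...}, [r]!
      return value;
    }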
* [Hexagon] [NFC] Cleaning up unused classes.Colin LeMahieu2014-12-091-30/+0
| | | | llvm-svn: 223845
* [ARM] Factor out base-updating VLD/VST combiner function. NFC.Ahmed Bougacha2014-12-091-6/+15
| | | | | | | | | Move the combiner-state check into another function, add a few small comments, and use a more general type in a cast<>. In preparation for a future patch. llvm-svn: 223834
* [ARM] Move the store combiner function down. NFC.Ahmed Bougacha2014-12-091-141/+143
| | | | | | | And flip its final condition. In preparation for a future patch. llvm-svn: 223833
* [ARM] Also support v2f64 vld1/vst1.Ahmed Bougacha2014-12-091-0/+2
| | | | | | | | | It was missing from the VLD1/VST1 handling logic, even though the corresponding instructions exist (same form as v2i64). In preparation for a future patch. llvm-svn: 223832
* [Hexagon] Fixing broken tests.Colin LeMahieu2014-12-091-1/+2
| | | | llvm-svn: 223823
* [Hexagon] Updating rr/ri 32/64 transfer encodings and adding tests.Colin LeMahieu2014-12-099-178/+194
| | | | llvm-svn: 223821
* [FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'.Juergen Ributzka2014-12-091-1/+1
| | | | | | | | | The load/store value type is currently not available when lowering the memcpy intrinsic. Add the missing nullptr check to support this in 'computeAddress'. Fixes rdar://problem/19178947. llvm-svn: 223818
* [Hexagon] Adding word combine dot-new form and replacing old combine opcode.Colin LeMahieu2014-12-095-80/+51
| | | | llvm-svn: 223815
* [AVX512] Added lowering for VBROADCASTSS/SD instructions.Robert Khasanov2014-12-092-1/+56
| | | | | | | Lowering patterns were written via the avx512_broadcast_pat multiclass, since the pattern generates VBROADCAST and COPY_TO_REGCLASS nodes. Added lowering tests. llvm-svn: 223804
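The broadcast semantics being lowered, modeled in scalar form (illustrative only):

    #include <array>

    // VBROADCASTSS-style behaviour: replicate one scalar into every lane
    // (a 512-bit register holds 16 floats).
    std::array<float, 16> broadcast_model(float x) {
      std::array<float, 16> lanes;
      lanes.fill(x);
      return lanes;
    }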
* IR: Split Metadata from ValueDuncan P. N. Exon Smith2014-12-093-22/+26
| | | | Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API.
      I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself.
      This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code:
      - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do *not* have a `Type`.
      - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).
      - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`.
      - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved, RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non-self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive.
      - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was:
            MDNode *N = foo();
            bar(isa<ConstantInt>(N->getOperand(0)));
            baz(cast<ConstantInt>(N->getOperand(1)));
            bak(cast_or_null<ConstantInt>(N->getOperand(2)));
            bat(dyn_cast<ConstantInt>(N->getOperand(3)));
            bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));
        you can trivially match its semantics with:
            MDNode *N = foo();
            bar(mdconst::hasa<ConstantInt>(N->getOperand(0)));
            baz(mdconst::extract<ConstantInt>(N->getOperand(1)));
            bak(mdconst::extract_or_null<ConstantInt>(N->getOperand(2)));
            bat(mdconst::dyn_extract<ConstantInt>(N->getOperand(3)));
            bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));
        and when you transition your metadata schema to `MDInt`:
            MDNode *N = foo();
            bar(isa<MDInt>(N->getOperand(0)));
            baz(cast<MDInt>(N->getOperand(1)));
            bak(cast_or_null<MDInt>(N->getOperand(2)));
            bat(dyn_cast<MDInt>(N->getOperand(3)));
            bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));
      - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the *only* class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.)
      llvm-svn: 223802
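A minimal sketch of the bridged API described in the guide above (assuming the post-split headers; the helper names here are illustrative), wrapping a Constant for use as an MDNode operand and extracting it again via the mdconst helpers:

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Metadata.h"
    using namespace llvm;

    MDNode *wrapConstant(LLVMContext &Ctx, ConstantInt *CI) {
      // Bridge the Constant into the Metadata hierarchy before attaching it.
      Metadata *Ops[] = {ConstantAsMetadata::get(CI)};
      return MDNode::get(Ctx, Ops);
    }

    ConstantInt *unwrapConstant(MDNode *N) {
      // The transitional helper performs the three-step extraction in one call.
      return mdconst::extract<ConstantInt>(N->getOperand(0));
    }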
* [Hexagon] Updating predicate register transfers and adding tstbit to allow select selection.Colin LeMahieu2014-12-095-27/+74
| | | | | | Updating ll tests with predicate transfers that previously had nop encodings. llvm-svn: 223800
* [PowerPC 4/4] Enable little-endian support for VSX.Bill Schmidt2014-12-091-7/+0
| | | | | | | | With the foregoing three patches, VSX instructions can be used for little endian. This patch removes the restriction that prevented this, and re-enables the test cases from the first three patches. llvm-svn: 223792
* [PowerPC 3/4] Little-endian adjustments for VSX vector shuffleBill Schmidt2014-12-091-0/+9
| | | | | | | | | | | | | | | | When performing instruction selection for ISD::VECTOR_SHUFFLE, there is special code for handling v2f64 and v2i64 using VSX instructions. This code must be adjusted for little-endian. Because the two inputs are treated as a double-wide register, we must swap their order for little endian. To get the appropriate mask elements to use with the big-endian biased XXPERMDI instruction, we must reverse their order and invert the bits. A new test is added to test the 16 possible values of the shuffle mask. It is initially disabled for reasons specified in the test. It is re-enabled by patch 4/4. llvm-svn: 223791
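A sketch of the mask adjustment being described (the 2-bit selector encoding here is assumed for illustration and not taken from the patch): reverse the order of the two mask bits and then invert them.

    // Hypothetical helper: adjust a big-endian-oriented 2-bit XXPERMDI
    // selector for little endian, per the description above.
    unsigned adjust_xxpermdi_mask_for_le(unsigned be_mask) {
      unsigned reversed = ((be_mask & 1u) << 1) | ((be_mask >> 1) & 1u); // swap bit order
      return reversed ^ 3u;                                              // invert both bits
    }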
* [PowerPC 2/4] Little-endian adjustments for VSX insert/extract operationsBill Schmidt2014-12-091-0/+14
| | | | | | | | | | | | | | | | | | | For little endian, we need to make some straightforward adjustments in the code expansions for scalar_to_vector and vector_extract of v2f64. First, scalar_to_vector must place the scalar into vector element zero. However, our implementation of SUBREG_TO_REG will place it into big-endian element zero (high-order bits), and for little endian we need it in the low-order bits. The LE implementation splats the high-order doubleword into the low-order doubleword. Second, the meaning of (vector_extract x, 0) and (vector_extract x, 1) must be reversed for similar reasons. A new test is added that tests code generation for insertelement and extractelement for both element 0 and element 1. It is disabled in this patch but enabled in patch 4/4, for reasons stated in the test. llvm-svn: 223788
* [AVX512] Added VPBROADCAST{BWDQ} (Load with Broadcast Integer Data from General Purpose Register) encodings for AVX512-BW/VL subsetsRobert Khasanov2014-12-091-23/+33
| | | | | | Added encoding tests. llvm-svn: 223787
* [PowerPC 1/4] Little-endian adjustments for VSX loads/storesBill Schmidt2014-12-093-2/+202
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses the inherent big-endian bias in the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. These instructions load vector elements into registers left-to-right (with the first element loaded into the high-order bits of the register), regardless of the endian setting of the processor. However, these are the only vector memory instructions that permit unaligned storage accesses, so we want to use them for little-endian. To make this work, a lxvd2x or lxvw4x is replaced with an lxvd2x followed by an xxswapd, which swaps the doublewords. This works for lxvw4x as well as lxvd2x, because for lxvw4x on an LE system the vector elements are in LE order (right-to-left) within each doubleword. (Thus after lxvw4x of a <4 x float> the elements will appear as 1, 0, 3, 2. Following the swap, they will appear as 3, 2, 1, 0, as desired.) For stores, an stxvd2x or stxvw4x is replaced with an stxvd2x preceded by an xxswapd. Introduction of extra swap instructions provides correctness, but obviously is not ideal from a performance perspective. Future patches will address this with optimizations to remove most of the introduced swaps, which have proven effective in other implementations. The introduction of the swaps is performed during lowering of LOAD, STORE, INTRINSIC_W_CHAIN, and INTRINSIC_VOID operations. The latter are used to translate intrinsics that specify the VSX loads and stores directly into equivalent sequences for little endian. Thus code that uses vec_vsx_ld and vec_vsx_st does not have to be modified to be ported from BE to LE. We introduce new PPCISD opcodes for LXVD2X, STXVD2X, and XXSWAPD for use during this lowering step. In PPCInstrVSX.td, we add new SDType and SDNode definitions for these (PPClxvd2x, PPCstxvd2x, PPCxxswapd). These are recognized during instruction selection and mapped to the correct instructions. Several tests that were written to use -mcpu=pwr7 or pwr8 are modified to disable VSX on LE variants because code generation changes with this and subsequent patches in this set. I chose to include all of these in the first patch rather than try to rigorously sort out which tests were broken by one or another of the patches. Sorry about that. The new test vsx-ldst-builtin-le.ll, and the changes to vsx-ldst.ll, are disabled until LE support is enabled because of breakages that occur as noted in those tests. They are re-enabled in patch 4/4. llvm-svn: 223783
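The effect of the added xxswapd, modeled on a <4 x float> held as two 64-bit halves (a scalar sketch only, not backend code):

    #include <array>

    // Swap the two doublewords; each doubleword carries two float lanes, so
    // lanes ordered {1,0,3,2} after the load become {3,2,1,0} after the swap.
    std::array<float, 4> xxswapd_model(const std::array<float, 4> &v) {
      return {v[2], v[3], v[0], v[1]};
    }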
* [x86] Fix the test to actually test things for the CPU names, add theChandler Carruth2014-12-091-0/+4
| | | | | | | | | | | | | missing barcelona CPU which that test uncovered, and remove the 32-bit x86 CPUs which I really wasn't prepared to audit and test thoroughly. If anyone wants to clean up the 32-bit only x86 CPUs, go for it. Also, if anyone else wants to try to de-duplicate the AMD CPUs, that'd be cool, but from the looks of it wouldn't save as much as it did for the Intel CPUs. llvm-svn: 223774
* Removing an unused variable to silence a -Wunused-but-set-variable warning. NFC.Aaron Ballman2014-12-091-2/+0
| | | | llvm-svn: 223773
* Fix modified immediate bug reported by MC Hammer.Asiri Rathnayake2014-12-091-11/+6
| | | | | | | | | | | Instructions of the form [ADD Rd, pc, #imm] are manually aliased in processInstruction() to use ADR. To accommodate this, mod_imm handling had to be tweaked a bit. Turns out it was the manual aliasing that must be tweaked to accommodate mod_imms instead. More information about the parsed instruction is available at the point where processInstruction() is invoked, which makes it easier to detect a mod_imm at that point rather than trying to detect a potential alias when a mod_imm is being prepped. Added a test case and fixed some whitespace as well. llvm-svn: 223772
* [x86] Bring some sanity to the x86 CPU processor definitions.Chandler Carruth2014-12-091-61/+139
| | | | | | | | | | | | | | | | | | Notably, this adds simple micro-architecture names for the Intel CPU variants, and defines the old 'core'-based names as aliases. GCC has started to simplify their documented interface to use these names as well, so it seems like we can start to converge on a consistent pattern. I'd appreciate Intel double checking the entries that aren't yet documented widely, especially Atom (Bonnell and Silvermont), Knights Landing, and Skylake. But this change shouldn't break any existing users. Also, ran clang-format to re-format this code and it actually worked (modulo a tiny bug) so hopefully we can start to stop thinking about formatting this stuff. llvm-svn: 223769
* AVX-512: Added some comments to ERI scalar intrinsics.Elena Demikhovsky2014-12-092-6/+17
| | | | | | No functional change. llvm-svn: 223761
* test commit (spelling correction)Mohit K. Bhakkad2014-12-091-1/+1
| | | | llvm-svn: 223758
* [X86] Convert esp-relative movs of function arguments into pushes, step 1Michael Kuperstein2014-12-092-4/+125
| | | | | | | | | | | This handles the simplest case for mov -> push conversion: 1. x86-32 calling convention, everything is passed through the stack. 2. There is no reserved call frame. 3. Only registers or immediates are pushed, no attempt to combine a mem-reg-mem sequence into a single PUSHmm. Differential Revision: http://reviews.llvm.org/D6503 llvm-svn: 223757
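A call site of the shape this targets (illustrative only): a 32-bit cdecl-style call where every argument is passed on the stack, so the frame-setup stores of immediates can become pushes.

    extern "C" void callee(int a, int b, int c);

    void caller() {
      // Previously lowered as movl $imm, offset(%esp) stores into the outgoing
      // argument area; with this change they are candidates for pushl $imm.
      callee(1, 2, 3);
    }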
* Restore r223709 as it was meant to be, and enable FeatureP8Vector for P8Bill Schmidt2014-12-091-2/+2
| | | | llvm-svn: 223751