bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Fix a hidden use of grabbing the Mangler from the AsmPrinter and update	Eric Christopher	2016-09-16	1	-4/+4
	accordingly. llvm-svn: 281748
*	Place the lowered phi instruction(s) before the DEBUG_VALUE entry	Keith Walker	2016-09-16	1	-1/+1
	When a phi node is finally lowered to a machine instruction it is important that the lowered "load" instruction is placed before the associated DEBUG_VALUE entry describing the value loaded. Renamed the existing SkipPHIsAndLabels to SkipPHIsLabelsAndDebug to more fully describe that it also skips debug entries. Then used the "new" function SkipPHIsAndLabels when the debug information should not be skipped when placing the lowered "load" instructions so that it is placed before the debug entries. Differential Revision: https://reviews.llvm.org/D23760 llvm-svn: 281727
*	Move the Mangler from the AsmPrinter down to TLOF and clean up the	Eric Christopher	2016-09-16	2	-5/+2
	TLOF API accordingly. llvm-svn: 281708
*	Finish renaming remaining analyzeBranch functions	Matt Arsenault	2016-09-14	2	-4/+4
	llvm-svn: 281535
*	Make analyzeBranch family of instruction names consistent	Matt Arsenault	2016-09-14	2	-3/+3
	analyzeBranch was renamed to use lowercase first; rename the related set to match. llvm-svn: 281506
*	AArch64: Use TTI branch functions in branch relaxation	Matt Arsenault	2016-09-14	2	-4/+11
	The main change is to return the code size from InsertBranch/RemoveBranch. Patch mostly by Tim Northover llvm-svn: 281505
*	getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-1/+1
	llvm-svn: 281495
*	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	2	-6/+6
	llvm-svn: 281493
*	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-5/+3
	llvm-svn: 281490
*	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI	Sanjay Patel	2016-09-14	1	-1/+1
	llvm-svn: 281489
*	Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189)	Nemanja Ivanovic	2016-09-14	2	-2/+16
	This patch corresponds to review: https://reviews.llvm.org/D24021 In the initial implementation of this instruction, I forgot to account for variable indices. This patch fixes PR30189 and should probably be merged into 3.9.1 (I'll open a bug according to the new instructions). llvm-svn: 281479
*	Adding missing directive for Power9.	Nemanja Ivanovic	2016-09-14	1	-1/+1
	There is currently no codegen for Power9 that depends on the directive so this is NFC for now but will be important in the future. This was missed in r268950 so I'm adding it now. llvm-svn: 281473
*	[CodeGen] Split out the notions of MI invariance and MI dereferenceability.	Justin Lebar	2016-09-11	2	-11/+19
	Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151
*	[PowerPC] Fix address-offset folding for plain addi	Hal Finkel	2016-09-07	1	-15/+38
	When folding an addi into a memory access that can take an immediate offset, we were implicitly assuming that the existing offset was zero. This was incorrect. If we're dealing with an addi with a plain constant, we can add it to the existing offset (assuming that doesn't overflow the immediate, etc.), but if we have anything else (i.e. something that will become a relocation expression), we'll go back to requiring the existing immediate offset to be zero (because we don't know what the requirements on that relocation expression might be - e.g. maybe it is paired with some addis in some relevant way). On the other hand, when dealing with a plain addi with a regular constant immediate, the alignment restrictions (from the TOC base pointer, etc.) are irrelevant. I've added the test case from PR30280, which demonstrated the bug, but also demonstrates a missed optimization opportunity (i.e. we don't need the memory accesses at all). Fixes PR30280. llvm-svn: 280789
*	[PPC] Claim stack frame before storing into it, if no red zone is present	Krzysztof Parzyszek	2016-09-06	1	-25/+91
	Unlike PPC64, PPC32/SVR4 does not have a red zone. In the absence of it there is no guarantee that this part of the stack will not be modified by any interrupt. To avoid this, make sure to claim the stack frame first before storing into it. This fixes https://llvm.org/bugs/show_bug.cgi?id=26519. Differential Revision: https://reviews.llvm.org/D24093 llvm-svn: 280705
*	[PowerPC] During branch relaxation, recompute padding offsets before each iteration	Hal Finkel	2016-09-04	1	-7/+39
	We used to compute the padding contributions to the block sizes during branch relaxation only at the start of the transformation. As we perform branch relaxation, we change the sizes of the blocks, and so the amount of inter-block padding might change. Accordingly, we need to recompute the (alignment-based) padding in between every iteration on our way toward the fixed point. Unfortunately, I don't have a test case (and none was provided in the bug report), and while this obviously seems needed, algorithmically, I don't have any way of generating a small and/or non-fragile regression test. llvm-svn: 280626
*	[PowerPC] Zero-extend constants in FastISel	Hal Finkel	2016-09-04	1	-1/+6
	As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614
*	[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics	Hal Finkel	2016-09-03	5	-0/+70
	PowerPC assembly code in the wild, so it seems, has things like this: bc+ 12, 28, .L9 This is a bit odd because the '+' here becomes part of the BO field, and the BO field is otherwise the first operand. Nevertheless, the ISA specification does clearly say that the +- hint syntax applies to all conditional-branch mnemonics (that test either CTR or a condition register, although not the forms which check both), both basic and extended, so this is supposed to be valid. This introduces some asm-parser-only definitions which take only the upper three bits from the specified BO value, and the lower two bits are implied by the +- suffix (via some associated aliases). Fixes PR23646. llvm-svn: 280571
*	[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev	Hal Finkel	2016-09-02	3	-0/+26
	These few book-III instructions are used by the Linux kernel. Partially fixes PR24796. llvm-svn: 280560
*	[PowerPC] Add support for the extended dcbf form and mnemonics	Hal Finkel	2016-09-02	3	-3/+46
	dcbf has an optional hint-like field; add support for the extended form and the associated mnemonics (dcbfl and dcbflp). Partially fixes PR24796. llvm-svn: 280559
*	[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha	Hal Finkel	2016-09-02	2	-2/+34
	When we have an offset into a global, etc. that is accessed relative to the TOC base pointer, and the offset is larger than the minimum alignment of the global itself and the TOC base pointer (which is 8-byte aligned), we can still fold the @toc@ha into the memory access, but we must update the addis instruction's symbol reference with the offset as the symbol addend. When there is only one use of the addi to be folded and only one use of the addis that would need its symbol's offset adjusted, then we can make the adjustment and fold the @toc@l into the memory access. llvm-svn: 280545
*	[PowerPC] hasAndNotCompare should return true	Hal Finkel	2016-09-02	1	-0/+4
	As Sanjay suggested when he added the hook, PPC should return true from hasAndNotCompare. We have an efficient negated 'and' on PPC (which can feed a compare). Fixes PR27203. llvm-svn: 280457
*	[PowerPC] Add a pattern for a runtime bit check	Hal Finkel	2016-09-02	1	-0/+40
	Following a suggestion by Sanjay, we should lower:
	  %shl = shl i32 1, %y
	  %and = and i32 %x, %shl
	  %cmp = icmp eq i32 %and, %shl
	  ret i1 %cmp
	into:
	  subfic r4, r4, 32
	  rlwnm r3, r3, r4, 31, 31
	Add this pattern and some associated patterns for the 64-bit case and the not-equal case. Fixes PR27356. llvm-svn: 280454
*	[PowerPC] Don't apply the PPC64 address-formation peephole for offsets greater than 7	Hal Finkel	2016-09-02	1	-2/+7
	When applying our address-formation PPC64 peephole, we are reusing the @ha TOC addis value with the low parts associated with different offsets (i.e. different effective symbol addends). We were assuming this was okay so long as the offsets were less than the alignment of the global variable being accessed. This ignored the fact, however, that the TOC base pointer itself need only be 8-byte aligned. As a result, what we were doing is legal only for offsets less than 8 regardless of the alignment of the object being accessed. Fixes PR28727. llvm-svn: 280441
*	[PowerPC] Don't consider fusion in PPC64 address-formation peephole	Hal Finkel	2016-09-02	1	-7/+0
	The logic in this function assumes that the P8 supports fusion of addis/addi, but it does not. As a result, there is no advantage to restricting our peephole application, merging addi instructions into dependent memory accesses, even when the addi has multiple users, regardless of whether or not we're optimizing for size. We might need something like this again for the P9; I suspect we'll revisit this code when we work on P9 tuning. llvm-svn: 280440
*	Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC	Hal Finkel	2016-09-01	2	-0/+17
	LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics; we could do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but it is better to directly lower the CFA construct itself, since it can be easily represented as a fixed-offset FrameIndex). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces an ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and PowerPC is updated to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 llvm-svn: 280350
*	[PowerPC] Don't spill the frame pointer twice	Hal Finkel	2016-08-31	1	-0/+11
	When a function contains something, such as inline asm, which explicitly clobbers the register used as the frame pointer, don't spill it twice. If we need a frame pointer, it will be saved/restored in the prologue/epilogue code. Explicitly spilling it again will reuse the same spill slot used by the prologue/epilogue code, thus clobbering the saved value. The same applies to the base-pointer or PIC-base register. Partially fixes PR26856. Thanks to Ulrich for his analysis and the small inline-asm reproducer. llvm-svn: 280188
*	[PowerPC] Force entry alignment in .got2	Hal Finkel	2016-08-30	1	-2/+4
	Implement Bill's suggested fix for 32-bit targets for PR22711 (for the alignment of each entry). As pointed out in the bug report, we could just force the section alignment, since we only add pointer-sized things currently, but this fix is somewhat more future-proof. llvm-svn: 280049
*	[PowerPC] Add support for -mlongcall	Hal Finkel	2016-08-30	4	-1/+15
	The "long call" option forces the use of the indirect calling sequence for all calls (even those that don't really need it). GCC provides this option; it is helpful, under certain circumstances, for building very large binaries, and for some other specialized use cases. Fixes PR19098. llvm-svn: 280040
*	[PowerPC] Fix i8/i16 atomics for little-Endian targets without partword atomics	Hal Finkel	2016-08-29	1	-6/+12
	For little-Endian PowerPC, we generally target only P8 and later by default. However, generic (older) 64-bit configurations are still an option, and in that case, partword atomics are not available (e.g. stbcx.). To lower i8/i16 atomics without true i8/i16 atomic operations, we emulate using i32 atomics in combination with a bunch of shifting and masking, etc. The amount by which to shift in little-Endian mode is different from the amount in big-Endian mode (it is inverted -- meaning we can leave off the xor when computing the amount). Fixes PR22923. llvm-svn: 280022
*	[PowerPC] Implement lowering for atomicrmw min/max/umin/umax	Hal Finkel	2016-08-28	4	-5/+152
	Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818. llvm-svn: 279933
*	MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it	Matthias Braun	2016-08-25	2	-2/+2
	Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register allocation and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files. Differential Revision: http://reviews.llvm.org/D23850 llvm-svn: 279698
*	[stackmaps] More extraction of common code [NFCI]	Philip Reames	2016-08-23	2	-6/+6
	General cleanup before starting to work on the part I want to actually change. llvm-svn: 279586
*	Reformat.	NAKAMURA Takumi	2016-08-22	1	-20/+20
	llvm-svn: 279409
*	Untabify.	NAKAMURA Takumi	2016-08-22	1	-2/+2
	llvm-svn: 279408
*	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround	Michael Kuperstein	2016-08-18	3	-13/+13
	The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129
*	Replace "fallthrough" comments with LLVM_FALLTHROUGH	Justin Bogner	2016-08-17	4	-7/+10
	This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902
*	[ppc64] Don't apply sibling call optimization if callee has any byval arg	Chuang-Yu Cheng	2016-08-17	1	-1/+8
	This is a quick workaround, because in some cases, e.g. when the caller's stack size > callee's stack size, we are still able to apply sibling call optimization even if the callee has a byval arg. This patch fixes: https://llvm.org/bugs/show_bug.cgi?id=28328 Reviewers: hfinkel kbarton nemanjai amehsan Subscribers: hans, tjablin https://reviews.llvm.org/D23441 llvm-svn: 278900
*	[x86] Refactor a PowerPC specific ctlz/srl transformation (NFC).	Pierre Gousseau	2016-08-16	2	-13/+7
	Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets. Differential Revision: https://reviews.llvm.org/D23445 llvm-svn: 278799
*	[PPC] Memoize getValueBits. NFC.	Tim Shen	2016-08-12	1	-35/+49
	Summary: It triggers exponential behavior when the DAG has many branches. Reviewers: hfinkel, kbarton Subscribers: iteratee, nemanjai, echristo Differential Revision: https://reviews.llvm.org/D23428 llvm-svn: 278548
*	Use the range variant of remove_if instead of unpacking begin/end	David Majnemer	2016-08-12	1	-2/+1
	No functionality change is intended. llvm-svn: 278475
*	Use range algorithms instead of unpacking begin/end	David Majnemer	2016-08-11	2	-7/+4
	No functionality change is intended. llvm-svn: 278417
*	[PowerPC] Wrong fast-isel codegen for VSX floating-point loads	Ulrich Weigand	2016-08-05	1	-12/+24
	There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled. This can cause invalid registers to be used in certain cases, like: lfd 36, ... instead of using a VSX load instruction. The wrong register number gets silently truncated, causing invalid code to be generated.
	The first place is PPCFastISel::PPCEmitLoad, which had multiple problems:
	1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they are computed from resultReg, which is still zero at this point in many cases. Fixed by changing the helper routines to operate on a register class instead of a register and passing in UseRC.
	2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo:
	  bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS;
	  bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD;
	The second line needs to use IsVSFRC (like PPCEmitStore does).
	3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong. Fixed by copying the code handling the same issue in PPCEmitStore.
	The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class. Fixed by hardcoding a F8RC class even on systems supporting VSX.
	Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630 Differential Revision: https://reviews.llvm.org/D22632 llvm-svn: 277823
*	[PowerPC] fix passing long double arguments to function (soft-float)	Strahinja Petrovic	2016-08-05	3	-0/+39
	This patch fixes passing long double type arguments to a function in soft-float mode. If there are fewer than 4 argument registers free (a long double is mapped to 4 GPRs in soft-float mode), the long double argument must be passed on the stack. Differential Revision: https://reviews.llvm.org/D20114. llvm-svn: 277804
*	[PPC] Handling CallInst in PPCBoolRetToInt	Guozhi Wei	2016-08-03	1	-7/+13
	This patch fixes PR25548. The current implementation of PPCBoolRetToInt doesn't handle CallInst correctly, so it failed to do the intended optimization when there is a CallInst with parameters. This patch fixes that. llvm-svn: 277655
*	TargetInstrInfo: add virtual function getInstSizeInBytes	Sjoerd Meijer	2016-07-29	1	-1/+1
	This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of subclasses already implement. Differential Revision: https://reviews.llvm.org/D22885 llvm-svn: 277126
*	MachineFunction: Return reference for getFrameInfo(); NFC	Matthias Braun	2016-07-28	4	-129/+128
	getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
*	TargetInstrInfo: rename GetInstSizeInBytes to getInstSizeInBytes. NFC	Sjoerd Meijer	2016-07-28	4	-5/+5
	Differential Revision: https://reviews.llvm.org/D22925 llvm-svn: 276997
*	[PowerPC] Fix typo in PPCHazardRecognizers.cpp	Nemanja Ivanovic	2016-07-27	1	-1/+1
	Fixes PR28731. llvm-svn: 276865
*	PowerPC: Avoid implicit iterator conversions, NFC	Duncan P. N. Exon Smith	2016-07-27	7	-174/+170
	Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr* in the PowerPC backend, mainly by preferring MachineInstr& over MachineInstr* when a pointer isn't nullable and using range-based for loops. There was one piece of questionable code in PPCInstrInfo::AnalyzeBranch, where a condition checked a pointer converted from an iterator for nullptr. Since this case is impossible (moreover, the code above guarantees that the iterator is valid), I removed the check when I changed the pointer to a reference. Despite that case, there should be no functionality change here. llvm-svn: 276864
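
Note on the "[PowerPC] Add a pattern for a runtime bit check" entry (llvm-svn: 280454): the rotate-and-mask lowering it describes can be checked with a small C++ model. This is only a sketch and is not code from the commit. It assumes the bit index y is in the range 0..31, and it models rlwnm as a 32-bit rotate-left by the low five bits of the shift register followed by a mask that keeps only the least-significant bit. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Source-level form of the check: (x & (1 << y)) == (1 << y).
bool bit_check_ir(std::uint32_t x, std::uint32_t y) {
    assert(y < 32 && "sketch assumes an in-range bit index");
    std::uint32_t shl = 1u << y;
    return (x & shl) == shl;
}

// 32-bit rotate left; only the low five bits of n are used,
// which is how this sketch models the rotate amount of rlwnm.
std::uint32_t rotl32(std::uint32_t v, std::uint32_t n) {
    n &= 31u;
    return n == 0 ? v : (v << n) | (v >> (32u - n));
}

// Model of the two-instruction sequence from the commit message.
bool bit_check_lowered(std::uint32_t x, std::uint32_t y) {
    assert(y < 32 && "sketch assumes an in-range bit index");
    std::uint32_t amount = 32u - y;          // subfic r4, r4, 32
    return (rotl32(x, amount) & 1u) != 0;    // rlwnm r3, r3, r4, 31, 31
}

// Compares both forms over a few sample values and every bit index.
bool forms_agree() {
    const std::uint32_t samples[] = {0u, 1u, 0x20u, 0x80000000u,
                                     0xDEADBEEFu, 0xFFFFFFFFu};
    for (std::uint32_t x : samples)
        for (std::uint32_t y = 0; y < 32; ++y)
            if (bit_check_ir(x, y) != bit_check_lowered(x, y))
                return false;
    return true;
}
```

Rotating left by 32 - y is the same as rotating right by y, so bit y of x lands in the least-significant position, and the mask then extracts it as the boolean result.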