bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[x86][AVX512] add Multiply High Op	Asaf Badouh	2015-07-05	4	-0/+297
\| \| \| \| \| \| \| \| \|	include encoding and intrinsics tests. review http://reviews.llvm.org/D10896 llvm-svn: 241406
*	[X86] Fix incorrect/inefficient pushw encodings for x86-64 targets	Michael Kuperstein	2015-07-05	3	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	Correctly support assembling "pushw $imm8" on x86-64 targets. Also some cleanup of the PUSH instructions (PUSH64i16 and PUSHi16 actually represent the same instruction) This fixes PR23996 Patch by: david.l.kreitzer@intel.com Differential Revision: http://reviews.llvm.org/D10878 llvm-svn: 241404
*	Add missing builtins to the PPC back end for ABI compliance (vol. 2)	Nemanja Ivanovic	2015-07-05	1	-0/+31
\| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D10874 Back end portion of the second round of additions to altivec.h. llvm-svn: 241398
*	[X86][SSE] Improved i8/i16 to f64 uint2fp vector conversions	Simon Pilgrim	2015-07-04	1	-110/+28
\| \| \| \| \| \|	Followup to D10433 and D10589 that fixes i8/i16 uint2fp vector conversions by zero extending to i32 and using the sint2fp path (unless the target does actually support uint2fp). llvm-svn: 241394
*	[RuntimeDyld] Skip relocations for external symbols with 64-bit address ~0ULL.	Lang Hames	2015-07-04	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \|	Requested by Eugene Rozenfeld of the LLILC team, this feature allows JIT clients to skip relocations for selected external symbols by returning ~0ULL from their symbol resolver. If this value is returned for a given symbol, RuntimeDyld will skip all relocations for that symbol. The client will be responsible for applying the skipped relocations manually before the code is executed. llvm-svn: 241383
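The skip mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not RuntimeDyld's code: `apply_relocations`, `example_resolver`, and the `(symbol, offset)` relocation tuples are hypothetical names invented for the sketch. Only the `~0ULL` sentinel comes from the commit message.

```python
# Hypothetical sketch: none of these names are RuntimeDyld APIs.
SKIP_SENTINEL = 0xFFFFFFFFFFFFFFFF  # ~0ULL as a 64-bit value


def apply_relocations(relocations, resolve_symbol):
    """Split relocations into applied and skipped lists.

    relocations: iterable of (symbol, offset) pairs (hypothetical shape).
    resolve_symbol: client callback returning a 64-bit address.
    """
    applied, skipped = [], []
    for symbol, offset in relocations:
        address = resolve_symbol(symbol)
        if address == SKIP_SENTINEL:
            # The resolver asked to skip: the client must patch this later.
            skipped.append((symbol, offset))
        else:
            applied.append((symbol, offset, address))
    return applied, skipped


def example_resolver(symbol):
    # Hypothetical client resolver: unknown symbols are deferred.
    table = {"puts": 0x1000}
    return table.get(symbol, SKIP_SENTINEL)


applied, skipped = apply_relocations(
    [("puts", 0), ("late_bound", 8), ("puts", 16)], example_resolver
)
```

The skipped list is what the client would have to apply manually before running the code.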
*	[X86] Add proper 64-bit mode checks to jrcxz and jcxz.	Craig Topper	2015-07-04	1	-0/+6
\| \| \| \|	llvm-svn: 241381
*	[ELFYAML] Fix handling SHT_NOBITS sections by obj2yaml/yaml2obj tools	Simon Atanasyan	2015-07-03	2	-3/+2
\| \| \| \| \| \| \| \| \| \|	SHT_NOBITS sections do not have content in an object file. Now the yaml2obj tool does not accept `Content` field for such sections, and the obj2yaml tool does not attempt to read the section content from a file. Restore r241350 and r241352. llvm-svn: 241377
*	[X86] Added 32-bit builds to fp<->int tests.	Simon Pilgrim	2015-07-03	2	-3/+13
\| \| \| \| \| \|	Ensure that i686 x87/SSE/SSE2 targets all build. llvm-svn: 241368
*	This reverts commit r241350 and r241352.	Rafael Espindola	2015-07-03	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \|	r241350 broke lld tests. r241352 depends on r241350. Original messages: "[ELFYAML] Fix handling SHT_NOBITS sections by obj2yaml/yaml2obj tools" "[ELFYAML] Make the Size field for .bss section optional" llvm-svn: 241354
*	[ELFYAML] Make the Size field for .bss section optional	Simon Atanasyan	2015-07-03	1	-1/+0
\| \| \| \| \| \|	It's a common case to have a zero-size .bss section in an object file. llvm-svn: 241352
*	[ELFYAML] Fix handling SHT_NOBITS sections by obj2yaml/yaml2obj tools	Simon Atanasyan	2015-07-03	2	-3/+3
\| \| \| \| \| \| \| \|	SHT_NOBITS sections do not have content in an object file. Now yaml2obj tool does not accept `Content` field for such sections, and obj2yaml tool does not attempt to read the section content from a file. llvm-svn: 241350
*	llvm/test/CodeGen/ARM/fnattr-trap.ll: Add -mtriple, to appease targeting *-win32.	NAKAMURA Takumi	2015-07-03	1	-2/+2
\| \| \| \| \| \| \| \|	LLVM ERROR: CPU: 'generic' does not support ARM mode execution! llvm-svn: 241329
*	whitespace tidyup. NFC.	Simon Pilgrim	2015-07-03	1	-28/+28
\| \| \| \|	llvm-svn: 241326
*	[X86][SSE] Sign extension for target vector sizes less than 128 bits (pt2)	Simon Pilgrim	2015-07-03	2	-42/+10
\| \| \| \| \| \| \| \|	Add support for v2i8/v2i16 to v2f64 by using a sign extension to v2i32 before conversion to v2f64. Differential Revision: http://reviews.llvm.org/D10589 llvm-svn: 241325
*	[X86][SSE] Sign extension for target vector sizes less than 128 bits (pt1)	Simon Pilgrim	2015-07-03	1	-0/+40
\| \| \| \| \| \| \| \|	This patch adds support for sign extension for sub 128-bit vectors, such as to v2i32. It concatenates with UNDEF subvectors up to 128-bits, performs the sign extension (i.e. as v4i32) and then extracts the target subvector. Patch 1/2 of D10589 - the second patch covers the conversion of v2i8/v2i16 to v2f64. llvm-svn: 241323
*	Fix an overly aggressive assertion in getCopyFromPartsVector.	Nadav Rotem	2015-07-02	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \|	The assertion in getCopyFromPartsVector assumed that the vector 'part' must match the type of argument (arguments are potentially split into multiple parts). However, in some cases the targets return a 'part' of the right size but with a different type. We already handle this case correctly later on and generate a bitcast. This commit just makes sure that we are actually checking the property that we care about. llvm-svn: 241312
*	Use function attribute "trap-func-name" and remove TargetOptions::TrapFuncName.	Akira Hatanaka	2015-07-02	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit changes normal isel and fast isel to read the user-defined trap function name from function attribute "trap-func-name" attached to llvm.trap or llvm.debugtrap instead of from TargetOptions::TrapFuncName. This is needed to use clang's command line option "-ftrap-function" for LTO and enable changing the trap function name on a per-call-site basis. Out-of-tree projects currently using TargetOptions::TrapFuncName to specify the trap function name should attach attribute "trap-func-name" to the call sites of llvm.trap and llvm.debugtrap instead. rdar://problem/21225723 Differential Revision: http://reviews.llvm.org/D10832 llvm-svn: 241305
*	[PPC64LE] Remove implicit-subreg restriction from VSX swap removal	Bill Schmidt	2015-07-02	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In r241285, I removed the SUBREG_TO_REG restriction from VSX swap removal, determining that this was overly conservative. We have another form of the same restriction in that we check for the presence of implicit subregs in vector operations. As with SUBREG_TO_REG for partial register conversions, an implicit subreg is safe in and of itself, provided no other operation makes a lane-sensitive assumption about the result. This patch removes that restriction, by removing the HasImplicitSubreg flag and all code that relies on it. I've added a test case that fails to optimize before this patch is applied, and optimizes properly with the patch. Test based on a report from Anton Blanchard. llvm-svn: 241290
*	[PPC64LE] Teach swap optimization about the doubleword splat idiom	Bill Schmidt	2015-07-02	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With a previous patch, the VSX swap optimization is able to recognize the doubleword load-splat idiom that can be implemented using lxvdsx. However, that does not cover a doubleword splat where the source is a register. We can implement this using xxspltd (a special form of xxpermdi). This patch teaches the swap optimization pass about this idiom. As a prerequisite, it also permits swap optimization to succeed for all forms of SUBREG_TO_REG. Previously we were conservative and only allowed SUBREG_TO_REG when it copied a full register. However, on reflection any form of SUBREG_TO_REG is safe in and of itself, so long as an unsafe operation is not performed on its result. In particular, a widening SUBREG_TO_REG often occurs as an input to a doubleword splat idiom, particularly in auto-vectorized code. The doubleword splat idiom is an XXPERMDI operation where both source registers are identical, and the selection mask is either 0 (splat the first element) or 3 (splat the second element). To determine whether the registers are identical, we use the existing mechanism for looking through "copy-like" operations. That mechanism has a side effect of marking the XXPERMDI operation as using a physical register, which would invalidate its presence in a swap-optimized region. This is correct for the form of XXPERMDI that performs a swap and hence would be removed, but is not what we want for a doubleword-splat variety of XXPERMDI. Therefore we reset the physical-register flag on the XXPERMDI when it represents a splat. A simple test case is added to verify that we generate the splat and that we also remove the xxswapd instructions that would otherwise be associated with the load and store of another operand. llvm-svn: 241285
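The splat test described above (identical sources, selection mask 0 or 3) can be modelled in a few lines. This is a sketch under assumptions, not the pass's code: `xxpermdi` and `classify` are hypothetical helpers, registers are modelled as 2-tuples of doublewords, and the model assumes the high mask bit selects from the first source and the low bit from the second.

```python
def xxpermdi(a, b, mask):
    """Model a two-doubleword permute of sources a and b.

    Assumption: the high mask bit picks the element taken from a,
    the low mask bit picks the element taken from b.
    """
    return (a[(mask >> 1) & 1], b[mask & 1])


def classify(src_a, src_b, mask):
    """Classify a permute by source register names and mask (hypothetical)."""
    if src_a != src_b:
        return "other"
    if mask in (0, 3):
        return "splat"  # both result lanes hold the same doubleword
    if mask == 2:
        return "swap"   # the two doublewords trade places
    return "other"      # mask 1 on identical sources is a plain copy


v = (10, 20)
splat_first = xxpermdi(v, v, 0)
splat_second = xxpermdi(v, v, 3)
swapped = xxpermdi(v, v, 2)
```

A splat result is the same whichever lane order is used, which is why it can stay inside a swap-optimized region while a genuine swap cannot.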
*	Reworking the test part of r241149	Gabor Ballabas	2015-07-02	4	-0/+33
\| \| \| \| \| \| \| \| \|	The test part of r241149 was reverted in r241151 due to misplaced test cases. This patch splits those test cases among the appropriate targets. Differential Revision: http://reviews.llvm.org/D10897 llvm-svn: 241283
*	Fix for PR23310: llvm-dis crashes when trying to upgrade an intrinsic.	Rafael Espindola	2015-07-02	2	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When trying to upgrade @llvm.x86.sse2.psrl.dq while parsing a module, BitcodeReader adds the function to its worklist twice, resulting in a crash when accessing it the second time. This patch replaces the worklist vector by a map. Patch by Philip Pfaffe. llvm-svn: 241281
*	[X86] Convert an instruction relaxation test to use objdump instead of readobj	Michael Kuperstein	2015-07-02	1	-58/+64
\| \| \| \| \| \|	Patch by: david.l.kreitzer@intel.com llvm-svn: 241270
*	Improve error message.	Rafael Espindola	2015-07-02	1	-1/+1
\| \| \| \| \| \|	Thanks to Sean Silva for the suggestion. llvm-svn: 241255
*	Reapply r240291: Fix shl folding in DAG combiner.	Pawel Bylica	2015-07-02	1	-0/+9
\| \| \| \| \| \| \| \|	The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared. It has been reverted previously because of some problems with comparing APInt with raw uint64_t. That has been fixed/changed with r241204. llvm-svn: 241254
*	[LazyCallGraph] Port test case from r240039 to LCG.	Sanjoy Das	2015-07-02	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: r240039 adds a test case to check that CallGraph does the right thing with respect to non-leaf intrinsics like statepoint and patchpoint. This ports the same test case to LazyCallGraph. LazyCallGraph already does the right thing with respect to escaping function pointers so there is no need to change any code. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10582 llvm-svn: 241226
*	Make an X86 specific directory and put the recent X86 tti specific inlining test into it.	Eric Christopher	2015-07-02	2	-1/+4
\| \| \| \| \| \|	llvm-svn: 241223
*	Implement TargetTransformInfo::hasCompatibleFunctionAttributes for X86.	Eric Christopher	2015-07-02	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This checks subtarget feature compatibility for inlining by verifying that the callee is a strict subset of the caller's features. This includes the cpu as part of the subtarget we can get via the incoming functions as the backend takes CPUs as feature sets. This allows us to inline things like: int foo() { return baz(); } int __attribute__((target("sse4.2"))) bar() { return foo(); } so that generic code can be inlined into specialized functions. llvm-svn: 241221
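The compatibility rule above amounts to a set comparison. The sketch below is illustrative only: `can_inline` and the feature names are hypothetical, and it uses a plain (non-strict) subset test so that identical feature sets also count as compatible. That choice is an assumption, not something the commit message states.

```python
def can_inline(caller_features, callee_features):
    """Return True if every callee feature is also enabled in the caller.

    Hypothetical model: features are plain strings in a set.
    """
    return set(callee_features) <= set(caller_features)


generic = {"sse2"}
specialized = {"sse2", "sse4.2"}

generic_into_specialized = can_inline(specialized, generic)
specialized_into_generic = can_inline(generic, specialized)
```

This matches the example in the message: generic `foo` may be inlined into the `sse4.2` function `bar`, but not the other way around.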
*	[TwoAddressInstructionPass] Try 3 Addr Conversion After Commuting.	Quentin Colombet	2015-07-01	4	-10/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TwoAddressInstructionPass stops after a successful commuting but 3 Addr conversion might be good for some cases. Consider: int foo(int a, int b) { return a + b; } Before this commit, we emit: addl %esi, %edi movl %edi, %eax ret After this commit, we try 3 Addr conversion: leal (%rsi,%rdi), %eax ret Patch by Volkan Keles <vkeles@apple.com>! Differential Revision: http://reviews.llvm.org/D10851 llvm-svn: 241206
*	Test for specific output in lit test	Matthias Braun	2015-07-01	1	-1/+18
\| \| \| \|	llvm-svn: 241200
*	[LoopVectorize] Use ReplaceInstWithInst() helper where appropriate.	Alexey Samsonov	2015-07-01	1	-15/+30
\| \| \| \| \| \| \| \| \| \|	This is mostly NFC and improves code readability (instead of saving the old terminator, generating a new one in front of the old, and deleting the old, we just call a function). However, it also copies the debug location from the old instruction to the replacement, which would help PR23837. llvm-svn: 241197
*	[NVPTX] expand extload/truncstore for vectors of floats	Jingyue Wu	2015-07-01	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: According to PTX ISA: For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type. So, the ISA does not support extending loads or truncating stores for floating-point numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating-point numbers, but vectors of floating-point numbers are not taken care of. As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner into an extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like ld.v2.f32 {%fd3, %fd4}, [%rd2]. This patch adds the correct actions for vectors of floats, so DAGCombiner does not create extending loads, and correct code is generated. Patched by Gang Hu. Test Plan: Test case attached. Reviewers: jingyue Reviewed By: jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D10876 llvm-svn: 241191
*	[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass	Jingyue Wu	2015-07-01	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The offset of a frame index is calculated by NVPTXPrologEpilogPass. Before that pass runs, the correct offset of stack objects cannot be obtained, which leads to a wrong offset if there are more than 2 frame objects. This patch moves NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check the VRFrame register instead, and try to remove the VRFrame if it has no uses after the NVPTXPeephole pass. Patched by Xuetian Weng. Test Plan: Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the offset calculation based on SP and SPL. Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10853 llvm-svn: 241185
*	[PPC64LE] Enable missing lxvdsx optimization, and related swap optimization	Bill Schmidt	2015-07-01	1	-6/+288
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When adding little-endian vector support for PowerPC last year, I inadvertently disabled an optimization that recognizes a load-splat idiom and generates the lxvdsx instruction. This patch moves the offending logic so lxvdsx is once again generated. This pattern is frequently generated by the vectorizer for scalar loads of an effective constant. Previously the lxvdsx instruction was wrongly listed as lane-sensitive for the VSX swap optimization (since both doublewords are identical, swaps are safe). This patch fixes this as well, so that vectorized code using lxvdsx can now have swaps removed from the computation. There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll that checks for the missing optimization. However, vsx.ll was only being tested for POWER7 with big-endian code generation. I've added a little-endian RUN statement and expected LE code generation for all the tests in vsx.ll to give us a bit better VSX coverage, including what's needed for this patch. llvm-svn: 241183
*	add a cl::opt override for TargetLoweringBase's JumpIsExpensive	Sanjay Patel	2015-07-01	1	-11/+20
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is not intended to change existing codegen behavior for any target. It just exposes the JumpIsExpensive setting on the command-line to allow for easier testing and emergency overrides. Also, change the existing regression test to use FileCheck, explicitly specify the jump-is-expensive option, and use more precise checks. Differential Revision: http://reviews.llvm.org/D10846 llvm-svn: 241179
*	Revert "[DWARF] Fix debug info generation for function static variables, typedefs, and records"	David Blaikie	2015-07-01	1	-129/+0
\| \| \| \| \| \| \| \| \| \|	Caused PR24008. This reverts commit 37cb5f1c2db9f42d29f26b215585f56bb64ae4f5. llvm-svn: 241176
*	[SEH] Don't assert if the parent function lacks a personality	Reid Kleckner	2015-07-01	1	-0/+33
\| \| \| \| \| \| \| \|	The EH code might have been deleted as unreachable and the personality pruned while the filter is still present. Currently I'm hitting this at -O0 due to the clang bug PR24009. llvm-svn: 241170
*	[AArch64] Implement add/adds/sub/subs/cmp/cmn with negative immediate aliases	Arnaud A. de Grandmaison	2015-07-01	2	-5/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch teaches the AsmParser to accept add/adds/sub/subs/cmp/cmn with a negative immediate operand and convert them as shown: add Rd, Rn, -imm -> sub Rd, Rn, imm; sub Rd, Rn, -imm -> add Rd, Rn, imm; adds Rd, Rn, -imm -> subs Rd, Rn, imm; subs Rd, Rn, -imm -> adds Rd, Rn, imm; cmp Rn, -imm -> cmn Rn, imm; cmn Rn, -imm -> cmp Rn, imm. These forms are an alternate syntax available to assembly coders, and are needed to support code that already compiles with some other assemblers (gas). They are documented in the "ARMv8 Instruction Set Overview", in the "Arithmetic (immediate)" section. This makes llvm-mc a programmer-friendly assembler! This also fixes PR20978: "Assembly handling of adding negative numbers not as smart as gas". llvm-svn: 241166
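The alias table above can be expressed as a small lookup. This is a sketch rather than the AsmParser implementation: `NEGATED_ALIAS` and `canonicalize` are hypothetical names, operands are modelled as a tuple of register strings, and the range check for encodable immediates is deliberately left out.

```python
# Each mnemonic maps to the instruction that takes the negated immediate.
NEGATED_ALIAS = {
    "add": "sub", "sub": "add",
    "adds": "subs", "subs": "adds",
    "cmp": "cmn", "cmn": "cmp",
}


def canonicalize(mnemonic, registers, imm):
    """Rewrite an instruction with a negative immediate to its alias.

    Returns (mnemonic, registers, imm); other input passes through unchanged.
    """
    if imm < 0 and mnemonic in NEGATED_ALIAS:
        return NEGATED_ALIAS[mnemonic], registers, -imm
    return mnemonic, registers, imm


rewritten_add = canonicalize("add", ("x0", "x1"), -4)
rewritten_cmp = canonicalize("cmp", ("x2",), -1)
untouched = canonicalize("add", ("x0", "x1"), 4)
```

Because the table is its own inverse, applying the rewrite to an already positive immediate leaves the instruction unchanged.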
*	Test committed in r241153 is more target-specific than I thought.	Michael Kuperstein	2015-07-01	1	-1/+1
\| \| \| \| \| \|	Moving the (original, x86-only) test to the X86 directory. llvm-svn: 241162
*	AVX-512: Implemented missing encoding for FMA scalar instructions	Igor Breger	2015-07-01	2	-1/+1279
\| \| \| \| \| \| \| \|	Added tests for encoding Differential Revision: http://reviews.llvm.org/D10865 llvm-svn: 241159
*	Fix non-target-specific test not to use the x86 triple.	Michael Kuperstein	2015-07-01	1	-1/+1
\| \| \| \|	llvm-svn: 241158
*	Return ErrorOr from getSection.	Rafael Espindola	2015-07-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This also improves the logic of what is an error: * getSection(uint_32): only return an error if the index is out of bounds. The index 0 corresponds to a perfectly valid entry. * getSection(Elf_Sym): Returns null for symbols that normally don't have sections and error for out of bound indexes. In many places this just moves the report_fatal_error up the stack, but those can then be fixed in smaller patches. llvm-svn: 241156
*	[DWARF] Fix debug info generation for function static variables, typedefs, and records	Michael Kuperstein	2015-07-01	1	-0/+129
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Function static variables, typedefs and records (class, struct or union) declared inside a lexical scope were associated with the function as their parent scope, rather than the lexical scope they are defined or declared in. This fixes PR19238. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D9758 llvm-svn: 241153
*	[X86] Avoid over-relaxation of 8-bit immediates in integer arithmetic instructions.	Michael Kuperstein	2015-07-01	2	-0/+194
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Only consider an instruction a candidate for relaxation if the last operand of the instruction is an expression. We previously checked whether any operand is an expression, which is useless, since for all instructions concerned, the only operand that may be affected by relaxation is the last one. In addition, this removes the check for having RIP as an argument, since it was plain wrong - even when one of the arguments is RIP, relaxation may still be needed. This fixes PR9807. Patch by: david.l.kreitzer@intel.com Differential Revision: http://reviews.llvm.org/D10766 llvm-svn: 241152
*	Revert part of r241149, "Fix PR23872: Integrated assembler error message when using .type directive with @ in AArch32 assembly."	NAKAMURA Takumi	2015-07-01	1	-28/+0
\| \| \| \| \| \| \| \|	The test should be split among targets. llvm/test/MC/ELF/ is assumed to be X86. llvm-svn: 241151
*	[mips][microMIPS] Implement SLL and NOP instructions	Zoran Jovanovic	2015-07-01	2	-0/+6
\| \| \| \| \| \|	http://reviews.llvm.org/D10474 llvm-svn: 241150
*	Fix PR23872: Integrated assembler error message when using .type directive with @ in AArch32 assembly.	Gabor Ballabas	2015-07-01	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \|	The AArch32 assembler parses the '@' as a comment symbol, so the error message shouldn't suggest that '@<type>' is a valid replacement when assembling for an AArch32 target. Differential Revision: http://reviews.llvm.org/D10651 llvm-svn: 241149
*	[LoopUnroll] Use undef for phis with no value live	David Majnemer	2015-07-01	1	-0/+24
\| \| \| \| \| \| \| \|	We would create a phi node with a zero initialized operand instead of undef in the case where no value was originally available. This was problematic for x86_mmx which has no null value. llvm-svn: 241143
*	[SCCP] Turn loads of null into undef instead of zero initialized values	David Majnemer	2015-07-01	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	Surprisingly, this is a correctness issue: the mmx type exists for calling convention purposes, and LLVM doesn't have a zero representation for it. This partially fixes PR23999. llvm-svn: 241142
*	[NaryReassociate] enhances nsw by leveraging @llvm.assume	Jingyue Wu	2015-07-01	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: nsw are flaky and can often be removed by optimizations. This patch enhances nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now understands that assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE. Test Plan: nary-gep.ll Reviewers: broune, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10822 llvm-svn: 241139
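The implication above, assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b, can be checked exhaustively at a small bit width. The sketch below uses 8-bit two's-complement integers. The width and the helper names `wrap` and `assumptions_imply_nsw` are choices made for this illustration, not part of the patch.

```python
BITS = 8  # small width so that every (a, b) pair can be enumerated


def wrap(x):
    """Reduce x to a signed two's-complement value of BITS bits."""
    x &= (1 << BITS) - 1
    return x - (1 << BITS) if x >= 1 << (BITS - 1) else x


def assumptions_imply_nsw():
    """Check that a >= 0 and wrap(a + b) >= 0 rule out signed overflow."""
    lo, hi = -(1 << (BITS - 1)), 1 << (BITS - 1)
    for a in range(lo, hi):
        for b in range(lo, hi):
            if a >= 0 and wrap(a + b) >= 0:
                # No signed wrap means the wrapped sum equals the exact sum.
                if wrap(a + b) != a + b:
                    return False
    return True


holds = assumptions_imply_nsw()
```

When the wrapped sum equals the exact sum, sign-extending the sum gives the same value as adding the sign-extended operands, which is the property the split of sext(a + b) relies on.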
*	[SanitizerCoverage] Don't add instrumentation to unreachable blocks.	Alexey Samsonov	2015-06-30	1	-0/+9
\| \| \| \|	llvm-svn: 241127