bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[GVN] Make a test case more robust	Sanjoy Das	2015-10-28	1	-12/+12
\| \| \| \| \| \| \| \| \| \|	The singleton !range metadata gets simplified more aggressively after a later change, so change the !range metadata to contain more than one element. While at it, turn some `; CHECK` s to `; CHECK-LABEL:` s. llvm-svn: 251485
*	[SimplifyCFG] Don't DCE catchret because the successor is unreachable	David Majnemer	2015-10-27	1	-0/+20
\| \| \| \| \| \| \|	CatchReturnInst has side-effects: it runs a destructor. This destructor could conceivably run forever/call exit/etc. and should not be removed. llvm-svn: 251461
*	[Bitcode] Fix accidental syntax errors in compatibility tests	Vedant Kumar	2015-10-27	2	-38/+38
\| \| \| \| \| \| \| \| \|	We used automated tools to update our IR to its current syntax in commit 21f77df7(r247378). While it correctly updated the CHECK lines in our compatibility tests, the IR should have remained untouched. This commit fixes the syntax errors. llvm-svn: 251458
*	[X86][AVX512] Test UNPCK with non-sequential scalars	Simon Pilgrim	2015-10-27	2	-4/+112
\| \| \| \| \| \|	Missing tests for r251297 llvm-svn: 251453
*	[IR] Limit bits used for CallingConv::ID, update tests	Vedant Kumar	2015-10-27	6	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \|	Use 10 bits to represent calling convention ID's instead of 13, and update the bitcode compatibility tests accordingly. We now error-out in the bitcode reader when we see bad calling conv ID's. Thanks to rnk and dexonsmith for feedback! Differential Revision: http://reviews.llvm.org/D13826 llvm-svn: 251452
*	[AliasSetTracker] Use mod/ref information for UnknownInstr	Hal Finkel	2015-10-27	1	-0/+40
\| \| \| \| \| \| \| \| \| \|	AliasSetTracker does not need to convert the access mode to ModRefAccess if the new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the sets. Patch by Andrew Zhogin! llvm-svn: 251451
*	Use the 'arcp' fast-math-flag when combining repeated FP divisors	Sanjay Patel	2015-10-27	1	-10/+59
\| \| \| \| \| \| \| \| \| \| \| \|	This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 llvm-svn: 251450
*	[ScalarEvolutionExpander] PHI on a catchpad can be used on both edges	David Majnemer	2015-10-27	1	-0/+42
\| \| \| \| \| \| \| \|	A PHI on a catchpad might be used by both edges out of the catchpad, feeding back into a loop. In this case, just use the insertion point. Anything more clever would require new basic blocks or PHI placement. llvm-svn: 251442
*	[AArch64]Merge halfword loads into a 32-bit load	Jun Bum Lim	2015-10-27	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This recommits r250719, which caused a failure in SPEC2000.gcc because of the incorrect insert point for the new wider load. Convert two halfword loads into a single 32-bit word load with bitfield extract instructions. For example: `ldrh w0, [x2]`; `ldrh w1, [x2, #2]` becomes `ldr w0, [x2]`; `ubfx w1, w0, #16, #16`; `and w0, w0, #ffff`. llvm-svn: 251438
*	Revert r251291, "Loop Vectorizer - skipping "bitcast" before GEP"	NAKAMURA Takumi	2015-10-27	2	-125/+86
\| \| \| \| \| \| \|	It causes miscompilation of llvm/lib/ExecutionEngine/Interpreter/Execution.cpp. See also PR25324. llvm-svn: 251436
*	Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add ↵	Cong Hou	2015-10-27	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used, so we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't necessarily mean optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is when the MBB has only one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this because many uses of addSuccessor() do not check whether optimization is disabled. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429
*	[SLP] Be more aggressive about reduction width selection.	Charlie Turner	2015-10-27	1	-0/+123
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach. It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on either ARM or X86) before this patch, but with it does vectorize. I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT. The more principled approach I thought of was to generate several candidate trees and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something! Reviewers: nadav, jmolloy Subscribers: mssimpso, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D14116 llvm-svn: 251428
*	[SLP] Try a bit harder to find reduction PHIs	Charlie Turner	2015-10-27	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads. There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT. Reviewers: jmolloy, mcrosier, nadav Subscribers: mssimpso, nadav, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14063 llvm-svn: 251425
*	[SLP] Treat SelectInsts as reduction values.	Charlie Turner	2015-10-27	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values. The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here. Reviewers: jmolloy, mzolotukhin, spatel, nadav Subscribers: nadav, mssimpso, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D13949 llvm-svn: 251424
*	Fix SamplePGO segfault when debug info is missing.	Diego Novillo	2015-10-27	2	-0/+41
\| \| \| \| \| \| \| \| \| \|	When emitting a remark for a conditional branch annotation, the remark uses the line location information of the conditional branch in the message. In some cases, that information is unavailable and the optimization would segfault. I'm still not sure whether this is a bug or WAI, but the optimizer should not die because of this. llvm-svn: 251420
*	[X86][AVX512] add convert float to half	Asaf Badouh	2015-10-27	3	-0/+171
\| \| \| \| \| \| \| \|	convert float to half with mask/maskz for the reg to reg version and mask for the reg to mem version (there is no maskz version for reg to mem). Differential Revision: http://reviews.llvm.org/D14113 llvm-svn: 251409
*	[ARM] Expand ROTL and ROTR of vector value types	Charlie Turner	2015-10-27	2	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection. Reviewers: rengolin, t.p.northover Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14082 llvm-svn: 251401
*	[ScalarEvolutionExpander] Properly insert no-op casts + EH Pads	David Majnemer	2015-10-27	1	-0/+148
\| \| \| \| \| \| \| \| \| \| \|	We want to insert no-op casts as close as possible to the def. This is tricky when the cast is of a PHI node and the BasicBlocks between the def and the use cannot hold any instructions. Iteratively walk EH pads until we hit a non-EH pad. This fixes PR25326. llvm-svn: 251393
*	[X86] Make elfiamcu an OS, not an environment.	Michael Kuperstein	2015-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	GNU tools require elfiamcu to take up the entire OS field, so, e.g. i?86-*-linux-elfiamcu is not considered a legal triple. Make us compatible. Differential Revision: http://reviews.llvm.org/D14081 llvm-svn: 251390
*	[x86] replace integer logic ops with packed SSE FP logic ops	Sanjay Patel	2015-10-27	1	-18/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have an operand to a bitwise logic op that's already in an XMM register and the result is going to be sent to an XMM register, then use an SSE logic op to avoid moves between the integer and vector register files. Related commits: http://reviews.llvm.org/rL248395 http://reviews.llvm.org/rL248399 http://reviews.llvm.org/rL248404 http://reviews.llvm.org/rL248409 http://reviews.llvm.org/rL248415 This should solve PR22428: https://llvm.org/bugs/show_bug.cgi?id=22428 llvm-svn: 251378
*	Fix llc crash processing S/UREM for -Oz builds caused by rL250825.	Steve King	2015-10-27	1	-0/+257
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 llvm-svn: 251373
*	add FP logic test cases to show current codegen (PR22428)	Sanjay Patel	2015-10-26	1	-0/+60
\| \| \| \|	llvm-svn: 251370
*	[x86] Make the vselect-minmax test 2x to 3x faster by deleting all the	Chandler Carruth	2015-10-26	1	-4032/+960
\| \| \| \| \| \| \|	instructions that aren't relevant for instruction selection of vector min and max. llvm-svn: 251366
*	ARM: make sure VFP loads and stores are properly aligned.	Tim Northover	2015-10-26	1	-0/+98
\| \| \| \| \| \| \|	Both VLDRS and VLDRD fault if the memory is not 4 byte aligned, which wasn't really being checked before, leading to faults at runtime. llvm-svn: 251352
*	Fix tests.	Peter Collingbourne	2015-10-26	1	-1/+1
\| \| \| \|	llvm-svn: 251343
*	[LLVMSymbolize] Use symbol table only if function linkage name was requested.	Alexey Samsonov	2015-10-26	1	-1/+1
\| \| \| \| \| \| \|	Now it's enough to just specify -functions=short without additionally providing -use-symbol-table=false. llvm-svn: 251339
*	[RS4GC] Strip noalias attribute after statepoint rewrite	Igor Laevsky	2015-10-26	2	-1/+65
\| \| \| \| \| \| \| \| \|	We should remove noalias along with dereference and dereference_or_null attributes because statepoint could potentially touch the entire heap including noalias objects. Differential Revision: http://reviews.llvm.org/D14032 llvm-svn: 251333
*	SamplePGO - Add optimization reports.	Diego Novillo	2015-10-26	2	-0/+192
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a couple of optimization remarks to the SamplePGO transformation. It reports when it decides to inline a hot function (to mimic the inline stack and repeat useful inline decisions in the original build). It will also report branch destinations. For instance, given the code fragment (source line numbers shown): `6 if (i < 1000)`; `7 sum -= i;`; `8 else`; `9 sum += -i * rand();`. If the 'else' branch is taken most of the time, building this code with -Rpass=sample-profile will produce: `a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile]`, followed by the source line `sum += -i * rand();` and a caret marker. llvm-svn: 251330
*	Add an (optional) identification block in the bitcode	Mehdi Amini	2015-10-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Processing bitcode from a different LLVM version can lead to unexpected behavior. The LLVM project guarantees autoupdating bitcode from a previous minor revision for the same major, but can't make any promise when reading bitcode generated from either a non-released LLVM, a vendor toolchain, or a "future" LLVM release. This patch aims at being more user-friendly and allows a bitcode producer to emit an optional block at the beginning of the bitcode that will contain an opaque string intended to describe the bitcode producer information. The bitcode reader will dump this information alongside any error it reports. The optional block also includes an "epoch" number, monotonically increasing when incompatible changes are made to the bitcode. The reader will reject bitcode whose epoch is different from the one expected. Differential Revision: http://reviews.llvm.org/D13666 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251325
*	[safestack] Fast access to the unsafe stack pointer on AArch64/Android.	Evgeniy Stepanov	2015-10-26	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers __builtin_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324
*	ARM/ELF: Better codegen for global variable addresses.	Peter Collingbourne	2015-10-26	4	-66/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In PIC mode we were previously computing global variable addresses (or GOT entry addresses) by adding the PC, the PC-relative GOT displacement and the GOT-relative symbol/GOT entry displacement. Because the latter two displacements are fixed, we ended up performing one more addition than necessary. This change causes us to compute addresses using a single PC-relative displacement, resulting in a shorter code sequence. This reduces code size by about 4% in a recent build of Chromium for Android. As a result of this change we no longer need to compute the GOT base address in the ARM backend, which allows us to remove the Global Base Reg pass and SDAG lowering for the GOT. We also now no longer use the GOT when addressing a symbol which is known to be defined in the same linkage unit. Specifically, the symbol must have either hidden visibility or a strong definition in the current module in order to not use the GOT. This is a change from the previous behaviour where we would use the GOT to address externally visible symbols defined in the same module. I think the only cases where this could matter are cases involving symbol interposition, but we don't really support that well anyway. Differential Revision: http://reviews.llvm.org/D13650 llvm-svn: 251322
*	Cleanup test case debug info. NFC.	Diego Novillo	2015-10-26	1	-1/+1
\| \| \| \|	llvm-svn: 251320
*	[SystemZ] LTGFR use regclass should be GR32, not GR64.	Jonas Paulsson	2015-10-26	1	-1/+2
\| \| \| \| \| \| \|	Discovered by testing int-cmp-44.ll with -verify-machineinstrs (added to test run). llvm-svn: 251299
*	[SystemZ] Also clear kill flag for index reg in splitMove().	Jonas Paulsson	2015-10-26	1	-1/+1
\| \| \| \| \| \| \|	Discovered by running fp-move-05.ll with -verify-machineinstrs (added to test case run). llvm-svn: 251298
*	[SystemZ] Don't forget the CC def op on LTEBRCompare pseudos	Jonas Paulsson	2015-10-26	1	-1/+1
\| \| \| \| \| \| \|	Discovered by running fp-cmp-02.ll with -verify-machineinstrs (now added to test run). llvm-svn: 251297
*	[SystemZ] Tie operands in SystemZShortenInst if MI becomes 2-address.	Jonas Paulsson	2015-10-26	1	-1/+1
\| \| \| \| \| \| \| \|	Discovered by testing fp-add-02.ll with -verify-machineinstrs. Test case updated to always run with -verify-machineinstrs. llvm-svn: 251296
*	[mips] Check for the correct error message in tests for interrupt attributes.	Vasileios Kalintiris	2015-10-26	6	-33/+27
\| \| \| \| \| \| \| \|	Instead of XFAIL-ing the tests with the wrong usage of the "interrupt" attribute, we should check that we emit the correct error messages to the user. llvm-svn: 251295
*	[ValueTracking] Extend r251146 to catch a fairly common case	James Molloy	2015-10-26	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Even though we may not know the value of the shifter operand, it's possible we know the shifter operand is non-zero. This can allow us to infer more known bits - for example: `%1 = load %p !range {1, 5}`; `%2 = shl %q, %1`. We don't know %1, but we do know that it is nonzero so %2[0] is known zero, and importantly %2 is known non-zero. Calling isKnownNonZero is nontrivially expensive so use an Optional to run it lazily and cache its result. llvm-svn: 251294
*	Loop Vectorizer - skipping "bitcast" before GEP	Elena Demikhovsky	2015-10-26	2	-86/+125
\| \| \| \| \| \| \| \| \| \|	Vectorization of a memory instruction (Load/Store) is possible when the pointer comes from a GEP. The GEP analysis lets us estimate the profit. In some cases we have a "bitcast" between the GEP and the memory instruction. I added code that skips the "bitcast". http://reviews.llvm.org/D13886 llvm-svn: 251291
*	Tests: be slightly more specific to avoid conflict with path.	Tim Northover	2015-10-26	1	-1/+1
\| \| \| \|	llvm-svn: 251290
*	fix test errors (on windows) for commit r251287	Igor Breger	2015-10-26	1	-2/+2
\| \| \| \|	llvm-svn: 251288
*	AVX512: Enabled VPBROADCASTB lowering for v64i8 vectors.	Igor Breger	2015-10-26	1	-96/+165
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13896 llvm-svn: 251287
*	[mips] Interrupt attribute support for mips32r2+.	Vasileios Kalintiris	2015-10-26	4	-0/+277
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds support for using the "interrupt" attribute on Mips for interrupt handling functions. At this time only mips32r2+ with the o32 ABI with the static relocation model is supported. Unsupported configurations will be rejected. Patch by Simon Dardis (+ clang-format & some trivial changes to follow the LLVM coding standards by me). Reviewers: mpf, dsanders Subscribers: dsanders, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D10768 llvm-svn: 251286
*	AVX-512: Use correct extract vector length.	Igor Breger	2015-10-26	1	-0/+11
\| \| \| \| \| \| \| \|	Bug https://llvm.org/bugs/show_bug.cgi?id=25318 Differential Revision: http://reviews.llvm.org/D14062 llvm-svn: 251285
*	[InstCombine] Teach instcombine not to create extra PHI nodes when folding GEPs	Silviu Baranga	2015-10-26	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: InstCombine tries to transform GEP(PHI(GEP1, GEP2, ..)) into GEP(GEP(PHI(...))) when possible. However, this may leave the old PHI node around. Even if we do end up folding the GEPs, having an extra PHI node might not be beneficial. This change makes the transformation more conservative. We now only do this if the PHI has only one use, and can therefore be removed after the transformation. Reviewers: jmolloy, majnemer Subscribers: mcrosier, mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D13887 llvm-svn: 251281
*	[ARM] Handle the inline asm constraint type 'o'	James Molloy	2015-10-26	1	-0/+11
\| \| \| \| \| \|	This means "memory with offset" and requires very little plumbing to get working. This fixes PR25317. llvm-svn: 251280
*	AVX512: Add AVX-512 not materializable instructions.	Igor Breger	2015-10-26	1	-0/+34
\| \| \| \| \| \| \| \| \| \|	Otherwise the value can be reused even though it could have changed, which produces incorrect assembly. https://llvm.org/bugs/show_bug.cgi?id=25270 Differential Revision: http://reviews.llvm.org/D14057 llvm-svn: 251275
*	Update test to take into account for r251271.	David Majnemer	2015-10-26	1	-1/+1
\| \| \| \|	llvm-svn: 251272
*	[MC] Add support for GNU as-compatible binary operator precedence	David Majnemer	2015-10-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	GNU as and Darwin give the various binary operators different precedence. LLVM's MC supported the Darwin semantics but not the GNU semantics. This fixes PR25311. llvm-svn: 251271
*	[MC] Don't crash when .word is given bogus values	David Majnemer	2015-10-26	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	We didn't validate that the .word directive was given a sane value, leading to crashes when we attempt to write out the object file. Instead, perform some validation and issue a diagnostic pointing at the start of the diagnostic. llvm-svn: 251270
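
The halfword-load merge described in r251438 above can be sanity-checked outside the compiler. The following is only an illustrative sketch, not code from the LLVM tree: the function names are hypothetical, memory is modelled as a Python `bytes` object, and little-endian byte order is assumed. It models the two-`ldrh` sequence and the `ldr` + `ubfx` + `and` sequence from the commit message and checks that both produce the same register values.

```python
import struct


def two_halfword_loads(mem, addr):
    """Model `ldrh w0, [x2]` and `ldrh w1, [x2, #2]` (little-endian assumed)."""
    w0 = struct.unpack_from("<H", mem, addr)[0]
    w1 = struct.unpack_from("<H", mem, addr + 2)[0]
    return w0, w1


def merged_word_load(mem, addr):
    """Model `ldr w0, [x2]`; `ubfx w1, w0, #16, #16`; `and w0, w0, #0xffff`."""
    word = struct.unpack_from("<I", mem, addr)[0]  # single 32-bit load
    w1 = (word >> 16) & 0xFFFF  # ubfx: extract 16 bits starting at bit 16
    w0 = word & 0xFFFF          # and: keep the low halfword
    return w0, w1


if __name__ == "__main__":
    mem = bytes([0x34, 0x12, 0xCD, 0xAB])
    assert two_halfword_loads(mem, 0) == merged_word_load(mem, 0)
```

The sketch covers only the value equivalence of the two sequences; the insert-point problem that caused the original revert concerns where the wider load is placed relative to other instructions and is not modelled here.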