bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[x86] eliminate redundant shuffle of horizontal math ops when both inputs ↵	Sanjay Patel	2017-09-01	1	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	are the same This is limited to a set of patterns based on the example in PR34111: https://bugs.llvm.org/show_bug.cgi?id=34111 ...but as I was investigating this, I saw that horizontal patterns can go wrong in many, many other ways that would not be handled by this patch. Each data type may even be handled differently in the DAG after starting with the same basic IR pattern, so even proper IR canonicalization won't fix it all. Differential Revision: https://reviews.llvm.org/D37357 llvm-svn: 312379
*	[llvm-pdbutil] Support dumping CodeView from object files.	Zachary Turner	2017-09-01	1	-32/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have llvm-readobj for dumping CodeView from object files, and llvm-pdbutil has always been more focused on PDB. However, llvm-pdbutil has a lot of useful options for summarizing debug information in aggregate and presenting high level statistical views. Furthermore, it's arguably better as a testing tool since we don't have to write tests to conform to a state-machine-like structure where you match multiple lines in succession, each depending on a previous match. llvm-pdbutil dumps much more concisely, so it's possible to use single-line matches in many cases, whereas with readobj tests you have to use multi-line matches with an implicit state machine. Because of this, I'm adding object file support to llvm-pdbutil. In fact, this mirrors the cvdump tool from Microsoft, which also supports both object files and pdb files. In the future we could perhaps rename this tool llvm-cvutil. In the meantime, this allows us to deep dive into object files the same way we already can with PDB files. llvm-svn: 312358
*	[TTI] Fix getGEPCost() for geps with a single operand.	Davide Italiano	2017-09-01	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \|	Previously this would sporadically crash as TargetType was never initialized. We special-case the single-operand case returning earlier and trying to mimic the behaviour of isLegalAddressingMode as closely as possible. Differential Revision: https://reviews.llvm.org/D37277 llvm-svn: 312357
*	NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops	Daniel Berlin	2017-09-01	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context. Reviewers: davide, mcrosier Subscribers: sanjoy, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D37174 llvm-svn: 312352
*	AMDGPU: Add ds_{read\|write}_addtid_b32 definitions	Matt Arsenault	2017-09-01	1	-0/+8
\| \| \| \|	llvm-svn: 312349
*	LiveIntervalAnalysis: Fix alias regunit reserved definition	Matthias Braun	2017-09-01	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A register in CodeGen can be marked as reserved: In that case we consider the register always live and do not use (or rather ignore) kill/dead/undef operand flags. LiveIntervalAnalysis however tracks liveness per register unit (not per register). We already needed adjustments for this in r292871 to deal with super/sub registers. However, I did not look at aliased registers there. Looking at ARM: FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV (regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV), hence they share a register unit (FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers. This shared register unit was previously considered non-reserved; however, given that uses of the reserved FPSCR potentially violate some rules (like uses without defs), we should make FPSCR~FPSCR_NZCV reserved too and stop tracking liveness for it. This patch: - Defines a register unit as reserved when: At least for one root register, the root register and all its super registers are reserved. - Adjusts LiveIntervals::computeRegUnitRange() for the new reserved definition. - Adds MachineRegisterInfo::isReservedRegUnit() to have a canonical way of testing. - Stops computing LiveRanges for reserved register units in HMEditor even with UpdateFlags enabled. - Skips verification of uses of reserved reg units in the machine verifier (this usually didn't happen because there would be no cached liverange but there is no guarantee for that and I would run into this case before the HMEditor tweak, so may as well fix the verifier too). Note that this should only affect ARM's FPSCR/FPSCR_NZCV registers today; aliased registers are rarely used, the only other cases are Hexagon's P0-P3/P3_0 and C8/USR pairs, which are not mixing reserved/non-reserved registers in an alias. Differential Revision: https://reviews.llvm.org/D37356 llvm-svn: 312348
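The reserved-register-unit rule stated in the commit above ("at least for one root register, the root register and all its super registers are reserved") can be sketched with a toy model. This is only an illustration under stated assumptions: `RootReg` and the free function `isReservedRegUnit` below are hypothetical stand-ins operating on register names, not the `MachineRegisterInfo` API added by the patch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model (hypothetical): a register unit is described by its root
// registers, and each root lists the names of its super registers.
struct RootReg {
  std::string Name;
  std::vector<std::string> SuperRegs;
};

// A unit counts as reserved if, for at least one root, the root itself
// and every one of its super registers is in the reserved set.
inline bool isReservedRegUnit(const std::vector<RootReg> &Roots,
                              const std::set<std::string> &Reserved) {
  assert(!Roots.empty() && "a register unit has at least one root");
  for (const RootReg &Root : Roots) {
    bool AllReserved = Reserved.count(Root.Name) != 0;
    for (const std::string &Super : Root.SuperRegs)
      AllReserved = AllReserved && Reserved.count(Super) != 0;
    if (AllReserved)
      return true;
  }
  return false;
}
```

Modelling the shared unit FPSCR~FPSCR_NZCV as having the two roots FPSCR and FPSCR_NZCV (with no super registers, which is an assumption of this toy model), the unit is reserved as soon as FPSCR alone is in the reserved set.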
*	AMDGPU: Add most d16 load/store instruction definitions	Matt Arsenault	2017-09-01	5	-0/+164
\| \| \| \| \| \| \|	Doesn't include the tied operand necessary for the loads, but is enough for the assembler to work. llvm-svn: 312347
*	[WebAssembly] Update relocation names to match spec	Sam Clegg	2017-09-01	8	-14/+14
\| \| \| \| \| \| \| \|	Summary: See https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md Differential Revision: https://reviews.llvm.org/D37385 llvm-svn: 312342
*	[WebAssembly] Fix getSymbolValue for exported globals	Sam Clegg	2017-09-01	3	-7/+20
\| \| \| \| \| \| \| \| \| \| \| \|	The code wasn't previously taking into account that the global index space is not the same as the index into the Globals array, since the latter does not include imported globals. This fixes the WebAssembly waterfall failures. Differential Revision: https://reviews.llvm.org/D37384 llvm-svn: 312340
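The index mismatch described above can be sketched as follows. In the WebAssembly global index space, imported globals come first and module-defined globals follow, so an index has to be rebased before it is used to look up an array that holds only the defined globals. The `Global` struct, the `definedGlobalAt` helper, and the `NumImportedGlobals` parameter are hypothetical names for illustration, not the types or functions touched by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a module-defined global.
struct Global {
  int64_t InitValue;
};

// Translate an index in the global index space (imports first, then
// defined globals) into an index into the array of defined globals.
inline const Global &definedGlobalAt(const std::vector<Global> &Globals,
                                     uint32_t NumImportedGlobals,
                                     uint32_t GlobalIndex) {
  assert(GlobalIndex >= NumImportedGlobals &&
         "index refers to an imported global");
  uint32_t ArrayIndex = GlobalIndex - NumImportedGlobals;
  assert(ArrayIndex < Globals.size() && "index out of range");
  return Globals[ArrayIndex];
}
```

With three imported globals, global index 3 refers to element 0 of the array; using the index unadjusted would read the wrong entry or run past the end.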
*	AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states	Nicolai Haehnle	2017-09-01	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes a bug that was exposed on gfx9 in various GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests, e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D36193 llvm-svn: 312337
*	[X86] Add test case I forgot to commit with r312285.	Craig Topper	2017-09-01	1	-0/+49
\| \| \| \|	llvm-svn: 312335
*	ModuleSummaryAnalysis: Correctly handle refs from function inline asm to ↵	Peter Collingbourne	2017-09-01	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	module inline asm. If a function contains inline asm and the module-level inline asm contains the definition of a local symbol, prevent the function from being imported in case the function-level inline asm refers to a symbol in the module-level inline asm. Differential Revision: https://reviews.llvm.org/D37370 llvm-svn: 312332
*	[LoopVectorizer] Use two step casting for float to pointer types.	Manoj Gupta	2017-09-01	2	-0/+132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: LoopVectorizer is creating casts between vec<ptr> and vec<float> types on ARM when compiling OpenCV. It is illegal to directly cast a floating point type to a pointer type even if the types have the same size, so this causes a crash. Fix the crash using two-step casting: bitcast to integer, then cast integer to pointer/float. Fixes PR33804. Reviewers: mkuper, Ayal, dlj, rengolin, srhines Reviewed By: rengolin Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35498 llvm-svn: 312331
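As a rough source-level analogue of the two-step cast described above (not the vectorizer's IR-building code), the sketch below first reinterprets a float's bits as a same-width integer and only then converts that integer to a pointer. The helper names are hypothetical, and the sketch assumes a 32-bit `float`.

```cpp
#include <cstdint>
#include <cstring>

// Step 1: reinterpret the float's bits as an integer of the same width,
// loosely analogous to an IR bitcast from float to i32.
inline uint32_t floatBitsToInt(float F) {
  static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  return Bits;
}

// Step 2: convert the integer to a pointer, loosely analogous to inttoptr.
inline void *intToPointer(uint32_t Bits) {
  return reinterpret_cast<void *>(static_cast<uintptr_t>(Bits));
}

// The combined conversion always goes through the integer type.
inline void *floatToPointer(float F) {
  return intToPointer(floatBitsToInt(F));
}
```

The point of the intermediate integer is that each individual step is a legal conversion, whereas a single float-to-pointer cast is not.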
*	[SCEV] Add URem support to SCEV	Alexandre Isoard	2017-09-01	2	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %s ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %t to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcomings of SCEVs: %div = udiv i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. llvm-svn: 312329
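The relation used in the commit above can be checked with plain unsigned arithmetic. The sketch below is not SCEV code; `uremViaUDiv` and `uremPow2_16` are hypothetical helper names that mirror the general form `(-b * (t /u b)) + t` and the zext/trunc special case for a divisor of 65536.

```cpp
#include <cassert>
#include <cstdint>

// General relation: t % b == (-b * (t / b)) + t for b != 0.
// Unsigned arithmetic wraps modulo 2^32, so negating b and adding t
// gives the same result as t - (t / b) * b.
inline uint32_t uremViaUDiv(uint32_t T, uint32_t B) {
  assert(B != 0 && "division by zero");
  return (0u - B) * (T / B) + T;
}

// Power-of-two special case for a divisor of 65536: truncate to 16 bits,
// then zero-extend back to 32 bits.
inline uint32_t uremPow2_16(uint32_t T) {
  return static_cast<uint32_t>(static_cast<uint16_t>(T));
}
```

For example, `uremViaUDiv(100, 7)` evaluates to 2, and both helpers agree with `T % 65536` when the divisor is 65536.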
*	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"	Geoff Berry	2017-09-01	73	-320/+303
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issues addressed since original review: - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312328
*	Adding missing test case in rL312318	Strahinja Petrovic	2017-09-01	1	-0/+65
\| \| \| \| \| \| \| \| \|	Adding test for debug info for integer variables whose type is shrunk to bool. Patch by Nikola Prica. llvm-svn: 312325
*	[ARM] GlobalISel: Support ROPI global variables	Diana Picus	2017-09-01	3	-3/+198
\| \| \| \| \| \| \|	In the ROPI relocation model, read-only variables are accessed relative to the PC. We use the (MOV\|LDRLIT)_ga_pcrel pseudoinstructions for this. llvm-svn: 312323
*	Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that ↵	Clement Courbet	2017-09-01	3	-0/+190
\| \| \| \| \| \| \| \| \| \|	turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. llvm-svn: 312322
*	[ARM] Add 2-operand assembly aliases for Thumb1 ADD/SUB	Oliver Stannard	2017-09-01	2	-3/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds 2-operand assembly aliases for these instructions: add r0, r1 => add r0, r0, r1 sub r0, r1 => sub r0, r0, r1 Previously this syntax was only accepted for Thumb2 targets, where the wide versions of the instructions were used. This patch allows the 2-operand syntax to be used for Thumb1 targets, and selects the narrow encoding when it is used for Thumb2 targets. Differential revision: https://reviews.llvm.org/D37377 llvm-svn: 312321
*	[ARM] GlobalISel: More tests. NFC.	Diana Picus	2017-09-01	2	-2/+116
\| \| \| \| \| \| \| \|	Test constants as well in the PIC tests. These are also represented as G_GLOBAL_VALUE, and although they are treated just like other globals for PIC, they won't be for ROPI, so it's good to have this coverage. llvm-svn: 312319
*	Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains ↵	Clement Courbet	2017-09-01	3	-190/+0
\| \| \| \| \| \| \| \| \| \|	of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. llvm-svn: 312317
*	[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer	Clement Courbet	2017-09-01	3	-0/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 llvm-svn: 312315
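To illustrate the kind of source pattern the commit message above refers to, the sketch below contrasts a field-by-field `operator==`-style comparison with a single `memcmp` over the same bytes. The `Pair` struct and both helper names are hypothetical, and the equivalence relies on the fields being adjacent integers with no padding between them.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical struct: two adjacent integer fields with no padding.
struct Pair {
  int32_t A;
  int32_t B;
};
static_assert(sizeof(Pair) == 2 * sizeof(int32_t), "no padding expected");

// The source pattern: a chain of integer equality comparisons.
inline bool equalChain(const Pair &X, const Pair &Y) {
  return X.A == Y.A && X.B == Y.B;
}

// The conceptually equivalent form: one memcmp over the contiguous bytes.
inline bool equalMemcmp(const Pair &X, const Pair &Y) {
  return std::memcmp(&X, &Y, sizeof(Pair)) == 0;
}
```

Both functions return the same answer for every pair of inputs here, which is what makes rewriting the chain as a single comparison of the underlying memory possible.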
*	[X86] Add isel patterns for memory forms of FMA3 intrinsic instructions	Craig Topper	2017-09-01	1	-48/+48
\| \| \| \|	llvm-svn: 312309
*	AMDGPU: Fold clamp modifier for packed instructions	Matt Arsenault	2017-08-31	1	-15/+187
\| \| \| \|	llvm-svn: 312297
*	[WebAssembly] Fix getSymbolValue() for data symbols	Sam Clegg	2017-08-31	4	-4/+4
\| \| \| \| \| \| \| \| \| \|	This is mostly a fix for the output of `llvm-nm`. See Bug 34392: https://bugs.llvm.org//show_bug.cgi?id=34392 Differential Revision: https://reviews.llvm.org/D37359 llvm-svn: 312294
*	[WebAssembly] Refactor load ISel tablegen patterns into classes	Derek Schuff	2017-08-31	1	-5/+2
\| \| \| \| \| \| \| \| \|	Not all of these will be able to be used by atomics because tablegen, but it still seems like a good change by itself. Differential Revision: https://reviews.llvm.org/D37345 llvm-svn: 312287
*	[WebAssembly] Validate exports when parsing object files	Sam Clegg	2017-08-31	4	-0/+65
\| \| \| \| \| \| \| \|	Subscribers: jfb, dschuff, jgravelle-google, aheejin Differential Revision: https://reviews.llvm.org/D37358 llvm-svn: 312286
*	[llvm-nm] Fix output formatting of -f sysv for 64bit targets	Sam Clegg	2017-08-31	3	-0/+19
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37347 llvm-svn: 312284
*	[MachineOutliner] Recommit r312194, missed optimization remarks	Jessica Paquette	2017-08-31	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \|	Before, this commit caused a buildbot failure: http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/6026/steps/test_llvm/logs/LLVM%20%3A%3A%20CodeGen__AArch64__machine-outliner-remarks.ll This was caused by the Key value in DiagnosticInfoOptimizationBase being deallocated before emitting the remarks defined in MachineOutliner.cpp. As of r312277 this should no longer be an issue. llvm-svn: 312280
*	[x86] add more tests for horizontal ops; NFC	Sanjay Patel	2017-08-31	2	-19/+159
\| \| \| \|	llvm-svn: 312279
*	[llvm-pdbutil] Print detailed S_UDT stats.	Zachary Turner	2017-08-31	2	-2/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a new command line option, -udt-stats, which breaks down the stats of S_UDT records. These are one of the biggest contributors to the size of /DEBUG:FASTLINK PDBs, so they need some additional tools to be able to analyze their usage. This option will dig into each S_UDT record and determine what kind of record it points to, and then break down the statistics by the target type. The goal here is to identify how our object files differ from MSVC object files in S_UDT records, so that we can output fewer of them and reach size parity. llvm-svn: 312276
*	[dsymutil] Don't mark forward declarations as canonical.	Jonas Devlieghere	2017-08-31	8	-0/+278
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch completes the work done by Frederic Riss to address dsymutil incorrectly considering forward declarations as canonical during uniquing. This resulted in references to the forward declaration even after the definition was encountered. In addition to the test provided by Alexander Shaposhnikov in D29609, I added another test to cover several scenarios that were mentioned in his conversation with Fred. We now also check that uniquing still occurs after the definition was encountered. For more context please refer to D29609. Differential revision: https://reviews.llvm.org/D37127 llvm-svn: 312274
*	Revert "[dsymutil] Don't mark forward declarations as canonical."	Jonas Devlieghere	2017-08-31	8	-278/+0
\| \| \| \| \| \|	This reverts commit r312264. llvm-svn: 312271
*	[ObjCARC] Pass the correct BasicBlock to fix assertion failure.	Akira Hatanaka	2017-08-31	1	-0/+19
\| \| \| \| \| \| \| \| \| \|	The BasicBlock passed to FindPredecessorRetainWithSafePath should be the parent block of Autorelease. This fixes a crash that occurs in FindDependencies when StartInst is not in StartBB. rdar://problem/33866381 llvm-svn: 312266
*	[dsymutil] Don't mark forward declarations as canonical.	Jonas Devlieghere	2017-08-31	8	-0/+278
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch completes the work done by Frederic Riss to address dsymutil incorrectly considering forward declarations as canonical during uniquing. This resulted in references to the forward declaration even after the definition was encountered. In addition to the test provided by Alexander Shaposhnikov in D29609, I added another test to cover several scenarios that were mentioned in his conversation with Fred. We now also check that uniquing still occurs after the definition was encountered. For more context please refer to D29609. Differential revision: https://reviews.llvm.org/D37127 llvm-svn: 312264
*	[llvm-dwarfdump] Brief mode only dumps debug_info by default	Jonas Devlieghere	2017-08-31	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the default behavior in brief mode to only show the debug_info section. This is undoubtedly the most popular and likely the one you'd want in brief mode. Non-brief mode behavior is not affected and still defaults to all. Differential revision: https://reviews.llvm.org/D37334 llvm-svn: 312252
*	[InstCombine] improve demanded vector elements analysis of insertelement	Sanjay Patel	2017-08-31	2	-14/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 llvm-svn: 312248
*	[codeview] Generalize DIExpression parsing to handle load chains	Reid Kleckner	2017-08-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Hopefully this also clarifies exactly when and why we're rewriting certain S_LOCALs using reference types: We're using the reference type to stand in for a zero-offset load. Reviewers: inglorion Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37309 llvm-svn: 312247
*	Update test:	Konstantin Zhuravlyov	2017-08-31	1	-1/+1
\| \| \| \| \| \| \| \|	- REQUIRES: x86_64-linux -> REQUIRES: shell Differential Revision: https://reviews.llvm.org/D37316 llvm-svn: 312245
*	Revert r311525: "[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove ↵	Daniel Jasper	2017-08-31	11	-61/+85
\| \| \| \| \| \| \| \|	synthetic references in .text" Breaks builds internally. Will forward repro instructions to author. llvm-svn: 312243
*	[X86] Added run line to intrinsics upgrade test. NFC.	Yael Tsafrir	2017-08-31	1	-0/+1
\| \| \| \|	llvm-svn: 312241
*	AMD family 17h (znver1) scheduler model update.	Ashutosh Nema	2017-08-31	19	-654/+654
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch enables the following: 1) Regex based Instruction itineraries for integer instructions. 2) The instructions are grouped as per the nature of the instructions (move, arithmetic, logic, Misc, Control Transfer). 3) FP instructions and their itineraries are added which includes values for SSE4A, BMI, BMI2 and SHA instructions. Patch by Ganesh Gopalasubramanian Reviewers: RKSimon, craig.topper Subscribers: vprasad, shivaram, ddibyend, andreadb, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D36617 llvm-svn: 312237
*	[Object] Verify object sizes before handing out StringRefs pointing out	Benjamin Kramer	2017-08-31	2	-0/+6
\| \| \| \| \| \| \| \| \|	of bounds. This can only happen on corrupt input. Found by OSS-FUZZ! https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3228 llvm-svn: 312235
*	[AArch64] v8.3-a complex number support	Sam Parker	2017-08-31	2	-0/+249
\| \| \| \| \| \| \| \| \| \| \| \| \|	New instructions are added to AArch32 and AArch64 to aid floating-point multiplication and addition of complex numbers, where the complex numbers are packed in a vector register as a pair of elements. The Imaginary part of the number is placed in the more significant element, and the Real part of the number is placed in the less significant element. Differential Revision: https://reviews.llvm.org/D36792 llvm-svn: 312228
*	[llvm-cov] Read in function names for filtering from a text file.	Sean Eveson	2017-08-31	6	-0/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add a -name-whitelist option, which behaves in the same way as -name, but it reads in multiple function names from the given input file(s). Reviewers: vsk Reviewed By: vsk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37111 llvm-svn: 312227
*	[AArch64] IDSAR6 register assembler support	Sam Parker	2017-08-31	2	-0/+13
\| \| \| \| \| \| \| \| \| \|	The IDSAR6 system register has been introduced to identify the v8.3-a JavaScript data type conversion and v8.2-a dot product support. Differential Revision: https://reviews.llvm.org/D37068 llvm-svn: 312225
*	[AArch64] Support COFF linker directives	Martin Storsjo	2017-08-31	1	-0/+74
\| \| \| \| \| \| \| \| \| \|	This is similar to what was done for ARM in SVN r269574; the code and the test are straight copypaste to the corresponding AArch64 code and test directory. Differential revision: https://reviews.llvm.org/D37204 llvm-svn: 312223
*	[IRCE] Identify loops with latch comparison against current IV value	Max Kazantsev	2017-08-31	1	-0/+182
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current implementation of parseLoopStructure interprets the latch comparison as a comparison against `iv.next`. If the actual comparison is made against the current value of `iv`, then the loop may be rejected, because this misinterpretation leads to incorrect evaluation of the latch start value. This patch teaches IRCE to distinguish this kind of loop and perform the optimization for it. Now we use an `IndVarBase` variable which can be either the next or the current value of the induction variable (previously we used `IndVarNext`, which was always the value on the next iteration). Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312221
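The difference between the two latch shapes mentioned above can be seen in a small source-level sketch. It is only an illustration of why treating one shape as the other gives an off-by-one result, not a model of IRCE itself; the function names are hypothetical.

```cpp
// Latch compares the *next* IV value: keep looping while (iv + 1) < n.
inline int tripsLatchOnNext(int N) {
  int Trips = 0;
  int IV = 0;
  bool KeepGoing;
  do {
    ++Trips;
    int IVNext = IV + 1;
    KeepGoing = IVNext < N; // comparison against iv.next
    IV = IVNext;
  } while (KeepGoing);
  return Trips;
}

// Latch compares the *current* IV value: keep looping while iv < n.
inline int tripsLatchOnCurrent(int N) {
  int Trips = 0;
  int IV = 0;
  bool KeepGoing;
  do {
    ++Trips;
    KeepGoing = IV < N; // comparison against the current iv
    IV = IV + 1;
  } while (KeepGoing);
  return Trips;
}
```

With the same bound, the second loop runs one more iteration than the first, so an analysis that assumes the latch always compares `iv.next` would compute the wrong range for loops of the second shape.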
*	Revert r312194: "[MachineOutliner] Add missed optimization remarks for the ↵	Daniel Jasper	2017-08-31	1	-73/+0
\| \| \| \| \| \| \| \| \|	outliner." Breaks on buildbot: http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/6026/steps/test_llvm/logs/LLVM%20%3A%3A%20CodeGen__AArch64__machine-outliner-remarks.ll llvm-svn: 312219
*	Temporarily revert "Update branch coalescing to be a PowerPC specific pass"	Eric Christopher	2017-08-31	2	-44/+21
\| \| \| \| \| \| \| \|	From comments and code review it wasn't intended to be enabled by default yet. This reverts commit r311588. llvm-svn: 312214