bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	X86AsmParser AVX-512: Return error instead of hitting assert	Craig Topper	2019-02-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	When parsing a sequence of tokens beginning with {, it will hit an assert and crash if the token afterwards is not an identifier. Instead of this, return a more verbose error as seen elsewhere in the function. Patch by Brandon Jones (BrandonTJones) Differential Revision: https://reviews.llvm.org/D57375 llvm-svn: 354356
*	[X86] Filter out tuning feature flags and a few ISA feature flags when ↵	Craig Topper	2019-02-19	2	-4/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	checking for function inline compatibility. Tuning flags don't have any effect on the available instructions so aren't a good reason to prevent inlining. There are also some ISA flags that don't have any intrinsics or ABI requirements that we can exclude. I've put only the most basic ones like cmpxchg16b and lahfsahf. These are interesting because they aren't present in all 64-bit CPUs, but we have codegen workarounds when they aren't present. Loosening these checks can help with scenarios where a caller has a more specific CPU than a callee. The default tuning flags on our generic 'x86-64' CPU can currently make it inline incompatible with other CPUs. I've also added an example test for 'nocona' and 'prescott' where 'nocona' is just a 64-bit capable version of 'prescott' but in 32-bit mode they should be completely compatible. I've based the implementation here on the similar code in AMDGPU. Differential Revision: https://reviews.llvm.org/D58371 llvm-svn: 354355
*	GlobalISel: Implement moreElementsVector for select	Matt Arsenault	2019-02-19	2	-18/+21
\| \| \| \|	llvm-svn: 354354
*	GlobalISel: Implement moreElementsVector for G_EXTRACT source	Matt Arsenault	2019-02-19	2	-0/+8
\| \| \| \|	llvm-svn: 354348
*	[X86][AVX] Update VBROADCAST folds to always use v2i64 X86vzload	Simon Pilgrim	2019-02-19	2	-3/+3
\| \| \| \| \| \| \| \|	The VBROADCAST combines and SimplifyDemandedVectorElts improvements mean that we now more consistently use shorter (128-bit) X86vzload input operands. Follow up to D58053 llvm-svn: 354346
*	GlobalISel: Implement moreElementsVector for bit ops	Matt Arsenault	2019-02-19	2	-0/+60
\| \| \| \|	llvm-svn: 354345
*	[yaml2obj][obj2yaml] Remove section type range markers from allowed mappings ↵	James Henderson	2019-02-19	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and support hex values yaml2obj/obj2yaml previously supported SHT_LOOS, SHT_HIOS, and SHT_LOPROC for section types. These are simply values that delineate a range and don't really make sense as valid values. For example if a section has type value 0x70000000, obj2yaml shouldn't print this value as SHT_LOPROC. Additionally, this was missing the three other range markers (SHT_HIPROC, SHT_LOUSER and SHT_HIUSER). This change removes these three range markers. It also adds support for specifying the type as an integer, to allow section types that LLVM doesn't know about. Reviewed by: grimar Differential Revision: https://reviews.llvm.org/D58383 llvm-svn: 354344
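The behaviour described in this commit (named section types only for real types, a hex fallback for everything else) can be sketched as follows. This is an illustrative Python model, not the yaml2obj/obj2yaml implementation: the function names and the small name table are hypothetical, and only the numeric values of the four listed section types are taken from the ELF specification.

```python
# Hypothetical, deliberately tiny table of real section types.
# Range markers such as SHT_LOPROC are intentionally absent.
KNOWN_SECTION_TYPES = {
    0x0: "SHT_NULL",
    0x1: "SHT_PROGBITS",
    0x2: "SHT_SYMTAB",
    0x3: "SHT_STRTAB",
}


def format_section_type(value: int) -> str:
    """Dump direction: use the name if known, otherwise print hex."""
    name = KNOWN_SECTION_TYPES.get(value)
    return name if name is not None else f"0x{value:X}"


def parse_section_type(text: str) -> int:
    """Parse direction: accept a known name or any integer literal."""
    for value, name in KNOWN_SECTION_TYPES.items():
        if name == text:
            return value
    # int(text, 0) accepts decimal and 0x-prefixed hex; unknown names raise.
    return int(text, 0)
```

With this model, a section of type 0x70000000 round-trips as the string `0x70000000` instead of being labelled with a range marker name.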
*	Cast from SDValue directly instead of superfluous getNode(). NFCI.	Simon Pilgrim	2019-02-19	1	-2/+2
\| \| \| \|	llvm-svn: 354343
*	GlobalISel: Verify g_insert	Matt Arsenault	2019-02-19	1	-0/+24
\| \| \| \|	llvm-svn: 354342
*	[X86][AVX] EltsFromConsecutiveLoads - Add BROADCAST lowering support	Simon Pilgrim	2019-02-19	1	-3/+70
\| \| \| \| \| \| \| \| \| \|	This patch adds scalar/subvector BROADCAST handling to EltsFromConsecutiveLoads. It mainly shows codegen changes to 32-bit code which failed to handle i64 loads, although 64-bit code is also using this new path to more efficiently combine to a broadcast load. Differential Revision: https://reviews.llvm.org/D58053 llvm-svn: 354340
*	[yaml2obj][obj2yaml] - Support SHT_GNU_versym (.gnu.version) section.	George Rimar	2019-02-19	1	-0/+10
\| \| \| \| \| \| \| \| \|	This patch adds support for parsing/dumping the .gnu.version section. Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symversion.html#SYMVERTBL Differential revision: https://reviews.llvm.org/D58280 llvm-svn: 354338
*	Recommit r354328, r354329 "[obj2yaml][yaml2obj] - Add support of ↵	George Rimar	2019-02-19	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	parsing/dumping of the .gnu.version_r section." Fix: Replace assert(!IO.getContext() && "The IO context is initialized already"); with assert(IO.getContext() && "The IO context is not initialized"); (this was introduced in r354329, where I tried to quickfix the darwin BB and seems copypasted the assert from the wrong place). Original commit message: The section is described here: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html Patch just teaches obj2yaml/yaml2obj to dump and parse such sections. We did the finalization of string tables very late, and I had to move the logic to make it a bit earlier. That was needed in this patch since .gnu.version_r adds strings to .dynstr. This might also be useful for implementing other special sections. Everything else changed in this patch seems to be straightforward. Differential revision: https://reviews.llvm.org/D58119 llvm-svn: 354335
*	[RISCV][NFC] Move some std::string to StringRef	Alex Bradbury	2019-02-19	3	-5/+5
\| \| \| \|	llvm-svn: 354333
*	Revert r354328, r354329 "[obj2yaml][yaml2obj] - Add support of ↵	George Rimar	2019-02-19	1	-30/+0
\| \| \| \| \| \| \| \| \|	parsing/dumping of the .gnu.version_r section." Something went wrong. Bots are unhappy: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/44113/steps/test/logs/stdio llvm-svn: 354332
*	Fix BB after r354328.	George Rimar	2019-02-19	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bot: http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/30188/steps/build_Lld/logs/stdio Error: /Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:1013:15: error: unused variable 'Object' [-Werror,-Wunused-variable] const auto Object = static_cast<ELFYAML::Object >(IO.getContext()); ^ /Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:1023:15: error: unused variable 'Object' [-Werror,-Wunused-variable] const auto Object = static_cast<ELFYAML::Object >(IO.getContext()); Fix: change const auto Object = static_cast<ELFYAML::Object >(IO.getContext()); assert(Object && "The IO context is not initialized"); to assert(!IO.getContext() && "The IO context is initialized already"); llvm-svn: 354329
*	[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r ↵	George Rimar	2019-02-19	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	section. The section is described here: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html Patch just teaches obj2yaml/yaml2obj to dump and parse such sections. We did the finalization of string tables very late, and I had to move the logic to make it a bit earlier. That was needed in this patch since .gnu.version_r adds strings to .dynstr. This might also be useful for implementing other special sections. Everything else changed in this patch seems to be straightforward. Differential revision: https://reviews.llvm.org/D58119 llvm-svn: 354328
*	[NFC] API for signaling that the current loop is being deleted	Max Kazantsev	2019-02-19	1	-9/+30
\| \| \| \| \| \| \|	We are planning to be able to delete the current loop in LoopSimplifyCFG in the future. Add API to notify the loop pass manager that it happened. llvm-svn: 354314
*	[NFC] Store loop header in a local to keep it available after the loop is ↵	Max Kazantsev	2019-02-19	1	-11/+9
\| \| \| \| \| \|	deleted llvm-svn: 354313
*	[ARM GlobalISel] Support G_PHI for Thumb2	Diana Picus	2019-02-19	1	-5/+5
\| \| \| \| \| \|	Same as arm mode. llvm-svn: 354310
*	[X86] Remove command line strings from the ProcIntel* features.	Craig Topper	2019-02-19	1	-10/+10
\| \| \| \| \| \|	These should always follow the CPU string. There's no reason to control them independently. llvm-svn: 354304
*	[GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsics	Jessica Paquette	2019-02-18	2	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Legalize/select llvm.ctlz.* Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and arm64-vclz.ll to show that we perform valid transformations in optimized builds, and document where GISel can improve. Differential Revision: https://reviews.llvm.org/D58155 llvm-svn: 354299
*	[CGP] form usub with overflow from sub+icmp	Sanjay Patel	2019-02-18	3	-13/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487: https://bugs.llvm.org/show_bug.cgi?id=31754 https://bugs.llvm.org/show_bug.cgi?id=40487 ..and those are shown in the IR test file and x86 codegen file. Matching the usubo pattern is harder than uaddo because we have 2 independent values rather than a def-use. This adds a TLI hook that should preserve the existing behavior for uaddo formation, but disables usubo formation by default. Only x86 overrides that setting for now although other targets will likely benefit by forming usubo too. Differential Revision: https://reviews.llvm.org/D57789 llvm-svn: 354298
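As a rough illustration of what the sub+icmp pair computes, the sketch below models an unsigned subtract with overflow in Python. The function name and the `bits` parameter are assumptions made for this example; it models the arithmetic only and says nothing about how CodeGenPrepare matches the pattern.

```python
def usub_with_overflow(x: int, y: int, bits: int = 32):
    """Model of an unsigned subtract that also reports a borrow.

    Returns (difference modulo 2**bits, overflow flag). The flag is the
    result of the unsigned compare x < y, i.e. the 'icmp' half of the
    sub+icmp pair.
    """
    mask = (1 << bits) - 1
    x &= mask
    y &= mask
    diff = (x - y) & mask   # the 'sub'
    overflow = x < y        # the 'icmp ult'
    return diff, overflow
```

Note that the compare uses the two original operands, not the difference, which is the "2 independent values rather than a def-use" point made in the commit message.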
*	AMDGPU: Use MachineInstr::mayAlias to replace ↵	Changpeng Fang	2019-02-18	2	-21/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	areMemAccessesTriviallyDisjoint in LoadStoreOptimizer pass. Summary: This is to fix a memory dependence bug in LoadStoreOptimizer. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58295 llvm-svn: 354295
*	GlobalISel: Implement widenScalar for g_extract scalar results	Matt Arsenault	2019-02-18	2	-8/+51
\| \| \| \|	llvm-svn: 354293
*	GlobalISel: Make buildExtract use DstOp/SrcOp	Matt Arsenault	2019-02-18	1	-12/+15
\| \| \| \|	llvm-svn: 354292
*	GlobalISel: Fix double count of offset for irregular vector breakdowns	Matt Arsenault	2019-02-18	1	-1/+0
\| \| \| \| \| \| \|	Fixes cases with odd vectors that break into multiple requested size pieces. llvm-svn: 354280
*	[x86] split more v8f32/v8i32 shuffles in lowering	Sanjay Patel	2019-02-18	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to D57867 - this is a small patch with lots of test diffs. With half-vector-width narrowing potential, using an extract + 128-bit vshufps is a win because it replaces a 256-bit shuffle with a 128-bit shuffle. This seems like it should be a win even for targets with 'fast-variable-shuffle', but we are intentionally deferring that to an independent change to make sure that is true. Differential Revision: https://reviews.llvm.org/D58181 llvm-svn: 354279
*	Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op"	Sanjay Patel	2019-02-18	1	-28/+17
\| \| \| \| \| \| \|	This reverts commit 079b610c29b4a428b3ae7b64dbac0378facf6632. Bots are failing after this change on a stage 2 compile of clang. llvm-svn: 354277
*	[InstCombine] reduce even more unsigned saturated add with 'not' op	Sanjay Patel	2019-02-18	1	-17/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354276
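The equivalence claimed by the Alive proofs in this commit can be spot-checked by brute force at a small bit width. The sketch below is only a sanity check written for this log, assuming 8-bit values; `before` and `after` are hypothetical names for the two sides of the "uaddsat, -1 fval" pattern.

```python
BITS = 8
MASK = (1 << BITS) - 1  # all-ones value, i.e. -1 as an unsigned integer


def before(x: int, y: int) -> int:
    """Original form: compare ~x against y, select the sum or -1."""
    notx = x ^ MASK
    a = (x + y) & MASK
    return a if notx > y else MASK


def after(x: int, y: int) -> int:
    """Transformed form: compare y against the sum, no 'not' needed."""
    a = (x + y) & MASK
    return MASK if y > a else a


def patterns_agree() -> bool:
    """Exhaustively compare both forms over all 8-bit operand pairs."""
    values = range(MASK + 1)
    return all(before(x, y) == after(x, y) for x in values for y in values)
```

The second pattern in the commit ("-1 fval + ult") only swaps the operands of the original compare, so the same `before` function models it as well.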
*	[MCA] Correctly update register definitions in the PRF after move elimination.	Andrea Di Biagio	2019-02-18	1	-14/+9
\| \| \| \| \| \| \| \| \| \|	This patch fixes a bug where register writes performed by optimizable register moves were sometimes wrongly treated like partial register updates. Before this patch, llvm-mca wrongly predicted a 1.50 IPC for test reg-move-elimination-6.s (added by this patch). With this patch, llvm-mca correctly updates the register definitions in the PRF, and the IPC for that test is now correctly reported as 2. llvm-svn: 354271
*	[MCA] Slightly refactor method writeStartEvent in WriteState and ReadState. NFCI	Andrea Di Biagio	2019-02-18	3	-13/+13
\| \| \| \| \| \| \|	This is another change in preparation for PR37494. No functional change intended. llvm-svn: 354261
*	Revert r354244 "[DAGCombiner] Eliminate dead stores to stack."	Clement Courbet	2019-02-18	5	-209/+45
\| \| \| \| \| \|	Breaks some bots. llvm-svn: 354245
*	[DAGCombiner] Eliminate dead stores to stack.	Clement Courbet	2019-02-18	5	-45/+209
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A store to an object whose lifetime is about to end can be removed. See PR40550 for motivation. Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57541 llvm-svn: 354244
*	[MC] Make SubtargetFeatureKV only store one FeatureBitset and use an ↵	Craig Topper	2019-02-18	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \|	'unsigned' to hold the value. This class is used for two different tablegen generated tables. For one of the tables the Value FeatureBitset only has one bit set. For the other usage the Implies field was unused. This patch changes the Value field to just be an unsigned. For the usage that put a real vector in bitset, we now use the previously unused Implies field and leave the Value field unused instead. This is good for a 16K reduction in the size of llc on my local build with all targets enabled. llvm-svn: 354243
*	[LLVM-C] Add bindings to create enumerators	Robert Widmann	2019-02-17	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The C API doesn't have the bindings to create enumerators, needed to create an enumeration. Reviewers: whitequark, CodaFi, harlanhaskins, deadalnix Reviewed By: whitequark, CodaFi, harlanhaskins Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58323 llvm-svn: 354237
*	[X86] In FP_TO_INTHelper, when moving data from SSE register to X87 register ↵	Craig Topper	2019-02-17	1	-13/+9
\| \| \| \| \| \| \| \| \| \|	file via the stack, use the same stack slot we use for the integer conversion. No need for a separate stack slot. The lifetimes don't overlap. Also fix the MachinePointerInfo for the final load after the integer conversion to indicate it came from the stack slot. llvm-svn: 354234
*	[NFC] Teach getInnermostLoopFor walk up the loop trees	Max Kazantsev	2019-02-17	1	-6/+10
\| \| \| \| \| \| \|	This should be NFC in current use case of this method, but it will help to use it for solving more complex tasks in follow-up patches. llvm-svn: 354227
*	[SelectionDAG] Extract [US]MULO expansion into TL method; NFC	Nikita Popov	2019-02-17	2	-148/+124
\| \| \| \| \| \| \| \| \| \| \| \|	In preparation for supporting vector expansion. Add an isPostTypeLegalization flag to makeLibCall(), because this expansion relies on the legalized form using MERGE_VALUES. Drop the corresponding variant of ExpandLibCall, which is no longer used. Differential Revision: https://reviews.llvm.org/D58006 llvm-svn: 354226
*	[InstCombine] reduce more unsigned saturated add with 'not' op	Sanjay Patel	2019-02-17	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: not op %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %notx, %y %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: not op ugt %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %y, %notx %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/niom (The matching here is still incomplete.) llvm-svn: 354224
*	[InstCombine] reduce unsigned saturated add with 'not' op	Sanjay Patel	2019-02-17	1	-11/+28
\| \| \| \| \| \| \| \| \| \| \| \| \|	We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 (The matching here is incomplete. Trying to take minimal steps to make sure we don't induce infinite looping from existing canonicalizations of the 'select'.) llvm-svn: 354221
*	[NFC] Fix name and clarifying comment for factored-out function	Max Kazantsev	2019-02-17	1	-4/+5
\| \| \| \|	llvm-svn: 354220
*	[NFC] Factor out a function for future reuse	Max Kazantsev	2019-02-17	1	-8/+15
\| \| \| \|	llvm-svn: 354218
*	[LLVMSupport]: Remove a severely outdated README.	Kristina Brooks	2019-02-17	1	-43/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The LLVM Support library implementation has resided in //llvm/lib/Support for a significant amount of time now, with documentation having been updated with all references to the "System library" being replaced with "Support library". Since this file mirrors already existing documentation available for Support library, includes dead links to documentation and still refers to it as "System library", having it there is confusing and updating it has very little point as it duplicates information in documentation, except documentation is a lot more up to date while this file has not been maintained. Up to date documentation concerning this can be found here: http://llvm.org/docs/SupportLibrary.html llvm-svn: 354209
*	[X86] When type legalizing the result of a i64 fp_to_uint on 32-bit targets. ↵	Craig Topper	2019-02-16	1	-27/+10
\| \| \| \| \| \| \| \| \| \|	Generate all of the ops as i64 and let them be legalized. No need to manually split everything. We can let the type legalizer work for us. The test change seems to be caused by some DAG ordering issue that was previously circumventing a one use check in LowerSELECT where FP selects are turned into blends if the setcc has one use. But it was running after an integer select and the same setcc had been legalized to cmov and X86ISD::CMP. This dropped the use count of the setcc, but wasn't what was intended. llvm-svn: 354197
*	[X86] Don't prevent load folding for cvtsi2ss/cvtsi2sd based on ↵	Craig Topper	2019-02-16	1	-39/+48
\| \| \| \| \| \| \| \| \| \|	hasPartialRegUpdate. Preventing the load fold won't fix the partial register update since the input we can fold is a GPR. So it will do nothing to prevent a false dependency on an XMM register. llvm-svn: 354193
*	[EarlyCSE & MSSA] Cap the clobbering calls in EarlyCSE.	Alina Sbirlea	2019-02-15	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Unlimited number of calls to getClobberingAccess can lead to high compile times in pathological cases. Limiting getClobberingAccess to a fairly high number. Can be adjusted based on users/need. Note: this is the only user of MemorySSA currently enabled by default. The same handling exists in LICM (disabled atm). As MemorySSA gains more users, this logic of capping will need to move inside MemorySSA. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D58248 llvm-svn: 354182
*	[X86] Don't set exception mask bits when modifying FPCW to change rounding ↵	Craig Topper	2019-02-15	1	-14/+24
\| \| \| \| \| \| \| \| \| \| \| \|	mode for fp->int conversion When we need to do an fp->int conversion using x87 instructions, we need to temporarily change the rounding mode to 0b11 and perform a store. To do this we save the old value of the fpcw to the stack, then set the fpcw to 0xc7f, do the store, then restore fpcw. But the 0xc7f value forces the exception mask bits to 1. While this is what they would be in the default FP environment, as we move to support changing the FP environments, we shouldn't make this assumption. This patch changes the code to explicitly OR 0xc00 with the old value so that only the rounding mode is changed. Unfortunately, this requires two stack temporaries instead of one. One to hold the old value and one to hold the new value. Without two stack temporaries we would need an additional GPR. We already need one to do the OR operation in. This is similar to what gcc and icc do for this operation. Though they are both better at reusing the stack temporaries when there are multiple truncates in a function (or at least in a basic block). Differential Revision: https://reviews.llvm.org/D57788 llvm-svn: 354178
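The difference between writing the fixed value 0xc7f and OR-ing 0xc00 into the saved control word comes down to a few lines of bit arithmetic. The following is a sketch with hypothetical constant and function names; it models the 16-bit x87 control word as a plain integer, with the exception mask bits in bits 0-5 and the rounding-control field in bits 10-11.

```python
RC_TRUNCATE = 0x0C00      # rounding-control field set to 0b11 (truncate)
EXCEPTION_MASKS = 0x003F  # bits 0-5: the six exception mask bits
OLD_FIXED_VALUE = 0x0C7F  # constant the code wrote before this patch


def fpcw_for_truncation(old_fpcw: int) -> int:
    """New behaviour: change only the rounding mode, keep everything else."""
    return (old_fpcw | RC_TRUNCATE) & 0xFFFF


def masks_preserved(old_fpcw: int) -> bool:
    """True if the exception mask bits survive the rounding-mode change."""
    new_fpcw = fpcw_for_truncation(old_fpcw)
    return (new_fpcw & EXCEPTION_MASKS) == (old_fpcw & EXCEPTION_MASKS)
```

For example, a control word of 0x037E (the default 0x037F with the invalid-operation exception unmasked) becomes 0x0F7E with the OR approach, so bit 0 stays clear, whereas the fixed 0xc7f value would set it again.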
*	[InstCombine] Address a couple stylistic issues pointed out by reviewer [NFC]	Philip Reames	2019-02-15	1	-6/+6
\| \| \| \| \| \|	Better addressing comments from https://reviews.llvm.org/D58290. llvm-svn: 354171
*	[InstCombine] Convert atomicrmws to xchg or store where legal	Philip Reames	2019-02-15	1	-15/+59
\| \| \| \| \| \| \| \| \| \|	Implement two more transforms of atomicrmw: 1) We can convert an atomicrmw which produces a known value in memory into an xchg instead. 2) We can convert an atomicrmw xchg w/o users into a store for some orderings. Differential Revision: https://reviews.llvm.org/D58290 llvm-svn: 354170
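The first transform relies on the idea that some atomicrmw operations leave the same value in memory regardless of what was there before. The sketch below models that idea by brute force over 8-bit values. It is purely illustrative: the operation table and function name are assumptions made for this example, and InstCombine does not enumerate values like this.

```python
BITS = 8
MASK = (1 << BITS) - 1

# Hypothetical model of a few read-modify-write operations on unsigned values.
RMW_OPS = {
    "add": lambda old, v: (old + v) & MASK,
    "and": lambda old, v: old & v,
    "or": lambda old, v: old | v,
    "xor": lambda old, v: old ^ v,
    "umax": lambda old, v: max(old, v),
    "umin": lambda old, v: min(old, v),
}


def known_result(op: str, operand: int):
    """Return the value always stored by `op`, or None if it depends on memory.

    When a value is returned, the operation stores that value no matter what
    was in memory, so it behaves like an exchange with that value.
    """
    results = {RMW_OPS[op](old, operand & MASK) for old in range(MASK + 1)}
    return results.pop() if len(results) == 1 else None
```

In this model `known_result("or", 0xFF)` and `known_result("and", 0)` each yield a single value, while `known_result("add", 1)` yields `None` because the stored value depends on the old contents.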
*	[X86] Fix LowerAsmOutputForConstraint.	Nirav Dave	2019-02-15	4	-11/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Update Flag when generating cc output. Fixes PR40737. Reviewers: rnk, nickdesaulniers, craig.topper, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58283 llvm-svn: 354163