path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [WebAssembly][WIP] Expand operations not supported by SIMD | Thomas Lively | 2019-03-02 | 1 | -0/+17
    llvm-svn: 355247

* [tblgen] Track CodeInit origins when possible | Daniel Sanders | 2019-03-02 | 2 | -8/+22
    Summary:
    Add an SMLoc to CodeInit that records the source line it originated from.
    This allows tablegen to point precisely at portions of code when reporting
    errors within the CodeInit. For example, in the upcoming GlobalISel
    combiner, it can report undefined expansions and point at the instance of
    the expansion. This is achieved using something like:

        SMLoc::getFromPointer(SMLoc::getPointer() + (StringRef - CodeInit::getValue()))

    The location is lost when producing a CodeInit by string concatenation, so
    a fallback SMLoc is required (e.g. the Record::getLoc()), but that's pretty
    rare for CodeInits.

    There's a reasonable case for extending tracking to a couple of other Init
    objects; for example, StringInits are often parsed, and it would be good to
    point inside the string when reporting errors about them. However, location
    tracking also harms de-duplication. This is fine for CodeInit, where there
    are only a few hundred of them (~160 for X86), and it may be worth it for
    StringInit (~86k up to ~1.9M, for a roughly 15MB increase, for X86).
    However, origin tracking would be a _terrible_ idea for IntInit, BitInit,
    and UnsetInit. I haven't measured any of those three, but BitInit would
    most likely be on the order of increasing the current 2 BitInit values up
    to billions.

    Reviewers: volkan, aditya_nandakumar, bogner, paquette, aemerson

    Reviewed By: paquette

    Subscribers: javed.absar, kristof.beyls, dexonsmith, llvm-commits, kristina

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58141

    llvm-svn: 355245

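    To make the quoted arithmetic concrete, here is a minimal C++ sketch of
    how a location inside a CodeInit's value could be computed; the helper
    name and signature are illustrative, not the actual D58141 code:

        #include <cstddef>
        #include "llvm/ADT/StringRef.h"
        #include "llvm/Support/SMLoc.h"

        // Given the SMLoc recorded for a CodeInit's value and a StringRef that
        // points into that value, compute a precise SMLoc for the fragment.
        static llvm::SMLoc locForFragment(llvm::SMLoc CodeLoc, llvm::StringRef Code,
                                          llvm::StringRef Fragment) {
          // Only valid while Fragment points into Code's underlying buffer.
          std::ptrdiff_t Offset = Fragment.data() - Code.data();
          return llvm::SMLoc::getFromPointer(CodeLoc.getPointer() + Offset);
        }
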
* Try to fix Windows bots after r355226. | Paul Robinson | 2019-03-01 | 1 | -1/+2
    Windows has two path separator characters.

    llvm-svn: 355235

* [DWARFFormValue] Cleanup DWARFFormValue interface. (2/2) (NFC) | Jonas Devlieghere | 2019-03-01 | 4 | -21/+22
    Continues the work started in r354941 by changing all but one use of
    extractValue to the static createFromData.

    llvm-svn: 355233

* [DWARF] Make -g with empty assembler source work better. | Paul Robinson | 2019-03-01 | 3 | -12/+47
    This was sometimes causing clang or llvm-mc to crash, and in other cases
    could emit a bogus DWARF line-table header. I did an interim patch in
    r352541; this patch should be a cleaner and more complete fix, and retains
    the test.

    Addresses PR40538.

    Differential Revision: https://reviews.llvm.org/D58750

    llvm-svn: 355226

* [TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV | Craig Topper | 2019-03-01 | 5 | -59/+59
    Remove bitcasts from X86 patterns that are no longer necessary.

    Previously we had build_vector PatFrags that called
    ISD::isBuildVectorAllZeros/Ones. Internally, ISD::isBuildVectorAllZeros/Ones
    look through bitcasts, but we aren't able to take advantage of that in
    isel. Instead, we have to canonicalize the types of the all-zeros/ones
    build_vectors and insert bitcasts, and then pattern match those exact
    bitcasts.

    By emitting specific matchers for these 2 nodes, we can make isel look
    through any bitcasts without needing to explicitly match them. We should
    also be able to remove the canonicalization to vXi32 from lowering, but
    I've left that for a follow-up.

    This removes something like 40,000 bytes from the X86 isel table.

    Differential Revision: https://reviews.llvm.org/D58595

    llvm-svn: 355224

* [ValueTracking] Known bits support for unsigned saturating add/sub | Nikita Popov | 2019-03-01 | 1 | -0/+31
    We have two sources of known bits:

    1. For adds, leading ones of either operand are preserved. For subs,
       leading zeros of the LHS and leading ones of the RHS become leading
       zeros in the result.
    2. The saturating math is a select between add/sub and an all-ones/zero
       value. As such, we can carry out the add/sub known bits calculation and
       only preserve the known one/zero bits respectively.

    Differential Revision: https://reviews.llvm.org/D58329

    llvm-svn: 355223

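    As a rough C++ sketch of the rules above for uadd.sat (a simplified,
    conservative illustration, not the verbatim ValueTracking change):

        #include <algorithm>
        #include "llvm/Support/KnownBits.h"

        // Conservative subset of the logic: uadd.sat never wraps and can only
        // saturate toward all-ones, so the result is always >= each operand,
        // and the leading ones of either operand survive into the result.
        static llvm::KnownBits knownBitsForUAddSat(const llvm::KnownBits &LHS,
                                                   const llvm::KnownBits &RHS) {
          llvm::KnownBits Known(LHS.getBitWidth());
          unsigned LeadOnes = std::max(LHS.countMinLeadingOnes(),
                                       RHS.countMinLeadingOnes());
          Known.One.setHighBits(LeadOnes);
          return Known;
        }
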
* [InstCombine] Extend saturating idempotent atomicrmw transform to FP | Philip Reames | 2019-03-01 | 1 | -2/+10
    I'm assuming that the NaN propagation logic for InstructionSimplify's
    handling of fadd and fsub is correct, and applying the same to atomicrmw.

    Differential Revision: https://reviews.llvm.org/D58836

    llvm-svn: 355222

* [InstCombine] move add after umin/umax | Sanjay Patel | 2019-03-01 | 1 | -0/+27
    In the motivating cases from PR14613
    (https://bugs.llvm.org/show_bug.cgi?id=14613), moving the add enables us to
    narrow the min/max, which eliminates zext/trunc, which enables
    significantly better vectorization. But that bug is still not completely
    fixed.

    https://rise4fun.com/Alive/5KQ

        Name: umax
        Pre: C1 u>= C0
        %a = add nuw i8 %x, C0
        %cond = icmp ugt i8 %a, C1
        %r = select i1 %cond, i8 %a, i8 C1
        =>
        %c2 = icmp ugt i8 %x, C1-C0
        %u2 = select i1 %c2, i8 %x, i8 C1-C0
        %r = add nuw i8 %u2, C0

        Name: umin
        Pre: C1 u>= C0
        %a = add nuw i32 %x, C0
        %cond = icmp ult i32 %a, C1
        %r = select i1 %cond, i32 %a, i32 C1
        =>
        %c2 = icmp ult i32 %x, C1-C0
        %u2 = select i1 %c2, i32 %x, i32 C1-C0
        %r = add nuw i32 %u2, C0

    llvm-svn: 355221

* Revert "[MIPS GlobalISel] Fix mul operands"Vlad Tsyrklevich2019-03-011-4/+0
| | | | | | | This reverts commit r355178, it is causing ASan failures on the sanitizer bots. llvm-svn: 355219
* [LICM] Infer proper alignment from loads during scalar promotion | Philip Reames | 2019-03-01 | 1 | -3/+23
    This patch fixes an issue where we would compute an unnecessarily small
    alignment during scalar promotion when no store is guaranteed to execute,
    but we've proven load speculation safety. Since speculating a load requires
    proving the existing alignment is valid at the new location (see
    Loads.cpp), we can use the alignment fact from the load.

    For non-atomics, this is a performance problem. For atomics, this is a
    correctness issue, though an *incredibly* rare one to see in practice. For
    atomics, we might not be able to lower an improperly aligned load or store
    (i.e. i32 align 1). If such an instruction makes it all the way to codegen,
    we *may* fail to codegen the operation, or we may simply generate a slow
    call to a library function. The part that makes this super hard to see in
    practice is that the memory location actually *is* well aligned, and
    instcombine knows that. So, to see a failure, you have to a) hit the bug in
    LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid
    fixing the alignment, and c) then have generated an instruction which fails
    codegen rather than simply emitting a slow libcall. All around, pretty hard
    to hit.

    Differential Revision: https://reviews.llvm.org/D58809

    llvm-svn: 355217

* [InstCombine] Extend "idempotent" atomicrmw optimizations to floating pointPhilip Reames2019-03-011-2/+17
| | | | | | | | | | An idempotent atomicrmw is one that does not change memory in the process of execution. We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR. Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load. As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future. Differential Revision: https://reviews.llvm.org/D58251 llvm-svn: 355210
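    The floating point identities involved can be sketched as follows (an
    assumed shape for illustration, not the actual InstCombine code):

        #include "llvm/IR/Constants.h"
        #include "llvm/IR/Instructions.h"
        using namespace llvm;

        // The constant C for which "atomicrmw <op> %p, C" never changes memory.
        static Constant *idempotentFPOperand(AtomicRMWInst::BinOp Op, Type *Ty) {
          switch (Op) {
          case AtomicRMWInst::FAdd: // X + -0.0 == X for all X, including -0.0
            return ConstantFP::get(Ty, -0.0);
          case AtomicRMWInst::FSub: // X - +0.0 == X for all X
            return ConstantFP::get(Ty, +0.0);
          default:
            return nullptr; // integer cases handled elsewhere
          }
        }
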
* Revert "[WebAssembly] Lower SIMD shifts since they are fixed in V8"Thomas Lively2019-03-011-0/+5
| | | | | | They weren't fixed in V8. Oops. llvm-svn: 355208
* Hide two unused debugging methods, NFCI. | Jonas Hahnfeld | 2019-03-01 | 2 | -0/+8
    GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and
    StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in
    Release builds. Hide them behind 'ifndef NDEBUG'.

    llvm-svn: 355205

* Try to fix NetBSD buildbot breakage introduced in D57463. | Manman Ren | 2019-03-01 | 1 | -0/+1
    By including the header file in the source.

    llvm-svn: 355202

* [ARM] Fix FP16 stack loads/stores for Thumb2 with frame pointer | Oliver Stannard | 2019-03-01 | 1 | -2/+2
    The new addressing mode added for the v8.2A FP16 instructions uses bit 8 of
    the immediate to encode the sign of the offset, like the other FP
    loads/stores, so it needs to be treated the same way.

    Differential revision: https://reviews.llvm.org/D58816

    llvm-svn: 355201

* [ARM] Consider undefined-on-NaN conditions in checkVSELConstraints | Oliver Stannard | 2019-03-01 | 1 | -5/+6
    This function was not checking for the condition code variants which are
    undefined if either input is NaN, so we were missing selection of the VSEL
    instruction in some cases when using -fno-honor-nans or -ffast-math.

    Differential revision: https://reviews.llvm.org/D58812

    llvm-svn: 355199

* [yaml2obj] - Allow setting custom sh_info for RawContentSection sections. | George Rimar | 2019-03-01 | 1 | -0/+1
    This is for tweaking SHT_SYMTAB sections. Their sh_info usually contains
    the number of symbols + 1. But for creating invalid inputs for test cases,
    it would be convenient to allow explicitly overriding this field from
    YAML.

    Differential revision: https://reviews.llvm.org/D58779

    llvm-svn: 355193

* [ARM GlobalISel] Support G_CTLZ for Thumb2 | Diana Picus | 2019-03-01 | 1 | -7/+0
    Same as ARM mode, but with a different opcode.

    llvm-svn: 355191

* [Tablegen] Add support for the !mul operator. | Nicola Zaghen | 2019-03-01 | 4 | -5/+15
    This is a small addition to the arithmetic operations that improves the
    expressiveness of the language.

    Differential Revision: https://reviews.llvm.org/D58775

    llvm-svn: 355187

* [CommandLine] Allow grouping options which can have values. | Igor Kudrin | 2019-03-01 | 1 | -21/+24
    This patch allows all forms of values for options to be used at the end of
    a group. With the fix, it is possible to follow more closely the way GNU
    binutils tools handle grouping options. For example, the -j option can be
    used with objdump in any of the following ways:

        $ objdump -d -j .text a.o
        $ objdump -d -j.text a.o
        $ objdump -dj .text a.o
        $ objdump -dj.text a.o

    Differential Revision: https://reviews.llvm.org/D58711

    llvm-svn: 355185

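    For reference, a minimal sketch of declaring such options with the
    CommandLine library (illustrative names, not taken from an actual tool):

        #include "llvm/Support/CommandLine.h"
        using namespace llvm;

        // A boolean -d and a value-taking -j that can both participate in a
        // group, objdump-style: "-dj .text" and "-dj.text" both parse.
        static cl::opt<bool> Disassemble("d", cl::desc("Disassemble"),
                                         cl::Grouping);
        static cl::opt<std::string> Section("j", cl::desc("Operate on section"),
                                            cl::value_desc("section"),
                                            cl::ValueRequired, cl::Grouping);

        int main(int argc, char **argv) {
          cl::ParseCommandLineOptions(argc, argv);
          return 0;
        }
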
* [CommandLine] Do not crash if an option has both ValueRequired and Grouping. | Igor Kudrin | 2019-03-01 | 1 | -4/+7
    If an option which requires a value has a `cl::Grouping` formatting
    modifier, it works well as long as it is used at the end of a group or as a
    separate argument. However, if the option accidentally appears in the
    middle of a group, the program just crashes. This patch prints an error
    message instead.

    Differential Revision: https://reviews.llvm.org/D58499

    llvm-svn: 355184

* [AMDGPU] Mark ds instructions as maybeAtomic | Stanislav Mekhanoshin | 2019-03-01 | 1 | -0/+1
    These were not recognized as potential atomics by the memory legalizer. The
    test was passing not because the legalizer did the right thing, but because
    it skipped all these instructions. When I fixed the DS description, the
    test started to fail because the region address space had changed from 4 to
    2 a while ago.

    Differential Revision: https://reviews.llvm.org/D58802

    llvm-svn: 355179

* [MIPS GlobalISel] Fix mul operands | Petar Avramovic | 2019-03-01 | 1 | -0/+4
    Unsigned mul high for MIPS32 is selected into two PseudoInstructions,
    PseudoMULTu and PseudoMFHI, that use the accumulator register class ACC64
    for some of their operands. Registers in this class have the appropriate
    hi and lo registers as subregisters: $lo0 and $hi0 are subregisters of
    $ac0, etc. The mul instruction implicit-defs $lo0 and $hi0 according to
    MipsInstrInfo.td. In functions where both mul and PseudoMULTu are present,
    fastRegisterAllocator will "run out of registers during register
    allocation" because calcSpillCost for $ac0 returns spillImpossible, since
    subregisters $lo0 and $hi0 of $ac0 are reserved by the mul instruction
    above. A solution is to mark the implicit-defs of $lo0 and $hi0 as dead in
    the mul instruction.

    Differential Revision: https://reviews.llvm.org/D58715

    llvm-svn: 355178

* [MIPS GlobalISel] Select G_UMULH | Petar Avramovic | 2019-03-01 | 3 | -1/+25
    Legalize G_UMULO and select G_UMULH for MIPS32.

    Differential Revision: https://reviews.llvm.org/D58714

    llvm-svn: 355177

* [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM | Fangrui Song | 2019-03-01 | 1 | -2/+2
    Summary:
    ConstIntInfoVec contains elements extracted from the previous function. In
    the new PM, releaseMemory() is not called, and the dangling elements can
    cause a segfault in findConstantInsertionPoint. Rename releaseMemory() to
    cleanup() to convey the idea that it is mandatory, and call cleanup() in
    ConstantHoistingPass::runImpl to fix this.

    Reviewers: ormris, zzheng, dmgreen, wmi

    Reviewed By: ormris, wmi

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58589

    llvm-svn: 355174

* [Subtarget] Remove static global constructor call from the tablegened subtarget feature tables | Craig Topper | 2019-03-01 | 1 | -18/+14
    Subtarget features are stored in a std::bitset that has been subclassed.
    There is a special constructor to allow the tablegen files to provide a
    list of bits to initialize the std::bitset to. This constructor isn't
    constexpr, and std::bitset doesn't support many constexpr operations
    either. This results in a static global constructor being used to
    initialize the feature bitsets in these files at startup.

    To fix this, I've introduced a new FeatureBitArray class that holds three
    64-bit values representing the initial bit values, and taught tablegen to
    emit hex constants for them based on the feature enum values. This makes
    the tablegen files less readable than they were before. I can add the list
    of features back as a comment if we think that's important. I've added a
    method to convert from this class into the std::bitset subclass we had
    before.

    I considered making the new FeatureBitArray class just implement the
    std::bitset interface we need instead, but thought I'd see how others felt
    about that first.

    I've simplified the interfaces to SetImpliedBits and ClearImpliedBits a
    little to minimize the number of times we need to convert to the bitset.

    This removes about 27K from my local release+asserts build of llc.

    Differential Revision: https://reviews.llvm.org/D58520

    llvm-svn: 355167

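    The shape of the new class can be pictured with this sketch, reconstructed
    from the description above rather than copied from the patch:

        #include <bitset>
        #include <cstdint>

        constexpr unsigned MaxSubtargetWords = 3; // 3 x 64 = 192 feature bits

        // Constexpr-constructible, so tablegen can emit three hex literals
        // and no static constructor has to run at startup.
        class FeatureBitArray {
          uint64_t Bits[MaxSubtargetWords];

        public:
          constexpr FeatureBitArray(uint64_t B0, uint64_t B1, uint64_t B2)
              : Bits{B0, B1, B2} {}

          // Convert to the std::bitset-based type when runtime queries need it.
          std::bitset<MaxSubtargetWords * 64> toBitset() const {
            std::bitset<MaxSubtargetWords * 64> Result;
            for (unsigned I = 0; I != MaxSubtargetWords; ++I)
              for (unsigned B = 0; B != 64; ++B)
                if (Bits[I] & (uint64_t(1) << B))
                  Result.set(I * 64 + B);
            return Result;
          }
        };
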
* [WebAssembly] Lower SIMD shifts since they are fixed in V8 | Thomas Lively | 2019-03-01 | 1 | -5/+0
    Reviewers: sbc100

    Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish,
    llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58800

    llvm-svn: 355163

* AMDGPU/GlobalISel: Implement select for G_INSERT | Tom Stellard | 2019-03-01 | 2 | -0/+31
    Re-commit r344310.

    Reviewers: arsenm

    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka,
    kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

    Differential Revision: https://reviews.llvm.org/D53116

    llvm-svn: 355159

* [WebAssembly] Fix crash when @llvm.global_dtors is external | Thomas Lively | 2019-03-01 | 1 | -1/+1
    Reviewers: aheejin

    Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish,
    llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58799

    llvm-svn: 355157

* AMDGPU/GlobalISel: Implement select for G_EXTRACT | Tom Stellard | 2019-02-28 | 3 | -0/+32
    Reviewers: arsenm

    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls,
    dstuttard, tpr, t-tye, llvm-commits

    Differential Revision: https://reviews.llvm.org/D49714

    llvm-svn: 355156

* [PPC] Secure PLT only has meaning for PIC | Joerg Sonnenberger | 2019-02-28 | 1 | -2/+2
    llvm-svn: 355154

* [sancov] Instrument reachable blocks that end in unreachable | Reid Kleckner | 2019-02-28 | 1 | -3/+3
    Summary:
    These sorts of blocks often contain calls to noreturn functions, like
    longjmp, throw, or trap. If they don't end the program, they are
    "interesting" from the perspective of sanitizer coverage, so we should
    instrument them. This was discussed in https://reviews.llvm.org/D57982.

    Reviewers: kcc, vitalybuka

    Subscribers: llvm-commits, craig.topper, efriedma, morehouse, hiraditya

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58740

    llvm-svn: 355152

* dsymutil support for DW_OP_convert | Adrian Prantl | 2019-02-28 | 1 | -3/+37
    Add support for cloning DWARF expressions that contain base type DIE
    references in dsymutil.

    <rdar://problem/48167812>

    Differential Revision: https://reviews.llvm.org/D58534

    llvm-svn: 355148

* [AArch64] [Windows] Don't skip constructing UnwindHelp. | Eli Friedman | 2019-02-28 | 1 | -5/+4
    In certain cases, the first non-frame-setup instruction in a function is a
    branch. For example, it could be a cbz on an argument. Make sure we
    correctly allocate the UnwindHelp, and find an appropriate register to use
    to initialize it.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=40184

    Differential Revision: https://reviews.llvm.org/D58752

    llvm-svn: 355136

* [AArch64] Improve FP16 vector convert from short instructions. | Abderrazek Zaafrani | 2019-02-28 | 1 | -6/+15
    https://reviews.llvm.org/D58563

    llvm-svn: 355134

* Add a module pass for order file instrumentation | Manman Ren | 2019-02-28 | 6 | -0/+225
    The basic idea of the pass is to use a circular buffer to log the execution
    ordering of the functions. We only log a function when it is first
    executed. We use an 8-byte hash to log the function symbol name.

    In this pass, we add three global variables:
    (1) an order file buffer: a circular buffer in its own llvm section.
    (2) a bitmap for each module: one byte per function, saying whether the
        function has already been executed.
    (3) a global index into the order file buffer.

    At the function prologue, if the function has not been executed (checked
    via the bitmap), we log the function hash, then atomically increase the
    index.

    Differential Revision: https://reviews.llvm.org/D57463

    llvm-svn: 355133

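    The injected prologue logic behaves roughly like this C++ sketch (names
    and sizes are illustrative; the pass emits this as IR, not as source):

        #include <atomic>
        #include <cstdint>

        constexpr uint32_t kOrderFileSlots = 1 << 17; // illustrative size
        constexpr uint32_t kNumFunctions   = 1 << 16; // illustrative size

        uint64_t OrderFileBuffer[kOrderFileSlots];  // (1) circular buffer
        uint8_t  ExecutedBitmap[kNumFunctions];     // (2) one byte per function
        std::atomic<uint32_t> OrderFileIndex{0};    // (3) global buffer index

        // Run at the prologue of function FuncId, whose symbol name hashes to
        // FuncHash; log the hash only on first execution.
        void logFirstExecution(uint32_t FuncId, uint64_t FuncHash) {
          if (ExecutedBitmap[FuncId])
            return;
          ExecutedBitmap[FuncId] = 1;
          uint32_t Slot = OrderFileIndex.fetch_add(1) % kOrderFileSlots;
          OrderFileBuffer[Slot] = FuncHash;
        }
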
* [PGO] Context sensitive PGO (part 2) | Rong Xu | 2019-02-28 | 8 | -20/+80
    Part 2 of the CSPGO changes (mostly related to ProfileSummary). Note that I
    use a default parameter in setProfileSummary() and getSummary(). This is to
    break the dependency in clang. I will make the parameter explicit after
    changing clang in a separate patch.

    Differential Revision: https://reviews.llvm.org/D54175

    llvm-svn: 355131

* [x86] scalarize extract element 0 of FP math | Sanjay Patel | 2019-02-28 | 1 | -0/+59
    This is another step towards ensuring that we produce the optimal code for
    reductions, but there are other potential benefits as seen in the test
    diffs:

    1. Memory loads may get scalarized, resulting in more efficient code.
    2. Memory stores may get scalarized, resulting in more efficient code.
    3. Complex ops like fdiv/sqrt get scalarized, which may be faster
       instructions depending on uarch.
    4. Even simple ops like addss/subss/mulss/roundss may result in faster
       operation/less frequency throttling when scalarized depending on uarch.

    The TODO comment suggests 1 or more follow-ups for opcodes that can
    currently result in regressions.

    Differential Revision: https://reviews.llvm.org/D58282

    llvm-svn: 355130

* bpf: disassembler support for XADD under sub-register mode | Jiong Wang | 2019-02-28 | 1 | -1/+2
    Like the other load/store instructions, the "w" register is preferred when
    disassembling BPF_STX | BPF_W | BPF_XADD.

    v1 -> v2:
    - Updated testcase insn-unit.s (Yonghong)

    Acked-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Jiong Wang <jiong.wang@netronome.com>

    llvm-svn: 355127

* bpf: enable sub-register code-gen for XADD | Jiong Wang | 2019-02-28 | 2 | -5/+27
    Supporting sub-register code-gen for XADD is like supporting any other load
    and store patterns. No new instruction is introduced.

        lock *(u32 *)(r1 + 0) += w2

    has exactly the same underlying insn as:

        lock *(u32 *)(r1 + 0) += r2

    The BPF_W width modifier has guaranteed that they behave the same at
    runtime. This patch merely teaches the BPF back-end that the BPF_W width
    modifier can work with the GPR32 register class, and that's all that is
    needed for sub-register code-gen support for XADD.

    test/CodeGen/BPF/xadd.ll is updated to include sub-register code-gen tests.

    A new testcase, test/CodeGen/BPF/xadd_legal.ll, is added to make sure the
    legal case passes on all code-gen modes. It also tests the dead-Def check
    on GPR32. If there is no proper handling like what has been done inside
    BPFMIChecking.cpp:hasLivingDefs, then this testcase will fail.

    Acked-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Jiong Wang <jiong.wang@netronome.com>

    llvm-svn: 355126

* bpf: improve dead Defs check for XADD | Jiong Wang | 2019-02-28 | 1 | -1/+92
    BPF XADD semantics require all Defs of XADD to be dead, meaning any result
    of an XADD insn is not used. However, the BPF backend hasn't enabled
    sub-register liveness tracking, so when the source and destination
    operands of XADD are GPR32, there is no sub-register dead info. If we rely
    on the generic MachineInstr::allDefsAreDead, then we will raise a false
    alarm on the GPR32 Def. This was fine, as there was no sub-register
    code-gen support for XADD; that will be added by the next patch.

    To support GPR32 Defs, ideally we could just enable sub-register liveness
    tracking on the BPF backend; then allDefsAreDead could work on GPR32 Defs.
    This requires implementing TargetSubtargetInfo::enableSubRegLiveness on
    BPF. However, the sub-register liveness tracking module inside LLVM is
    actually designed for the situation where one register can be split into
    more than one sub-register, in which case each sub-register can have its
    own liveness, and killing one of them doesn't kill the others, so tracking
    liveness for each makes sense. For BPF, each 64-bit register has only one
    32-bit sub-register. This is exactly the case which LLVM thinks brings no
    benefit for sub-register tracking, because the live range of a
    sub-register must always equal that of its parent register; therefore
    liveness tracking is disabled even when the back-end has implemented
    enableSubRegLiveness. The detailed information is at r232695:

        Author: Matthias Braun <matze@braunis.de>
        Date:   Thu Mar 19 00:21:58 2015 +0000
        Do not track subregister liveness when it brings no benefits

    Hence, for BPF, we enhance MachineInstr::allDefsAreDead. Given that the
    solo sub-register always has the same liveness as its parent register,
    LLVM already attaches an implicit 64-bit register Def whenever there is a
    sub-register Def. The liveness of the implicit 64-bit Def is available.
    For example, for "lock *(u32 *)(r0 + 4) += w9", the MachineOperand info
    could be:

        $w9 = XADDW32 killed $r0, 4, $w9(tied-def 0),
              implicit killed $r9, implicit-def dead $r9

    Even though w9 is not marked as dead, the parent register r9 is marked as
    dead correctly, and it is safe to use such information for our purpose.

    v1 -> v2:
    - Simplified code logic inside hasLiveDefs. (Yonghong)

    Acked-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Jiong Wang <jiong.wang@netronome.com>

    llvm-svn: 355124

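    The enhanced check can be pictured with the following sketch; this is an
    assumed shape of the hasLiveDefs logic reconstructed from the description,
    not the verbatim BPFMIChecking.cpp code:

        #include "llvm/CodeGen/MachineInstr.h"
        #include "llvm/CodeGen/TargetRegisterInfo.h"
        using namespace llvm;

        // A def counts as live unless it is marked dead itself or is covered
        // by a dead implicit def of its 64-bit super-register.
        static bool hasLiveDefs(const MachineInstr &MI,
                                const TargetRegisterInfo *TRI) {
          for (const MachineOperand &MO : MI.operands()) {
            if (!MO.isReg() || !MO.isDef() || MO.isImplicit())
              continue;
            if (MO.isDead())
              continue;
            // A GPR32 def carries no sub-register liveness info; trust the
            // dead flag of the implicit GPR64 def attached by LLVM instead.
            bool CoveredByDeadSuperReg = false;
            for (const MachineOperand &Imp : MI.operands())
              if (Imp.isReg() && Imp.isImplicit() && Imp.isDef() &&
                  Imp.isDead() &&
                  TRI->isSuperRegisterEq(MO.getReg(), Imp.getReg()))
                CoveredByDeadSuperReg = true;
            if (!CoveredByDeadSuperReg)
              return true;
          }
          return false;
        }
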
* [InstCombine] fold adds of constants separated by sext/zext | Sanjay Patel | 2019-02-28 | 1 | -8/+44
    This is part of a transform that may be done in the backend (D13757), but
    it should always be beneficial to fold this sooner in IR for all targets.

    https://rise4fun.com/Alive/vaiW

        Name: sext add nsw
        %add = add nsw i8 %i, C0
        %ext = sext i8 %add to i32
        %r = add i32 %ext, C1
        =>
        %s = sext i8 %i to i32
        %r = add i32 %s, sext(C0)+C1

        Name: zext add nuw
        %add = add nuw i8 %i, C0
        %ext = zext i8 %add to i16
        %r = add i16 %ext, C1
        =>
        %s = zext i8 %i to i16
        %r = add i16 %s, zext(C0)+C1

    llvm-svn: 355118

* [X86] Don't peek through bitcasts before checking ISD::isBuildVectorOfConstantSDNodes in combineTruncatedArithmetic | Craig Topper | 2019-02-28 | 1 | -2/+5
    We don't have any combines that can look through a bitcast to truncate a
    build vector of constants. So the truncate will stick around and give us
    something like this pattern: (binop (trunc X), (trunc (bitcast
    (build_vector)))), which has two truncates in it. That will be reversed by
    hoistLogicOpWithSameOpcodeHands in the generic DAG combiner, thus causing
    an infinite loop.

    Even if we had a combine for (truncate (bitcast (build_vector))), I think
    it would need to be implemented in getNode; otherwise DAG combiner visit
    ordering would probably still visit the binop first and reverse it. Or
    combineTruncatedArithmetic would need to do its own constant folding.

    Differential Revision: https://reviews.llvm.org/D58705

    llvm-svn: 355116

* Revert "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1."Amara Emerson2019-02-281-118/+26
| | | | | | Seems to break some neon intrinsics tests. llvm-svn: 355115
* [WebAssembly] Remove uses of ThreadModel | Thomas Lively | 2019-02-28 | 3 | -22/+34
    Summary:
    In the clang UI, replaces -mthread-model posix with -matomics as the source
    of truth on threading. In the backend, replaces -thread-model=posix with
    the atomics target feature, which is now collected on the
    WebAssemblyTargetMachine along with all other used features. These
    collected features will also be used to emit the target features section
    in the future.

    The default configuration for the backend is thread-model=posix and no
    atomics, which was previously an invalid configuration. This change makes
    the default valid because the thread model is ignored.

    A side effect of this change is that objects are never emitted with
    passive segments. It will instead be up to the linker to decide whether
    sections should be active or passive based on whether atomics are used in
    the final link.

    Reviewers: aheejin, sbc100, dschuff

    Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, steven_wu,
    dexonsmith, rupprecht, jfb, jdoerfert, cfe-commits, llvm-commits

    Tags: #clang, #llvm

    Differential Revision: https://reviews.llvm.org/D58742

    llvm-svn: 355112

* [ValueTracking] More accurate unsigned sub overflow detection | Nikita Popov | 2019-02-28 | 1 | -11/+9
    Second part of D58593.

    Compute precise overflow conditions based on all known bits, rather than
    just the sign bits. Unsigned a - b overflows iff a < b, and we can
    determine whether this always/never happens based on the minimal and
    maximal values achievable for a and b subject to the known bits
    constraint.

    llvm-svn: 355109

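    The min/max reasoning can be written down compactly (a sketch of the idea,
    not the exact code from D58593):

        #include "llvm/Support/KnownBits.h"
        using namespace llvm;

        enum class SubOverflow { Always, Never, Maybe };

        // Unsigned a - b wraps iff a < b. Under a known-bits constraint, the
        // minimum achievable value is Known.One (all unknown bits zero) and
        // the maximum is ~Known.Zero (all unknown bits one).
        static SubOverflow unsignedSubOverflow(const KnownBits &A,
                                               const KnownBits &B) {
          APInt MinA = A.One, MaxA = ~A.Zero;
          APInt MinB = B.One, MaxB = ~B.Zero;
          if (MinA.uge(MaxB))
            return SubOverflow::Never;  // a >= b for every possible a, b
          if (MaxA.ult(MinB))
            return SubOverflow::Always; // a < b for every possible a, b
          return SubOverflow::Maybe;
        }
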
* Make MergeBlockIntoPredecessor conformant to the precondition of calling DTU.applyUpdates | Chijun Sima | 2019-02-28 | 1 | -1/+3
    Summary:
    The DTU documentation states that "It is illegal to submit any update that
    has already been submitted, i.e., you are supposed not to insert an
    existent edge or delete a nonexistent edge." It is dangerous to violate
    this rule, because DomTree and PostDomTree occasionally crash on this
    scenario.

    This patch fixes MergeBlockIntoPredecessor, making it conformant to this
    precondition.

    Reviewers: kuhar, brzycki, chandlerc

    Reviewed By: brzycki

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58444

    llvm-svn: 355105

* [AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1. | Amara Emerson | 2019-02-28 | 1 | -26/+118
    This extends the existing support for shufflevector to handle cases like
    <2 x float>, which we can implement by concatenating the vectors and using
    a TBL1.

    Differential Revision: https://reviews.llvm.org/D58684

    llvm-svn: 355104

* [Target][ARM] Add a usage for SrcSz to unbreak build-bots without assertions | Kadir Cetinkaya | 2019-02-28 | 1 | -0/+1
    llvm-svn: 355101