bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[MS Demangler] Demangle data member pointers.	Zachary Turner	2018-07-26	1	-155/+191
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D49630 llvm-svn: 338061
*	[AMDGPU] Fix VGPR spills where offset doesn't fit in 12 bits	Scott Linder	2018-07-26	1	-11/+16
\| \| \| \| \| \| \| \| \| \|	Scale the offset of VGPR spills by the wave size when it cannot fit in the 12-bit offset immediate field and so is added to the soffset SGPR. This accounts for hardware swizzling of scratch memory. Differential Revision: https://reviews.llvm.org/D49448 llvm-svn: 338060
*	[InstCombine] fold udiv with common factor from muls with nuw	Sanjay Patel	2018-07-26	1	-0/+15
\| \| \| \| \| \| \| \| \|	Unfortunately, sdiv isn't as simple because of UB due to overflow. This fold is mentioned in PR38239: https://bugs.llvm.org/show_bug.cgi?id=38239 llvm-svn: 338059
*	[RISCV] Add support for _interrupt attribute	Ana Pazos	2018-07-26	7	-3/+142
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Save/restore only registers that are used. This includes Callee saved registers and Caller saved registers (arguments and temporaries) for integer and FP registers. - If there is a call in the interrupt handler, save/restore all Caller saved registers (arguments and temporaries) and all FP registers. - Emit special return instructions depending on "interrupt" attribute type. Based on initial patch by Zhaoshi Zheng. Reviewers: asb Reviewed By: asb Subscribers: rkruppe, the_o, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D48411 llvm-svn: 338047
*	MacroFusion: Fix macro fusion with ExitSU failing in top-down scheduling	Matthias Braun	2018-07-26	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When fusing instructions A and B, we must add all predecessors of B as predecessors of A to avoid instructions getting scheduling in between. There is a special case involving ExitSU: Every other node must be scheduled before it by design and we don't need to make this explicit in the graph, however when fusing with a different node we need to schedule every othere node before the fused node too and we need to make this explicit now: This patch adds a dependency from the fused node to all roots in the graph. Differential Revision: https://reviews.llvm.org/D49830 llvm-svn: 338046
*	[DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle ule,ugt CondCodes.	Roman Lebedev	2018-07-26	1	-9/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A follow-up for D49266 / rL337166. At least one of these cases is more canonical, so we really do have to handle it. https://godbolt.org/g/pkzP3X https://rise4fun.com/Alive/pQyhZZ We won't get to these cases with I1 being -1, as that will be constant-folded to true or false. I'm also not sure we actually hit the 'ule' case, but i think the worst think that could happen is that being dead code. Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49497 llvm-svn: 338044
*	[DEBUGINFO, NVPTX] Emit correct debug information for local variables.	Alexey Bataev	2018-07-26	5	-2/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: NVPTX target dos not use register-based frame information. Instead it relies on the artificial local_depot that is used instead of the frame and the data for variables must be emitted relatively to this local_depot. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45963 llvm-svn: 338039
*	[DEBUGINFO, NVPTX] Set `DW_AT_frame_base` to `DW_OP_call_frame_cfa`.	Alexey Bataev	2018-07-26	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For NVPTX target the value of `DW_AT_frame_base` attribute must be set to `DW_OP_call_frame_cfa`. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45785 llvm-svn: 338036
*	Revert r338027 to pacify build bot	James Henderson	2018-07-26	1	-4/+4
\| \| \| \|	llvm-svn: 338035
*	[ADT] Replace std::isprint by llvm::isPrint.	Michael Kruse	2018-07-26	5	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The standard library functions ::isprint/std::isprint have platform- and locale-dependent behavior which makes LLVM's output less predictable. In particular, regression tests my fail depending on the implementation of these functions. Implement llvm::isPrint in StringExtras.h with a standard behavior and replace all uses of ::isprint/std::isprint by a call it llvm::isPrint. The function is inlined and does not look up language settings so it should perform better than the standard library's version. Such a replacement has already been done for isdigit, isalpha, isxdigit in r314883. gtest does the same in gtest-printers.cc using the following justification: // Returns true if c is a printable ASCII character. We test the // value of c directly instead of calling isprint(), which is buggy on // Windows Mobile. inline bool IsPrintableAscii(wchar_t c) { return 0x20 <= c && c <= 0x7E; } Similar issues have also been encountered by Julia: https://github.com/JuliaLang/julia/issues/7416 I noticed the problem myself when on Windows isprint('\t') started to evaluate to true (see https://stackoverflow.com/questions/51435249) and thus caused several unit tests to fail. The result of isprint doesn't seem to be well-defined even for ASCII characters. Therefore I suggest to replace isprint by a platform-independent version. Differential Revision: https://reviews.llvm.org/D49680 llvm-svn: 338034
*	[UnJ] Common some code. NFC	David Green	2018-07-26	1	-40/+54
\| \| \| \| \| \| \| \| \|	Create a processHeaderPhiOperands for analysing the instructions in the aft blocks that must be moved before the loop. Differential Revision: https://reviews.llvm.org/D49061 llvm-svn: 338033
*	dwarfgen: Add support for generating the debug_str_offsets section, take 3	Pavel Labath	2018-07-26	5	-27/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previous version of this patch failed on darwin targets because of different handling of cross-debug-section relocations. This fixes the tests to emit the DW_AT_str_offsets_base attribute correctly in both cases. Since doing this is a non-trivial amount of code, and I'm going to need it in more than one test, I've added a helper function to the dwarfgen DIE class to do it. Original commit message follows: The motivation for this is D49493, where we'd like to test details of debug_str_offsets behavior which is difficult to trigger from a traditional test. This adds the plubming necessary for dwarfgen to generate this section. The more interesting changes are: - I've moved emitStringOffsetsTableHeader function from DwarfFile to DwarfStringPool, so I can generate the section header more easily from the unit test. - added a new addAttribute overload taking an MCExpr*. This is used to generate the DW_AT_str_offsets_base, which links a compile unit to the offset table. I've also added a basic test for reading and writing DW_form_strx forms. Reviewers: dblaikie, JDevlieghere, probinson Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D49670 llvm-svn: 338031
*	Enable some pointer authentication instructions for aarch64 v8a targets	Luke Cheeseman	2018-07-26	1	-24/+27
\| \| \| \| \| \| \| \| \| \| \|	- Some of the v8.3 pointer authentication instruction inhabit the Hint space - These instructions can be assembled to hint instructions which act as NOP instructions prior to v8.3 - This patch permits using the hint instructions for all v8a targets - Also, correct the RETA{A,B} instructions to match the instruction attributes of RET (set isTerminator and isBarrier) Differential Revision: https://reviews.llvm.org/D49786 llvm-svn: 338029
*	Fix raw_fd_ostream::write_impl hang with large output	James Henderson	2018-07-26	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Windows when raw_fd_ostream::write_impl calls write, a 32 bit input is required for character count. As a variable with size_t is used for this argument on x64 integral demotion occurs. In the case of large files an infinite loop follows. See PR37926. This fix allows the output of files larger than previous int32 limit. Differential Revision: https://reviews.llvm.org/D48948 Patch by Owen Reynolds Reviewed by: zturner llvm-svn: 338027
*	[mips] Sign extend i32 return values on MIPS64	Stefan Maksimovic	2018-07-26	4	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \|	Override getTypeForExtReturn so that functions returning an i32 typed value have it sign extended on MIPS64. Also provide patterns to get rid of unneeded sign extensions for arithmetic instructions which implicitly sign extend their results. Differential Revision: https://reviews.llvm.org/D48374 llvm-svn: 338019
*	Revert "[COFF] Use comdat shared constants for MinGW as well"	Martin Storsjo	2018-07-26	3	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r337951. While that kind of shared constant generally works fine in a MinGW setting, it broke some cases of inline assembly that worked before: $ cat const-asm.c int MULH(int a, int b) { int rt, dummy; __asm__ ( "imull %3" :"=d"(rt), "=a"(dummy) :"a"(a), "rm"(b) ); return rt; } int func(int a) { return MULH(a, 1); } $ clang -target x86_64-win32-gnu -c const-asm.c -O2 const-asm.c:4:9: error: invalid variant '00000001' "imull %3" ^ <inline asm>:1:15: note: instantiated into assembly here imull __real@00000001(%rip) ^ A similar error is produced for i686 as well. The same test with a target of x86_64-win32-msvc or i686-win32-msvc works fine. llvm-svn: 338018
*	[x86/SLH] Extract the logic to trace predicate state through calls to	Chandler Carruth	2018-07-26	1	-19/+39
\| \| \| \| \| \| \| \| \| \|	a helper function with a nice overview comment. NFC. This is a preperatory refactoring to implementing another component of mitigation here that was descibed in the design document but hadn't been implemented yet. llvm-svn: 338016
*	[AArch64] Armv8.2-A: add the crypto extensions	Sjoerd Meijer	2018-07-26	4	-5/+203
\| \| \| \| \| \| \| \| \|	This adds MC support for the crypto instructions that were made optional extensions in Armv8.2-A (AArch64 only). Differential Revision: https://reviews.llvm.org/D49370 llvm-svn: 338010
*	[X86] Don't use CombineTo to skip adding new nodes to the DAGCombiner ↵	Craig Topper	2018-07-26	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \|	worklist in combineMul. I'm not sure if this was trying to avoid optimizing the new nodes further or what. Or maybe to prevent a cycle if something tried to reform the multiply? But I don't think its a reliable way to do that. If the user of the expanded multiply is visited by the DAGCombiner after this conversion happens, the DAGCombiner will check its operands, see that they haven't been visited by the DAGCombiner before and it will then add the first node to the worklist. This process will repeat until all the new nodes are visited. So this seems like an unreliable prevention at best. So this patch just returns the new nodes like any other combine. If this starts causing problems we can try to add target specific nodes or something to more directly prevent optimizations. Now that we handle the combine normally, we can combine any negates the mul expansion creates into their users since those will be visited now. llvm-svn: 338007
*	Revert r337981: it breaks the debuginfo-tests	Alex Lorenz	2018-07-26	3	-41/+51
\| \| \| \| \| \| \|	This commit caused a regression in the debuginfo-tests: FAIL: debuginfo-tests :: apple-accel.cpp (40748 of 46595) llvm-svn: 337997
*	[X86] Remove some unnecessary explicit calls to DCI.AddToWorkList.	Craig Topper	2018-07-26	1	-10/+0
\| \| \| \| \| \|	These calls were making sure some newly created nodes were added to worklist, but the DAGCombiner has internal support for ensuring it has visited all nodes. Any time it visits a node it ensures the operands have been queued to be visited as well. This means if we only need to return the last new node. The DAGCombiner will take care of adding its inputs thus walking backwards through all the new nodes. llvm-svn: 337996
*	[Support] Introduce createStringError helper function	Victor Leschuk	2018-07-26	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \|	The function in question is copy-pasted lots of times in DWARF-related classes. Thus it will make sense to place its implementation into the Support library. Reviewed by: lhames Differential Revision: https://reviews.llvm.org/D49824 llvm-svn: 337995
*	[GlobalISel] Fall back to SDISel for swifterror/swiftself attributes.	Amara Emerson	2018-07-26	2	-0/+18
\| \| \| \| \| \|	We don't currently support these, fall back until we do. llvm-svn: 337994
*	[DWARF v5] Don't report an error when the .debug_rnglists section is empty ↵	Wolfgang Pieb	2018-07-26	1	-16/+18
\| \| \| \| \| \| \| \| \| \| \|	or non-existent. Fixes PR38297. Reviewer: JDevlieghere Differential Revision: https://reviews.llvm.org/D49815 llvm-svn: 337993
*	[LoadStoreVectorizer] Use const reference	Fangrui Song	2018-07-26	1	-4/+6
\| \| \| \|	llvm-svn: 337992
*	RegUsageInfo: Cleanup; NFC	Matthias Braun	2018-07-26	4	-72/+69
\| \| \| \| \| \| \| \| \|	- Remove unnecessary anchor function - Remove unnecessary override of getAnalysisUsage - Use reference instead of pointers where things cannot be nullptr - Use ArrayRef instead of std::vector where possible llvm-svn: 337989
*	CodeGen.cpp: Sort initializers; NFC	Matthias Braun	2018-07-26	1	-2/+2
\| \| \| \|	llvm-svn: 337988
*	CodeGen: Cleanup regmask construction; NFC	Matthias Braun	2018-07-26	5	-11/+15
\| \| \| \| \| \| \| \| \|	- Avoid duplication of regmask size calculation. - Simplify allocateRegisterMask() call. - Rename allocateRegisterMask() to allocateRegMask() to be consistent with naming in MachineOperand. llvm-svn: 337986
*	[DWARF v5] Don't emit multiple DW_AT_rnglists_base attributes. Some ↵	Wolfgang Pieb	2018-07-25	3	-51/+41
\| \| \| \| \| \| \| \| \| \| \| \|	refactoring of range lists emissions and added test cases. Reviewer: dblaikie Differential Revision: https://reviews.llvm.org/D49522 llvm-svn: 337981
*	bpf: new option -bpf-expand-memcpy-in-order to expand memcpy in order	Yonghong Song	2018-07-25	9	-8/+255
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some BPF JIT backends would want to optimize memcpy in their own architecture specific way. However, at the moment, there is no way for JIT backends to see memcpy semantics in a reliable way. This is due to LLVM BPF backend is expanding memcpy into load/store sequences and could possibly schedule them apart from each other further. So, BPF JIT backends inside kernel can't reliably recognize memcpy semantics by peephole BPF sequence. This patch introduce new intrinsic expand infrastructure to memcpy. To get stable in-order load/store sequence from memcpy, we first lower memcpy into BPF::MEMCPY node which then expanded into in-order load/store sequences in expandPostRAPseudo pass which will happen after instruction scheduling. By this way, kernel JIT backends could reliably recognize memcpy through scanning BPF sequence. This new memcpy expand infrastructure is gated by a new option: -bpf-expand-memcpy-in-order Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 337977
*	[GlobalMerge] Handle llvm.compiler.used correctly.	Eli Friedman	2018-07-25	1	-4/+5
\| \| \| \| \| \| \| \|	Reuse the handling for llvm.used, and don't transform such globals. Fixes a failure on the asan buildbot caused by my previous commit. llvm-svn: 337973
*	[SelectionDAG] try to convert funnel shift directly to rotate if legal	Sanjay Patel	2018-07-25	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \|	If the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. This sidesteps the issue of custom lowering for rotates raised in PR38243: https://bugs.llvm.org/show_bug.cgi?id=38243 ...by only dealing with legal operations. llvm-svn: 337966
*	[LSV] Look through selects for consecutive addresses	Roman Tereshin	2018-07-25	1	-15/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases LSV sees (load/store _ (select _ <pointer expression> <pointer expression>)) patterns in input IR, often due to sinking and other forms of CFG simplification, sometimes interspersed with bitcasts and all-constant-indices GEPs. With this patch`areConsecutivePointers` method would attempt to handle select instructions. This leads to an increased number of successful vectorizations. Technically, select instructions could appear in index arithmetic as well, however, we don't see those in our test suites / benchmarks. Also, there is a lot more freedom in IR shapes computing integral indices in general than in what's common in pointer computations, and it appears that it's quite unreliable to do anything short of making select instructions first class citizens of Scalar Evolution, which for the purposes of this patch is most definitely an overkill. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D49428 llvm-svn: 337965
*	[GlobalMerge] Allow merging globals with arbitrary alignment.	Eli Friedman	2018-07-25	1	-18/+26
\| \| \| \| \| \| \| \| \|	Instead of depending on implicit padding from the structure layout code, use a packed struct and emit the padding explicitly. Differential Revision: https://reviews.llvm.org/D49710 llvm-svn: 337961
*	Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp ↵	Florian Hahn	2018-07-25	2	-140/+10
\| \| \| \| \| \| \| \|	instructions. I suspect it is causing the clang-stage2-Rthinlto failures. llvm-svn: 337956
*	Add missing 'override', fixing compilation with some compilers since SVN r337950	Martin Storsjo	2018-07-25	1	-1/+1
\| \| \| \|	llvm-svn: 337952
*	[COFF] Use comdat shared constants for MinGW as well	Martin Storsjo	2018-07-25	3	-15/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GNU binutils tools have no problems with this kind of shared constants, provided that we actually hook it up completely in AsmPrinter and produce a global symbol. This effectively reverts SVN r335918 by hooking the rest of it up properly. This feature was implemented originally in SVN r213006, with no reason for why it can't be used for MinGW other than the fact that GCC doesn't do it while MSVC does. Differential Revision: https://reviews.llvm.org/D49646 llvm-svn: 337951
*	[COFF] Hoist constant pool handling from X86AsmPrinter into AsmPrinter	Martin Storsjo	2018-07-25	6	-30/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In SVN r334523, the first half of comdat constant pool handling was hoisted from X86WindowsTargetObjectFile (which despite the name only was used for msvc targets) into the arch independent TargetLoweringObjectFileCOFF, but the other half of the handling was left behind in X86AsmPrinter::GetCPISymbol. With only half of the handling in place, inconsistent comdat sections/symbols are created, causing issues with both GNU binutils (avoided for X86 in SVN r335918) and with the MS linker, which would complain like this: fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4 Differential Revision: https://reviews.llvm.org/D49644 llvm-svn: 337950
*	[ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1.	Eli Friedman	2018-07-25	1	-0/+81
\| \| \| \| \| \| \| \| \| \| \| \| \|	Saves materializing the immediate for the "ands". Corresponding patterns exist for lsrs+lsls, but that seems less common in practice. Now implemented as a DAGCombine. Differential Revision: https://reviews.llvm.org/D49585 llvm-svn: 337945
*	[SCEV] Add [zs]ext{C,+,x} -> (D + [zs]ext{C-D,+,x})<nuw><nsw> transform	Roman Tereshin	2018-07-25	1	-63/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	as well as sext(C + x + ...) -> (D + sext(C-D + x + ...))<nuw><nsw> similar to the equivalent transformation for zext's if the top level addition in (D + (C-D + x * n)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x * n), ensuring homogeneous behaviour of the transformation and better canonicalization of such AddRec's (indeed, there are 2^(2w) different expressions in `B1 + ext(B2 + Y)` form for the same Y, but only 2^(2w - k) different expressions in the resulting `B3 + ext((B4 * 2^k) + Y)` form, where w is the bit width of the integral type) This patch generalizes sext(C1 + C2X) --> sext(C1) + sext(C2X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformations added in r209568 relaxing the requirements the following way: 1. C2 doesn't have to be a power of 2, it's enough if it's divisible by 2 a sufficient number of times; 2. C1 doesn't have to be less than C2, instead of extracting the entire C1 we can split it into 2 terms: (00...0XXX + YY...Y000), keep the second one that may cause wrapping within the extension operator, and move the first one that doesn't affect wrapping out of the extension operator, enabling further simplifications; 3. C1 and C2 don't have to be positive, splitting C1 like shown above produces a sum that is guaranteed to not wrap, signed or unsigned; 4. in AddExpr case there could be more than 2 terms, and in case of AddExpr the 2nd and following terms and in case of AddRecExpr the Step component don't have to be in the C2X form or constant (respectively), they just need to have enough trailing zeros, which in turn could be guaranteed by means other than arithmetics, e.g. by a pointer alignment; 5. the extension operator doesn't have to be a sext, the same transformation works and profitable for zext's as well. Apparently, optimizations like SLPVectorizer currently fail to vectorize even rather trivial cases like the following: double bar(double a, unsigned n) { double x = 0.0; double y = 0.0; for (unsigned i = 0; i < n; i += 2) { x += a[i]; y += a[i + 1]; } return x * y; } If compiled with `clang -std=c11 -Wpedantic -Wall -O3 main.c -S -o - -emit-llvm` (!{!"clang version 7.0.0 (trunk 337339) (llvm/trunk 337344)"}) it produces scalar code with the loop not unrolled with the unsigned `n` and `i` (like shown above), but vectorized and unrolled loop with signed `n` and `i`. With the changes made in this commit the unsigned version will be vectorized (though not unrolled for unclear reasons). How it all works: Let say we have an AddExpr that looks like (C + x + y + ...), where C is a constant and x, y, ... are arbitrary SCEVs. Let's compute the minimum number of trailing zeroes guaranteed of that sum w/o the constant term: (x + y + ...). If, for example, those terms look like follows: i XXXX...X000 YYYY...YY00 ... ZZZZ...0000 then the rightmost non-guaranteed-zero bit (a potential one at i-th position above) can change the bits of the sum to the left (and at i-th position itself), but it can not possibly change the bits to the right. So we can compute the number of trailing zeroes by taking a minimum between the numbers of trailing zeroes of the terms. Now let's say that our original sum with the constant is effectively just C + X, where X = x + y + .... Let's also say that we've got 2 guaranteed trailing zeros for X: j CCCC...CCCC XXXX...XX00 // this is X = (x + y + ...) Any bit of C to the left of j may in the end cause the C + X sum to wrap, but the rightmost 2 bits of C (at positions j and j - 1) do not affect wrapping in any way. If the upper bits cause a wrap, it will be a wrap regardless of the values of the 2 least significant bits of C. If the upper bits do not cause a wrap, it won't be a wrap regardless of the values of the 2 bits on the right (again). So let's split C to 2 constants like follows: 0000...00CC = D CCCC...CC00 = (C - D) and represent the whole sum as D + (C - D + X). The second term of this new sum looks like this: CCCC...CC00 XXXX...XX00 ----------- // let's add them up YYYY...YY00 The sum above (let's call it Y)) may or may not wrap, we don't know, so we need to keep it under a sext/zext. Adding D to that sum though will never wrap, signed or unsigned, if performed on the original bit width or the extended one, because all that that final add does is setting the 2 least significant bits of Y to the bits of D: YYYY...YY00 = Y 0000...00CC = D ----------- <nuw><nsw> YYYY...YYCC Which means we can safely move that D out of the sext or zext and claim that the top-level sum neither sign wraps nor unsigned wraps. Let's run an example, let's say we're working in i8's and the original expression (zext's or sext's operand) is 21 + 12x + 8y. So it goes like this: 0001 0101 // 21 XXXX XX00 // 12x YYYY Y000 // 8y 0001 0101 // 21 ZZZZ ZZ00 // 12x + 8y 0000 0001 // D 0001 0100 // 21 - D = 20 ZZZZ ZZ00 // 12x + 8y 0000 0001 // D WWWW WW00 // 21 - D + 12x + 8y = 20 + 12x + 8y therefore zext(21 + 12x + 8y) = (1 + zext(20 + 12x + 8y))<nuw><nsw> This approach could be improved if we move away from using trailing zeroes and use KnownBits instead. For instance, with KnownBits we could have the following picture: i 10 1110...0011 // this is C XX X1XX...XX00 // this is X = (x + y + ...) Notice that some of the bits of X are known ones, also notice that known bits of X are interspersed with unknown bits and not grouped on the rigth or left. We can see at the position i that C(i) and X(i) are both known ones, therefore the (i + 1)th carry bit is guaranteed to be 1 regardless of the bits of C to the right of i. For instance, the C(i - 1) bit only affects the bits of the sum at positions i - 1 and i, and does not influence if the sum is going to wrap or not. Therefore we could split the constant C the following way: i 00 0010...0011 = D 10 1100...0000 = (C - D) Let's compute the KnownBits of (C - D) + X: XX1 1 = carry bit, blanks stand for known zeroes 10 1100...0000 = (C - D) XX X1XX...XX00 = X --- ----------- XX X0XX...XX00 Will this add wrap or not essentially depends on bits of X. Adding D to this sum, however, is guaranteed to not to wrap: 0 X 00 0010...0011 = D sX X0XX...XX00 = (C - D) + X --- ----------- sX XXXX XX11 As could be seen above, adding D preserves the sign bit of (C - D) + X, if any, and has a guaranteed 0 carry out, as expected. The more bits of (C - D) we constrain, the better the transformations introduced here canonicalize expressions as it leaves less freedom to what values the constant part of ((C - D) + x + y + ...) can take. Reviewed By: mzolotukhin, efriedma Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337943
*	Add an option to specify the name of	Xinliang David Li	2018-07-25	1	-0/+11
\| \| \| \| \| \| \| \|	an function whose CFG is to be viewed/printed. Differential Revision: https://reviews.llvm.org/D49447 llvm-svn: 337940
*	Fix corruption of result number in LegalizeVectorOps.cpp	Ulrich Weigand	2018-07-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	When VectorLegalizer::LegalizeOp creates a new SDValue after iterating over its arguments, we need to refer to the same result number of the new node that the original value used. Reviewed by: cameron.mcinally Differential Revision: https://reviews.llvm.org/D49805 llvm-svn: 337939
*	[AMDGPU] Use AssumptionCacheTracker in the divrem32 expansion	Stanislav Mekhanoshin	2018-07-25	1	-13/+21
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D49761 llvm-svn: 337938
*	Fix llvm::ComputeNumSignBits with some operations and llvm.assume	Stanislav Mekhanoshin	2018-07-25	1	-7/+7
\| \| \| \| \| \| \| \| \| \|	Currently ComputeNumSignBits does early exit while processing some of the operations (add, sub, mul, and select). This prevents the function from using AssumptionCacheTracker if passed. Differential Revision: https://reviews.llvm.org/D49759 llvm-svn: 337936
*	Revert "dwarfgen: Add support for generating the debug_str_offsets section, ↵	Pavel Labath	2018-07-25	5	-32/+27
\| \| \| \| \| \| \| \| \|	take 2" This reverts commit r337933. The build error is fixed but the test now fails on the darwin buildbots. Investigating... llvm-svn: 337935
*	[Hexagon] Properly scale bit index when extracting elements from vNi1	Krzysztof Parzyszek	2018-07-25	1	-1/+3
\| \| \| \| \| \| \| \|	For example v = <2 x i1> is represented as bbbbaaaa in a predicate register, where b = v[1], a = v[0]. Extracting v[1] is equivalent to extracting bit 4 from the predicate register. llvm-svn: 337934
*	dwarfgen: Add support for generating the debug_str_offsets section, take 2	Pavel Labath	2018-07-25	5	-27/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This recommits r337910 after fixing an "ambiguous call to addAttribute" error with some compilers (gcc circa 4.9 and MSVC). It seems that these compilers will consider a "false -> pointer" conversion during overload resolution. This creates ambiguity because one I added an overload which takes a MCExpr * as an argument. I fix this by making the new overload take MCExpr&, which avoids the conversion. It also documents the fact that we expect a valid MCExpr object. Original commit message follows: The motivation for this is D49493, where we'd like to test details of debug_str_offsets behavior which is difficult to trigger from a traditional test. This adds the plubming necessary for dwarfgen to generate this section. The more interesting changes are: - I've moved emitStringOffsetsTableHeader function from DwarfFile to DwarfStringPool, so I can generate the section header more easily from the unit test. - added a new addAttribute overload taking an MCExpr*. This is used to generate the DW_AT_str_offsets_base, which links a compile unit to the offset table. I've also added a basic test for reading and writing DW_form_strx forms. Reviewers: dblaikie, JDevlieghere, probinson Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D49670 llvm-svn: 337933
*	Move JIT listener C binding fallbackks to ExecutionEngineBindings.cpp.	Andres Freund	2018-07-25	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initially, in https://reviews.llvm.org/D44890, I had these defined as empty functions inside the header when the respective event listener was not built in. As done in that commit, that wasn't correct, because it was a ODR violation. Krasimir hot-fixed that in r333265, but that wasn't quite right either, because it'd lead to the symbol not being available. Instead just move the fallbacksto ExecutionEngineBindings.cpp. Could define them as static inlines in the header too, but I don't think it matters. Reviewers: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49654 llvm-svn: 337930
*	Revert "dwarfgen: Add support for generating the debug_str_offsets section"	Pavel Labath	2018-07-25	5	-32/+27
\| \| \| \| \| \| \| \| \|	This reverts commit r337910 as it's generating "ambiguous call to addAttribute" errors on some bots. Will resubmit once I get a chance to look into the problem. llvm-svn: 337924
*	[MIPS GlobalISel] Lower pointer arguments	Petar Jovanovic	2018-07-25	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for lowering pointer arguments. Changing type from pointer to integer is already done in MipsTargetLowering::getRegisterTypeForCallingConv. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D49419 llvm-svn: 337912