bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	add a missed case for binary op FMF propagation under select folds	Michael Berg	2018-08-16	1	-1/+3
	llvm-svn: 339938
*	[MemLoc] Fix a bug causing any use of invariant.end to crash in LICM	Philip Reames	2018-08-16	1	-0/+4
	The fix is fairly simple, but it says something unpleasant about the usage and testing of invariant.start/end scopes that this went undetected. To put this in perspective, any invariant.end in a loop flowing through LICM crashed. I haven't bothered to figure out just how far back this goes, but it's not caused by any of the recent changes. We're probably talking months if not years. llvm-svn: 339936
*	[LICM][NFC] Restructure pointer invalidation API in terms of MemoryLocation	Philip Reames	2018-08-16	2	-25/+17
	Main value is just simplifying code. I'll further simplify the argument handling case in a bit, but that involved a slightly orthogonal change so I went with the mildly ugly intermediate for this patch. Note that the isSized check in the old LICM code was not carried across. It turns out that check was dead: a) no test exercised it, and b) langref and verifier had been updated to disallow unsized types used in loads. llvm-svn: 339930
*	[WebAssembly] Remove temporary workaround for function bitcasts	Jacob Gravelle	2018-08-16	1	-5/+0
	Summary: EM_ASM no longer is lowered as varargs in C, so this workaround is obsolete. Reviewers: dschuff, sunfish Subscribers: sbc100, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D50859 llvm-svn: 339925
*	[MachineVerifier] Check if predecessor is jointly dominated by undefs	Krzysztof Parzyszek	2018-08-16	1	-1/+11
	Each use of a value should be jointly dominated by the union of defs and undefs. It can happen that it will only be jointly dominated by undefs, and that is still legal. Make sure that the verifier is aware of that. llvm-svn: 339924
*	[SelectionDAG] Improve the legalisation lowering of UMULO.	Eli Friedman	2018-08-16	1	-17/+48
	There is no way in the universe that doing a full-width division in software will be faster than doing overflowing multiplication in software in the first place, especially given that this same full-width multiplication needs to be done anyway.
	This patch replaces the previous implementation with a direct lowering into an overflowing multiplication algorithm based on half-width operations.
	Correctness of the algorithm was verified by exhaustively checking the output of this algorithm for overflowing multiplication of 16 bit integers against an obviously correct widening multiplication. Barring any oversights introduced by porting the algorithm to DAG, confidence in correctness of this algorithm is extremely high.
	The following table shows the change in both t = runtime and s = space. The change is expressed as a multiplier of original, so anything under 1 is “better” and anything above 1 is worse.
	+-------+-----------+-----------+-------------+-------------+
	| Arch  | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
	+-------+-----------+-----------+-------------+-------------+
	| X64   | -         | -         | ~0.5        | ~0.64       |
	| i686  | ~0.5      | ~0.6666   | ~0.05       | ~0.9        |
	| armv7 | -         | ~0.75     | -           | ~1.4        |
	+-------+-----------+-----------+-------------+-------------+
	Performance numbers have been collected by running overflowing multiplication in a loop under `perf` on two x86_64 (one Intel Haswell, other AMD Ryzen) based machines. Size numbers have been collected by looking at the size of function containing an overflowing multiply in a loop.
	All in all, it can be seen that both performance and size have improved except in the case of armv7 where code size has regressed for 128-bit multiply. u128*u128 overflowing multiply on 32-bit platforms seems to benefit from this change a lot, taking only 5% of the time compared to original algorithm to calculate the same thing.
	The final benefit of this change is that LLVM is now capable of lowering the overflowing unsigned multiply for integers of any bit-width as long as the target is capable of lowering regular multiplication for the same bit-width. Previously, 128-bit overflowing multiply was the widest possible.
	Patch by Simonas Kazlauskas! Differential Revision: https://reviews.llvm.org/D50310 llvm-svn: 339922
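The idea of an overflowing multiply built from half-width operations can be sketched in plain C++ for the 64-bit case. This is only an illustration under stated assumptions, not the DAG lowering from the patch: the function name `umulo64` is hypothetical, and the split into 32-bit halves stands in for whatever half-width type the legalizer would pick.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not LLVM code): returns {a * b mod 2^64, overflow flag}
// using only multiplies whose operands fit in 32 bits, plus adds and shifts.
std::pair<std::uint64_t, bool> umulo64(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t aLo = a & 0xffffffffu, aHi = a >> 32;
  const std::uint64_t bLo = b & 0xffffffffu, bHi = b >> 32;

  // If both high halves are nonzero, the product is at least 2^64.
  bool overflow = (aHi != 0) && (bHi != 0);

  // Each cross product is scaled by 2^32, so it must itself fit in 32 bits.
  const std::uint64_t cross1 = aHi * bLo;
  const std::uint64_t cross2 = aLo * bHi;
  overflow |= (cross1 >> 32) != 0;
  overflow |= (cross2 >> 32) != 0;

  // Combine the partial products; a wrapped addition also signals overflow.
  const std::uint64_t low = aLo * bLo;
  const std::uint64_t result = low + ((cross1 + cross2) << 32);
  overflow |= result < low;

  return {result, overflow};
}
```

No division is needed at any point, which is the property the commit message argues for.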
*	[RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDef	Krzysztof Parzyszek	2018-08-16	1	-24/+54
	llvm-svn: 339912
*	Fix memory leak in demangling of string literals.	Zachary Turner	2018-08-16	1	-0/+1
	llvm-svn: 339909
*	[TargetLowering] Add support for non-uniform vectors to BuildSDIV	Simon Pilgrim	2018-08-16	1	-10/+24
	This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators. This is the last patch necessary to close PR36545. Differential Revision: https://reviews.llvm.org/D50765 llvm-svn: 339908
*	[codeview] Use push_macro to avoid conflicts instead of a prefix	Reid Kleckner	2018-08-16	2	-172/+172
	Summary: This prefix was added in r333421, and it changed our dumper output to say things like "CVRegEAX" instead of just "EAX". That's a functional change that I'd rather avoid. I tested GCC, Clang, and MSVC, and all of them support #pragma push_macro. They don't issue warnings when the macro is not defined either. I don't have a Mac so I can't test the real termios.h header, but I looked at the termios.h sources online and looked for other conflicts. I saw only the CR* macros, so those are the ones we work around. Reviewers: zturner, JDevlieghere Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50851 llvm-svn: 339907
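The push_macro technique described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the `#define CR0 0` line stands in for a system header such as termios.h, and the `RegisterId` enum, its values, and `cr0Id` are hypothetical names chosen for the example.

```cpp
#include <cassert>

// Stand-in for a system header that defines a conflicting CR* macro.
#define CR0 0

// Save the macro definition, then remove it so CR0 is a plain identifier.
#pragma push_macro("CR0")
#undef CR0

// Hypothetical register enum; the numeric values are illustrative only.
enum class RegisterId : unsigned {
  EAX = 17,
  CR0 = 80,
};

// Must be defined while the macro is still suppressed.
constexpr unsigned cr0Id() { return static_cast<unsigned>(RegisterId::CR0); }

// Restore the saved definition for any later code that relies on the macro.
#pragma pop_macro("CR0")

// After pop_macro, CR0 expands to the macro value again.
constexpr int RestoredMacroValue = CR0;
```

The enumerator keeps its natural name, so dumper output is unchanged, and code after `pop_macro` still sees the macro.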
*	[MC] Cleanup noop default case spelling. NFC.	Nirav Dave	2018-08-16	1	-1/+1
	llvm-svn: 339906
*	AMDGPU: Custom lower fexp	Matt Arsenault	2018-08-16	3	-1/+36
	This will allow the library to just use __builtin_expf directly without expanding this itself. Note f64 still won't work because there is no exp instruction for it. llvm-svn: 339902
*	[TargetLowering] Refactor BuildSDIV in preparation for D50765. NFCI.	Simon Pilgrim	2018-08-16	1	-24/+36
	Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator. llvm-svn: 339898
*	[MC] Remove unused variable	Benjamin Kramer	2018-08-16	1	-1/+0
	llvm-svn: 339896
*	[MC][X86] Enhance X86 Register expression handling to more closely match GCC.	Nirav Dave	2018-08-16	4	-21/+39
	Allow the comparison of x86 registers in the evaluation of assembler directives. This generalizes and simplifies the extension from r334022 to catch another case found in the Linux kernel. Reviewers: rnk, void Reviewed By: rnk Subscribers: hiraditya, nickdesaulniers, llvm-commits Differential Revision: https://reviews.llvm.org/D50795 llvm-svn: 339895
*	Fix -Wmicrosoft-goto warnings.	Zachary Turner	2018-08-16	1	-7/+13
	llvm-svn: 339894
*	Add support for AVX-512 CodeView registers.	Zachary Turner	2018-08-16	1	-114/+170
	When compiling with /arch:AVX512 and optimizations turned on, we could crash while emitting debug info because we did not have CodeView register constants for the AVX 512 register set defined. This patch defines them. Differential Revision: https://reviews.llvm.org/D50819 llvm-svn: 339893
*	[MS Demangler] Demangle string literals.	Zachary Turner	2018-08-16	1	-3/+398
	When demangling string literals, Microsoft's undname simply prints 'string'. This patch implements string literal demangling while doing a bit better than this by decoding as much of the string as possible and trying to faithfully reproduce the original string literal definition. This is a bit tricky because the different character types char, char16_t, and char32_t are not uniquely identified by the mangling, so we have to use a heuristic to try to guess the character type. But it works pretty well, and many tests are added to illustrate the behavior. Differential Revision: https://reviews.llvm.org/D50806 llvm-svn: 339892
*	[MS Demangler] Don't fail on MD5-mangled names.	Zachary Turner	2018-08-16	1	-1/+14
	When we have an MD5 mangled name, we shouldn't choke and say that it's an invalid name. Even though it's impossible to demangle, we should just output the original name. llvm-svn: 339891
*	[InstCombine] Expand the simplification of pow(x, 0.5) to sqrt(x)	Evandro Menezes	2018-08-16	1	-31/+20
	Expand the number of cases when `pow(x, 0.5)` is simplified into `sqrt(x)` by considering the math semantics with more granularity. Differential revision: https://reviews.llvm.org/D50036 llvm-svn: 339887
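Why the semantics matter here can be seen from two special values where the C library functions disagree under IEEE 754 (C Annex F) rules: `pow(-0.0, 0.5)` is `+0.0` while `sqrt(-0.0)` is `-0.0`, and `pow(-inf, 0.5)` is `+inf` while `sqrt(-inf)` is NaN. The sketch below only demonstrates those library-level differences; the helper name `differsAtSpecialValue` is hypothetical and the code says nothing about which cases the patch itself handles.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true when replacing pow(x, 0.5) with sqrt(x) would
// change the sign of the result or whether it is NaN for this input.
bool differsAtSpecialValue(double x) {
  const double viaPow = std::pow(x, 0.5);
  const double viaSqrt = std::sqrt(x);
  const bool powIsNaN = std::isnan(viaPow);
  const bool sqrtIsNaN = std::isnan(viaSqrt);
  if (powIsNaN || sqrtIsNaN)
    return powIsNaN != sqrtIsNaN;  // NaN on only one side is a difference
  return std::signbit(viaPow) != std::signbit(viaSqrt);
}
```

For ordinary positive inputs the helper reports no difference, which is why the rewrite is attractive whenever the special cases can be ruled out or do not matter.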
*	[InstCombine] move vector compare before same-shuffled ops	Sanjay Patel	2018-08-16	1	-0/+28
	This is a step towards fixing PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 339875
*	[ARM] Ignore GEPs in ARMCodeGenPrepare	Sam Parker	2018-08-16	1	-0/+5
	While searching through the use-def tree, ignore GetElementPtrInst instructions because they don't need promoting and neither do their indices. Otherwise, the wide indices prevent the transformation from happening. Differential Revision: https://reviews.llvm.org/D50762 llvm-svn: 339871
*	[ARM] Allow zext in ARMCodeGenPrepare	Sam Parker	2018-08-16	1	-3/+8
	Treat zext instructions as roots, like we do for truncs. Differential Revision: https://reviews.llvm.org/D50759 llvm-svn: 339868
*	[RISCV][MC] Don't fold symbol differences if requiresDiffExpressionRelocations is true	Alex Bradbury	2018-08-16	1	-4/+7
	When emitting the difference between two symbols, the standard behavior is that the difference will be resolved to an absolute value if both of the symbols are offsets from the same data fragment. This is undesirable on architectures such as RISC-V where relaxation in the linker may cause the computed difference to become invalid. This caused an issue when compiling to object code, where the size of a function in the debug information was already calculated even though it could change as a consequence of relaxation in the subsequent linking stage. This patch inhibits the resolution of symbol differences to absolute values where the target's AsmBackend has declared that it does not want these to be folded. Differential Revision: https://reviews.llvm.org/D45773 Patch by Edward Jones. llvm-svn: 339864
*	[ADT] Replace APInt::WORD_MAX with APInt::WORDTYPE_MAX	Simon Pilgrim	2018-08-16	1	-9/+9
	The Windows SDK defines WORD_MAX, so any poor soul that wants to use LLVM in a project that depends on the Windows SDK gets a build error. Given that it actually describes the maximal value of WordType, the new name fits even better than WORD_MAX. Patch by: @miscco Differential Revision: https://reviews.llvm.org/D50777 llvm-svn: 339863
*	[ARM] Allow signed icmps in ARMCodeGenPrepare	Sam Parker	2018-08-16	1	-30/+45
	Originally committed in r339755 which was reverted in r339806 due to an asan issue. The issue was caused by my assumption that operands to a CallInst mapped to the FunctionType Params. CallInsts are now handled by iterating over their ArgOperands instead of Operands. Original Message: Treat signed icmps as 'sinks', allowing them to be in the use-def tree, enabling more promotions to be performed. As a sink, any promoted incoming values need to be truncated before being used by the signed icmp. Differential Revision: https://reviews.llvm.org/D50067 llvm-svn: 339858
*	[mips] Remove dead code from MipsPassConfig	Simon Atanasyan	2018-08-16	1	-4/+0
	Found by GCC's -Wunused-function. Patch by Kim Gräsman. Differential revision: https://reviews.llvm.org/D50612 llvm-svn: 339847
*	[NFC] Remove const modifier to allow further development in LICM	Max Kazantsev	2018-08-16	1	-3/+2
	llvm-svn: 339846
*	[NFC] Add missing const modifier	Max Kazantsev	2018-08-16	1	-1/+1
	llvm-svn: 339844
*	[X86] Remove masking from the 512-bit padds and psubs intrinsics. Use select in IR instead.	Craig Topper	2018-08-16	3	-45/+27
	llvm-svn: 339842
*	[X86] Remove the unused masked 128 and 256-bit masked padds/psubs intrinsics.	Craig Topper	2018-08-16	2	-20/+42
	Still need to remove masking from the 512-bit versions. llvm-svn: 339841
*	[x86] Actually initialize the SLH pass with the x86 backend and use	Chandler Carruth	2018-08-16	2	-3/+5
	a shorter name ('x86-slh') for the internal flags and pass name. Without this, you can't use the -stop-after or -stop-before infrastructure. I seem to have just missed this when originally adding the pass. The shorter name solves two problems. First, the flag names were ... really long and hard to type/manage. Second, the pass name can't be the exact same as the flag name used to enable this, and there are already some users of that flag name so I'm avoiding changing it unnecessarily. llvm-svn: 339836
*	[BFI] Use rounding while computing profile counts.	Easwaran Raman	2018-08-16	1	-1/+3
	Summary: Profile count of a block is computed by multiplying its block frequency by entry count and dividing the result by entry block frequency. Do rounded division in the last step and update test cases appropriately. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50822 llvm-svn: 339835
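The difference between truncating and rounded division in this computation can be sketched as below. This is an illustration only: the function names are hypothetical, and plain `uint64_t` arithmetic is used on the assumption that `blockFreq * entryCount` does not overflow, which real code would have to guard against with wider arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: count = blockFreq * entryCount / entryFreq, truncated.
// Assumes the product fits in 64 bits.
std::uint64_t profileCountTruncated(std::uint64_t blockFreq,
                                    std::uint64_t entryCount,
                                    std::uint64_t entryFreq) {
  assert(entryFreq != 0 && "entry block frequency must be nonzero");
  return blockFreq * entryCount / entryFreq;
}

// Same computation, rounded to nearest by adding half the divisor first.
std::uint64_t profileCountRounded(std::uint64_t blockFreq,
                                  std::uint64_t entryCount,
                                  std::uint64_t entryFreq) {
  assert(entryFreq != 0 && "entry block frequency must be nonzero");
  return (blockFreq * entryCount + entryFreq / 2) / entryFreq;
}
```

With a block frequency of 3, an entry count of 10, and an entry frequency of 4, the exact value is 7.5: truncation gives 7 and rounding gives 8.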
*	[Metadata] Replace a SmallVector with an array; NFC	George Burgess IV	2018-08-15	1	-3/+4
	MDNode::get takes an ArrayRef, so these should be equivalent. llvm-svn: 339824
*	[CodeGenPrepare] Add BothExtension type to PromotedInsts	Guozhi Wei	2018-08-15	1	-7/+49
	This patch fixes PR38125. Instruction extension types are recorded in PromotedInsts so that they can be used later in function canGetThrough. If an instruction has two users with different extension types, it is inserted into PromotedInsts twice in function promoteOperandForOther. The second insertion overwrites the first, so the final extension type is wrong, which later causes problems in canGetThrough. This patch changes the simple bool extension type to a 2-bit enum type, adding a BothExtension type in addition to zero/sign extension. When a user sees BothExtension for an instruction, it knows nothing about how that instruction is extended. Differential Revision: https://reviews.llvm.org/D49512 llvm-svn: 339822
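The overwrite problem and the three-state fix can be sketched with a small standalone model. Everything here is a hypothetical stand-in: `PromotedMap` is keyed by a string rather than an instruction pointer, and `recordExtension` and `canRelyOn` are illustrative names, not functions from CodeGenPrepare.

```cpp
#include <cassert>
#include <map>
#include <string>

enum ExtType { ZeroExtension, SignExtension, BothExtension };

// Hypothetical stand-in for PromotedInsts, keyed by an instruction name.
using PromotedMap = std::map<std::string, ExtType>;

// Record how an instruction was extended. A second, conflicting record
// degrades the entry to BothExtension instead of overwriting the first.
void recordExtension(PromotedMap &promoted, const std::string &inst,
                     ExtType ext) {
  auto it = promoted.find(inst);
  if (it == promoted.end()) {
    promoted.emplace(inst, ext);
    return;
  }
  if (it->second != ext)
    it->second = BothExtension;
}

// A user may rely on the recorded extension only if it matches exactly;
// BothExtension never matches a specific request, so nothing is assumed.
bool canRelyOn(const PromotedMap &promoted, const std::string &inst,
               ExtType wanted) {
  auto it = promoted.find(inst);
  return it != promoted.end() && wanted != BothExtension &&
         it->second == wanted;
}
```

With a plain bool, the second record would silently replace the first; the third state makes the conflict visible to later queries.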
*	AMDGPU: Fold fneg into fmed3	Matt Arsenault	2018-08-15	1	-0/+11
	llvm-svn: 339821
*	AMDGPU: Improve extract_vector_elt reduction combine	Matt Arsenault	2018-08-15	1	-6/+12
	Handle fmul, fsub and preserve flags. Also really test minnum/maxnum reductions. The existing tests were only checking minnum/maxnum matched from a fast math compare and select, which is not the same. llvm-svn: 339820
*	AMDGPU: Implement llvm.amdgcn.icmp/fcmp for i16/f16	Matt Arsenault	2018-08-15	2	-26/+93
	Also support these on targets without support for them, since it will allow us to freely create these in instcombine. llvm-svn: 339819
*	[X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation	Craig Topper	2018-08-15	1	-50/+133
	To lower this we now create a new V1 containing the low half of both sources and a new V2 containing the upper half of both sources. Then we create a repeated lane shuffle of those new sources to create the final result. This fixes PR35833. Differential Revision: https://reviews.llvm.org/D41794 llvm-svn: 339818
*	AMDGPU: Stop producing icmp/fcmp intrinsics with invalid types	Matt Arsenault	2018-08-15	1	-0/+27
	llvm-svn: 339815
*	AMDGPU: Address todo for handling 1/(2 pi)	Matt Arsenault	2018-08-15	4	-11/+31
	llvm-svn: 339814
*	DAG: Use getObjectOffset helper	Matt Arsenault	2018-08-15	1	-4/+1
	llvm-svn: 339813
*	DAG: Try to custom lower when promoting float operands	Matt Arsenault	2018-08-15	1	-0/+5
	For some reason this wasn't done for floats like integers. llvm-svn: 339811
*	[MCJIT] Fix a case of Error::success() being passed to report_fatal_error.	Lang Hames	2018-08-15	1	-1/+2
	MCJIT::getSymbolAddress was handling a non-fatal error condition of JITSymbol as fatal. JITSymbol::operator bool returns false if no address is available but no error is set. This can occur e.g. if the symbol name was not found. Patch by Jascha Wetzel. Thanks Jascha! llvm-svn: 339809
*	Revert "[ARM] Allow signed icmps in ARMCodeGenPrepare"	Vitaly Buka	2018-08-15	1	-44/+22
	use-after-poison in check-llvm under asan. This reverts commit r339755. llvm-svn: 339806
*	[Support] Add a basic C API for llvm::Error.	Lang Hames	2018-08-15	1	-0/+20
	Summary: The C-API supports consuming errors, converting an error to a string error message, and querying an error's type. Other LLVM C APIs that wish to use llvm::Error can supply error-type-id checkers and custom error-to-structured-type converters for any custom errors they provide. Reviewers: bogner, zturner, labath, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50716 llvm-svn: 339802
*	[WebAssembly][NFC] Standardize SIMD multiclass format	Thomas Lively	2018-08-15	1	-36/+30
	Summary: This CL changes the ExtractLane ISEL multiclass to more closely mirror the structure of the splat and replace_lane multiclasses. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D50794 llvm-svn: 339801
*	llvm-readobj: Fix addend in relocations for android packed format	Peter Collingbourne	2018-08-15	1	-9/+6
	If a relocation group doesn't have the RELOCATION_GROUP_HAS_ADDEND_FLAG set, then this implies the group's addend equals zero. In this case the Android packed format won't encode an explicit addend delta; instead we need to set Addend, the "previous addend" variable, to zero ourselves. Patch by Yi-Yo Chiang! Differential Revision: https://reviews.llvm.org/D50601 llvm-svn: 339799
*	[WebAssembly] Test commit	Thomas Lively	2018-08-15	1	-2/+1
	Changes a comment and some whitespace to test commit access. llvm-svn: 339798
*	[InstCombine] Fix IC trying to create a xor of pointer types.	Amara Emerson	2018-08-15	1	-1/+2
	rdar://42473741 Differential Revision: https://reviews.llvm.org/D50775 llvm-svn: 339796