bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Resubmit "[Support] Expose flattenWindowsCommandLine."	Zachary Turner	2018-06-10	4	-122/+86
	There were a few Linux compilation failures, but other than that I think this was just a flake that caused the tests to fail. I'm going to resubmit and see if the failures go away; if not, I'll revert again. llvm-svn: 334355
*	Revert "[Support] Expose flattenWindowsCommandLine."	Zachary Turner	2018-06-09	2	-77/+114
	This reverts commit 10d2e88e87150a35dc367ba30716189d2af26774. This is causing some test failures for some reason, reverting while I investigate. llvm-svn: 334354
*	[Support] Expose flattenWindowsCommandLine.	Zachary Turner	2018-06-09	2	-114/+77
	This function was internal to Program.inc, but I've needed this on several occasions when I've had to use CreateProcess without llvm's sys::Execute functions. In doing so, I noticed that the function was written using unsafe C-string access and was pretty hard to understand / make sense of, so I've also re-written the functions to use more modern LLVM constructs. llvm-svn: 334353
*	[X86] NFC Use member initialization in X86Subtarget	Gabor Buella	2018-06-09	2	-215/+107
	The separate initializeEnvironment function was sort of useless since r217071. ARM did this move already with r273556. llvm-svn: 334345
*	Use uniform mechanism for OOM errors handling	Serge Pavlov	2018-06-09	6	-40/+14
	This is a recommit of r333506, which was reverted in r333518. The original commit message is below. In r325551 many calls of malloc/calloc/realloc were replaced with calls of their safe counterparts defined in the namespace llvm. These functions generate a crash if memory cannot be allocated; such behavior facilitates handling of out-of-memory errors on Windows. If the result of the alloc function was checked for success, the function was not replaced with the safe variant. In these cases the calling function did the error handling, like: T *NewElts = static_cast<T*>(malloc(NewCapacity*sizeof(T))); if (NewElts == nullptr) report_bad_alloc_error("Allocation of SmallVector element failed."); Actually, knowledge about the function where OOM occurred is useless. Moreover, having a single entry point for OOM handling is convenient for investigation of memory problems. This change removes custom OOM error handling and replaces it with calls to functions `llvm::safe_alloc`. Declarations of `safe_alloc` are moved to a separate include file, to avoid cyclic dependency in SmallVector.h Differential Revision: https://reviews.llvm.org/D47440 llvm-svn: 334344
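	A minimal sketch of the "single entry point" idea described above. The names `reportAllocFailure`, `checkedMalloc` and `allocateElements` are hypothetical and chosen for illustration; they are not LLVM's API, and the real declarations may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical single entry point for out-of-memory failures. Every failed
// allocation ends up here, so one breakpoint catches all of them.
[[noreturn]] inline void reportAllocFailure(const char *Reason) {
  std::fprintf(stderr, "fatal: %s\n", Reason);
  std::abort();
}

// Hypothetical checked wrapper: returns a non-null pointer or does not return.
// Assumption of this sketch: callers never request zero bytes, because
// malloc(0) may legitimately return nullptr.
inline void *checkedMalloc(std::size_t Size) {
  assert(Size != 0 && "zero-sized requests are outside this sketch");
  void *Result = std::malloc(Size);
  if (Result == nullptr)
    reportAllocFailure("Allocation failed");
  return Result;
}

// Call sites no longer carry their own null check or error message.
// Overflow of Count * sizeof(T) is not handled in this sketch.
template <typename T> T *allocateElements(std::size_t Count) {
  return static_cast<T *>(checkedMalloc(Count * sizeof(T)));
}
```

	With such a wrapper, a call site shrinks to `T *NewElts = allocateElements<T>(NewCapacity);` and the caller-specific message disappears, which matches the commit's point that the location of the failure carries little value.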
*	Use SmallPtrSet instead of SmallSet in places where we iterate over the set.	Craig Topper	2018-06-09	5	-7/+7
	SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work. These places were found by hiding the begin/end methods in the SmallSet forwarding. llvm-svn: 334343
*	[ARM] Allow CMPZ transforms even if the input has multiple uses.	Eli Friedman	2018-06-08	1	-1/+1
	It looks like this got left in by accident in r289794; I can't think of any reason this check would be necessary. (Maybe it was meant to be a check that the AND has one use? But we check that a few lines earlier.) Differential Revision: https://reviews.llvm.org/D47921 llvm-svn: 334322
*	[SCEV] Look through zero-extends in howFarToZero	Krzysztof Parzyszek	2018-06-08	1	-1/+11
	An expression like (zext i2 {(trunc i32 (1 + %B) to i2),+,1}<%while.body> to i32) will become zero exactly when the nested value becomes zero in its type. Strip injective operations from the input value in howFarToZero to make the value simpler. Differential Revision: https://reviews.llvm.org/D47951 llvm-svn: 334318
*	[InstCombine] Skip dbg.value(s) when looking at stack{save,restore}.	Davide Italiano	2018-06-08	1	-1/+8
	Fixes PR37713. llvm-svn: 334317
*	[asan] Instrument comdat globals on COFF targets	Reid Kleckner	2018-06-08	1	-8/+33
	Summary: If we can use comdats, then we can make it so that the global metadata is thrown away if the prevailing definition of the global was uninstrumented. I have only tested this on COFF targets, but in theory, there is no reason that we cannot also do this for ELF. This will allow us to re-enable string merging with ASan on Windows, reducing the binary size cost of ASan on Windows. Reviewers: eugenis, vitalybuka Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47841 llvm-svn: 334313
*	[DAGCombiner] clean up comments; NFC	Sanjay Patel	2018-06-08	1	-8/+5
	llvm-svn: 334312
*	[X86][SSE] Support v8i16/v16i16 rotations	Simon Pilgrim	2018-06-08	1	-14/+30
	Extension to D46954 (PR37426), this patch adds support for v8i16/v16i16 rotations in a similar manner - the conversion of the shift/rotate amount to a multiplication factor and the use of PMULLW to shift left and PMULHUW (ISD::MULHU) to shift the wrapped bits back around to be ORd together. Differential Revision: https://reviews.llvm.org/D47822 llvm-svn: 334309
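	The multiply-based rotation described above can be illustrated on a single 16-bit lane. This is a scalar sketch only, not the DAG lowering itself; the function names are hypothetical. The low half of the 32-bit product plays the role of PMULLW and the high half plays the role of PMULHUW.

```cpp
#include <cstdint>

// Rotate-left of one 16-bit lane using a multiply by 2^Amt and an OR.
inline uint16_t rotl16ViaMul(uint16_t X, unsigned Amt) {
  Amt &= 15;                                    // rotation amount modulo 16
  uint32_t Scale = 1u << Amt;                   // multiplication factor 2^Amt
  uint32_t Product = uint32_t(X) * Scale;       // full 32-bit product
  uint16_t Lo = uint16_t(Product & 0xFFFFu);    // low half: X << Amt
  uint16_t Hi = uint16_t(Product >> 16);        // high half: the wrapped bits
  return uint16_t(Lo | Hi);
}

// Conventional shift-based rotate, used only as a reference for comparison.
inline uint16_t rotl16Reference(uint16_t X, unsigned Amt) {
  Amt &= 15;
  if (Amt == 0)
    return X;
  return uint16_t((uint32_t(X) << Amt) | (uint32_t(X) >> (16 - Amt)));
}
```

	When the amount is zero the factor is 1, the high half of the product is zero, and the OR returns the input unchanged, so no special case is needed in the multiply form.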
*	Utilize new SDNode flag functionality to expand current support for fsub	Michael Berg	2018-06-08	1	-17/+23
	Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fsub. Reviewers: spatel, hfinkel, wristow, arsenm Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D47910 llvm-svn: 334306
*	[VPlan] Move recipe construction to VPRecipeBuilder.	Florian Hahn	2018-06-08	4	-153/+218
	This patch moves the recipe-creation functions out of LoopVectorizationPlanner, which should do the high-level orchestration of the transformations. Reviewers: dcaballe, rengolin, hsaito, Ayal Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D47595 llvm-svn: 334305
*	[X86][BtVer2] Add support for all SUB/XOR 32/64 scalar instructions that should match the dependency-breaking 'zero-idiom'	Simon Pilgrim	2018-06-08	1	-1/+8
	As detailed in Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), these instructions are dependency breaking and fast-path zero the destination register (and appropriate EFLAGS bits). llvm-svn: 334303
*	[AMDGPU] Inline asm - added i16, half and i128 types support	Daniil Fukalov	2018-06-08	1	-16/+32
	The AMDGPU inline assembler supports i16, half and i128 typed variables in constraints, but they were reported as an error. Needed to fix https://github.com/RadeonOpenCompute/ROCm/issues/341, e.g. to be able to load with global_load_dwordx4 to a 128-bit integer variable. Differential Revision: https://reviews.llvm.org/D44920 llvm-svn: 334301
*	reapply r334209 with fixes for harfbuzz in Chromium	Daniil Fukalov	2018-06-08	1	-16/+26
	r334209 description: [LSR] Check yet more intrinsic pointer operands. The patch fixes another assertion in isLegalUse(). Differential Revision: https://reviews.llvm.org/D47794 llvm-svn: 334300
*	[NFC][InstSimplify] SimplifyAddInst(): coding style: variable names.	Roman Lebedev	2018-06-08	1	-5/+5
	llvm-svn: 334299
*	[InstSimplify] add nuw %x, -1 -> -1 fold.	Roman Lebedev	2018-06-08	1	-0/+4
	Summary: `%ret = add nuw i8 %x, C` From the LangRef (https://llvm.org/docs/LangRef.html#add-instruction): nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs. So if `C` is `-1`, `%x` can only be `0`, and the result is always `-1`. I'm not sure we want to use `KnownBits`/`LVI` here, because there is exactly one possible value (all bits set, `-1`), so some other pass should take care of replacing the known-all-ones with constant `-1`. The `test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll` change is confusing. What is happening is, before this (omitting `nuw` for simplicity): 1. First, InstCombine D47428/rL334127 folds `shl i32 1, %NBits` to `shl nuw i32 -1, %NBits`. 2. Then, InstSimplify D47883/rL334222 folds `shl nuw i32 -1, %NBits` to `-1`. 3. `-1` is inverted to `0`. But now: 1. This InstSimplify fold `%ret = add nuw i32 %setbit, -1` -> `-1` happens first, before the InstCombine D47428/rL334127 fold could happen. Thus we now end up with the opposite constant, and it is all good: https://rise4fun.com/Alive/OA9 https://rise4fun.com/Alive/sldC Was mentioned in D47428 review. Follow-up for D47883. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47908 llvm-svn: 334298
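	The reasoning behind the `add nuw %x, -1` fold can be checked exhaustively for i8. The sketch below is a brute-force model in plain C++, not LLVM code; the helper names are hypothetical. It treats "unsigned wrap" as "poison" and counts the inputs that remain well defined.

```cpp
#include <cstdint>

// True when the i8 addition X + C wraps in unsigned arithmetic, which is the
// condition under which `add nuw i8 X, C` yields poison.
inline bool addWrapsUnsigned(uint8_t X, uint8_t C) {
  return unsigned(X) + unsigned(C) > 0xFFu;
}

// Counts the values of X for which `add nuw i8 X, -1` is not poison.
// The constant -1 is 0xFF when viewed as an unsigned 8-bit value.
inline unsigned countNonPoisonInputsForMinusOne() {
  unsigned Count = 0;
  for (unsigned X = 0; X <= 0xFFu; ++X)
    if (!addWrapsUnsigned(uint8_t(X), 0xFF))
      ++Count;
  return Count;
}
```

	Only `X == 0` survives, and `0 + 0xFF` is `0xFF`, i.e. `-1`, so replacing the whole instruction with the constant is consistent with every non-poison execution.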
*	commandLineFitsWithinSystemLimits Overestimates System Limits	Alexander Kornienko	2018-06-08	1	-1/+12
	Summary: The function `llvm::sys::commandLineFitsWithinSystemLimits` appears to be overestimating the system limits. This issue was discovered while attempting to enable response files in the Swift compiler. When the compiler submits its frontend jobs, those jobs are subjected to the system limits on command line length. `commandLineFitsWithinSystemLimits` is used to determine if the job's arguments need to be wrapped in a response file. There are some cases where the argument size for the job passes `commandLineFitsWithinSystemLimits`, but actually exceeds the real system limit, and the job fails. `clang` also uses this function to decide whether or not to wrap its job arguments in response files. See: https://github.com/llvm-mirror/clang/blob/master/lib/Driver/Driver.cpp#L1341. Clang will also fail for response files whose size falls within a certain range. I wrote a script that should find a failure point for `clang++`. All that is needed to run it is Python 2.7, and a simple "hello world" program for `test.cc`. It should run on Linux and on macOS. The script is available here: https://gist.github.com/dabelknap/71bd083cd06b91c5b3cef6a7f4d3d427. When it hits a failure point, you should see a `clang: error: unable to execute command: posix_spawn failed: Argument list too long`. The proposed solution is to mirror the behavior of `xargs` in `commandLineFitsWithinSystemLimits`. `xargs` defaults to 128k for the command line length size (See: https://fossies.org/dox/findutils-4.6.0/buildcmd_8c_source.html#l00551). It adjusts this depending on the value of `ARG_MAX`. Reviewers: alexfh Reviewed By: alexfh Subscribers: llvm-commits Tags: #clang Patch by Austin Belknap! Differential Revision: https://reviews.llvm.org/D47795 llvm-svn: 334295
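	A rough sketch of an `xargs`-style limit check, to make the proposal above concrete. Everything here is an assumption for illustration rather than the committed code: the helper names are hypothetical, the reported `ARG_MAX` is passed in as a parameter instead of being queried from the system, the floor of 4096 bytes is taken from the POSIX minimum for `ARG_MAX`, and halving the budget to leave headroom for the environment is a choice made by this sketch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: clamp the reported ARG_MAX to an xargs-style default
// of 128 KiB, without going below the assumed POSIX minimum of 4096 bytes.
inline long effectiveArgMax(long ReportedArgMax) {
  const long XargsDefault = 128L * 1024L;
  const long PosixMinimum = 4096L;
  long Effective = XargsDefault;
  if (Effective > ReportedArgMax)
    Effective = ReportedArgMax;
  if (Effective < PosixMinimum)
    Effective = PosixMinimum;
  return Effective;
}

// Hypothetical check: does the program name plus its arguments fit?
// Each string is counted with its terminating NUL byte. Only half of the
// effective limit is used (assumption of this sketch) to leave room for
// the environment block.
inline bool fitsWithinLimit(const char *Program, const char *const *Args,
                            std::size_t NumArgs, long ReportedArgMax) {
  const std::size_t Budget =
      static_cast<std::size_t>(effectiveArgMax(ReportedArgMax) / 2);
  std::size_t Length = std::strlen(Program) + 1;
  for (std::size_t I = 0; I < NumArgs; ++I) {
    Length += std::strlen(Args[I]) + 1;
    if (Length > Budget)
      return false; // caller would fall back to a response file
  }
  return Length <= Budget;
}
```

	On a system that reports a 2 MiB `ARG_MAX`, this sketch budgets 64 KiB for the command line, which is deliberately conservative compared with trusting the reported value directly.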
*	Clean up some code in Program.	Zachary Turner	2018-06-08	2	-10/+12
	NFC here, this just raises some platform specific ifdef hackery out of a class and creates proper platform-independent typedefs for the relevant things. This allows these typedefs to be reused in other places without having to reinvent this preprocessor logic. llvm-svn: 334294
*	Add a file open flag that disables O_CLOEXEC.	Zachary Turner	2018-06-08	2	-9/+22
	O_CLOEXEC is the right default, but occasionally you don't want this. This is especially true for tools like debuggers where you might need to spawn the child process with specific files already open, but it's occasionally useful in other scenarios as well, like when you want to do some IPC between parent and child. llvm-svn: 334293
*	[X86][SSE] Simplify combineVectorTruncationWithPACKUS to reduce code duplication	Simon Pilgrim	2018-06-08	1	-37/+5
	Simplify combineVectorTruncationWithPACKUS to mask the upper bits followed by calling truncateVectorWithPACK instead of duplicating with similar code. This results in the codegen using (V)PACKUSDW on SSE41+ targets for vXi64/vXi32 inputs where before it always used PACKUSWB (along with a lot more bitcasting). I've raised PR37749 as until we avoid unnecessary concats back to 256-bit for bitwise ops, we can't avoid splitting the input value into 128-bit subvectors for masking. llvm-svn: 334289
*	[BPI] Apply invoke heuristic before loop branch heuristic	Artur Pilipenko	2018-06-08	1	-11/+8
	Currently the loop branch heuristic is applied before the invoke heuristic, which makes us overestimate the probability of the unwind destination of invokes inside loops. This in turn makes us grossly underestimate the frequencies of loops with invokes. Reviewed By: skatkov, vsk Differential Revision: https://reviews.llvm.org/D47371 llvm-svn: 334285
*	[VPlan] Move recipe based VPlan generation to separate function.	Florian Hahn	2018-06-08	2	-41/+64
	This first step separates VPInstruction-based and VPRecipe-based VPlan creation, which should make it easier to migrate to VPInstruction based code-gen step by step. Reviewers: Ayal, rengolin, dcaballe, hsaito, mkuper, mzolotukhin Reviewed By: dcaballe Subscribers: bollu, tschuett, rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D47477 llvm-svn: 334284
*	[mips] Correct the predicates for a number of codegen only instructions	Simon Dardis	2018-06-08	1	-37/+52
	Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47638 llvm-svn: 334280
*	[RISCV] Implement MC layer support for the fence.tso instruction	Alex Bradbury	2018-06-08	1	-0/+6
	The instruction makes use of a previously ignored field in the fence instruction. It is introduced in the version 2.3 draft of the RISC-V specification after much work by the Memory Model Task Group. As clarified here <https://github.com/riscv/riscv-isa-manual/issues/186>, the fence.tso assembler mnemonic does not have operands. llvm-svn: 334278
*	[X86][SSE] Consistently prefer lowering to PACKUS over PACKSS	Simon Pilgrim	2018-06-08	1	-32/+32
	We have some combines/lowerings that attempt to use PACKSS-then-PACKUS and others that use PACKUS-then-PACKSS. PACKUS is much easier to combine with if we know the upper bits are zero as ComputeKnownBits can easily see through BITCASTs etc. especially now that rL333995 and rL334007 have landed. It also effectively works at byte level which further simplifies shuffle combines. The only (minor) annoyances are that ComputeKnownBits can sometimes take longer as it doesn't fail as quickly as ComputeNumSignBits (but I'm not seeing any actual regressions in tests) and PACKUSDW only became available after SSE41 so we have more codegen diffs between targets. llvm-svn: 334276
*	[LV] Fix PR36983. For a given recurrence, fix all phis in exit block	Roman Shirokiy	2018-06-08	1	-2/+1
	There could be more than one PHI in the exit block using the same loop recurrence. Don't assume there is only one and fix each user. Differential Revision: https://reviews.llvm.org/D47788 llvm-svn: 334271
*	AMDGPU: Error on LDS global address in functions	Matt Arsenault	2018-06-08	1	-1/+9
	These won't work as expected now, so error on them to avoid wasting time debugging this in the future. llvm-svn: 334269
*	[DAGCombine] Fix for PR37667	Sam Parker	2018-06-08	1	-0/+16
	While trying to propagate AND masks back to loads, we currently allow one non-load node to be included as a leaf in the chain. This fix now limits that node to produce only a single data value. Differential Revision: https://reviews.llvm.org/D47878 llvm-svn: 334268
*	[NFC] fix formatting	Hiroshi Inoue	2018-06-08	1	-1/+1
	llvm-svn: 334263
*	[X86] Improve some shuffle decoding code to remove a conditional from a loop and reduce the number of temporary variables. NFCI	Craig Topper	2018-06-08	1	-11/+9
	The NumControlBits variable was definitely sketchy. I think that only worked because the expected value was 1 or 2 and the number of lanes was 2 or 4. Had there been 8 lanes the number of bits should have been 3, not 4 as the previous code would have given. llvm-svn: 334258
*	[AMDGPU] Simplify memory legalizer (add missing virtual destructor)	Tony Tye	2018-06-08	1	-0/+4
	Differential Revision: https://reviews.llvm.org/D47504 llvm-svn: 334257
*	Revert r334209 "[LSR] Check yet more intrinsic pointer operands"	Reid Kleckner	2018-06-08	1	-12/+4
	This causes cast failures when compiling harfbuzz in Chromium. Reproducer on the way. llvm-svn: 334254
*	Expose a single global file open function.	Zachary Turner	2018-06-07	2	-86/+51
	This one allows much more flexibility than the standard openFileForRead / openFileForWrite functions. Since there is now just one "real" function that does the work, all other implementations simply delegate to this one. llvm-svn: 334246
*	propagate fast math flags via IR on fma and sub expressions	Michael Berg	2018-06-07	2	-47/+56
	Summary: This change uses fmf subflags to guard fma optimizations as well as unsafe. These changes originated from D46483 and have been simplified via getNode. Reviewers: spatel, arsenm, hfinkel, javed.absar Reviewed By: spatel Subscribers: nemanjai, wdng Differential Revision: https://reviews.llvm.org/D47388 llvm-svn: 334242
*	[AMDGPU] Simplify memory legalizer	Tony Tye	2018-06-07	1	-234/+707
	- Make code easier to maintain. - Avoid generating waitcnts for VMEM if the address space does not involve VMEM. - Add support to generate waitcnts for LDS and GDS memory. Differential Revision: https://reviews.llvm.org/D47504 llvm-svn: 334241
*	[Support] Link libzircon.so when building LLVM for Fuchsia	Petr Hosek	2018-06-07	1	-0/+3
	This is necessary for zx_* symbols. Differential Revision: https://reviews.llvm.org/D47848 llvm-svn: 334232
*	Fix unused private variable.	Zachary Turner	2018-06-07	1	-1/+2
	This parameter got lost in the refactor. Add it back. llvm-svn: 334223
*	[InstSimplify] shl nuw C, %x -> C iff signbit is set on C.	Roman Lebedev	2018-06-07	1	-0/+7
	Summary: `%r = shl nuw i8 C, %x` As per langref: "If the nuw keyword is present, then the shift produces a poison value if it shifts out any non-zero bits." Thus, if the sign bit is set on `C`, then `%x` can only be `0`, which means that `%r` can only be `C`. Or in other words, set sign bit means that the signed value is negative, so the constant is `< 0`. https://rise4fun.com/Alive/WMk https://rise4fun.com/Alive/udv Was mentioned in D47428 review. We already handle the `0` constant, https://godbolt.org/g/UZq1sJ, so this only handles negative constants. Could use computeKnownBits() / LazyValueInfo, but the cost-benefit analysis (https://reviews.llvm.org/D47891) suggests it isn't worth it. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47883 llvm-svn: 334222
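	The claim that a set sign bit forces the shift amount to zero can be verified by brute force for i8. This is a plain C++ model with hypothetical helper names, not LLVM code; it only considers in-range shift amounts (0 to 7), since larger amounts are poison regardless of `nuw`.

```cpp
#include <cstdint>

// True when `shl nuw i8 C, Amt` would shift out a non-zero bit, which is the
// poison condition quoted from the LangRef. Amt must be less than 8.
inline bool shlShiftsOutSetBits(uint8_t C, unsigned Amt) {
  return ((unsigned(C) << Amt) >> 8) != 0;
}

// True when every non-zero in-range shift amount is poison for C, so the
// only defined execution has Amt == 0 and the result equals C.
inline bool onlyZeroShiftIsDefined(uint8_t C) {
  for (unsigned Amt = 1; Amt < 8; ++Amt)
    if (!shlShiftsOutSetBits(C, Amt))
      return false;
  return true;
}
```

	Every constant from `0x80` to `0xFF` satisfies `onlyZeroShiftIsDefined`, while constants with a clear sign bit, such as `0x7F` or `0`, do not, which is why the fold is restricted to constants whose sign bit is set.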
*	[FileSystem] Split up the OpenFlags enumeration.	Zachary Turner	2018-06-07	8	-144/+251
	This breaks the OpenFlags enumeration into two separate enumerations: OpenFlags and CreationDisposition. CreationDisposition controls the behavior of the API depending on whether or not the target file already exists, and is not a flags-based enum. OpenFlags controls more flags-like values. This yields an easier to understand API, while also allowing flags to be passed to the openForRead api, where most of the values didn't make sense before. This also makes the apis more testable as it becomes easy to enumerate all the configurations which make sense, so I've added many new tests to exercise all the different values. llvm-svn: 334221
*	DAG: Avoid bitcast/ext/build_vector combine	Matt Arsenault	2018-06-07	1	-1/+4
	This avoids regressions in a future AMDGPU change to make v4i16/v4f16 legal. For these types, build_vector is implemented as bitcasted operations on v2i32. This combine was creating v4i16s out of what would already have been a v2i32 build_vector, creating a mess of nodes that never get cleaned up. I'm not sure this is the right condition to check. I initially tried just checking for the legality of the new build_vector. This works for my case, but breaks dozens of x86 tests. A Mips test seems to show some improvement or at least a neutral change. I don't want to think about how long it would take to analyze the set of different x86 vector operations impacted. Test included in future commit. llvm-svn: 334218
*	[TargetLibraryInfo] add mappings from LLVM sin/cos intrinsics to SVML calls	Sanjay Patel	2018-06-07	1	-0/+16
	These weren't included in D19544 - probably just an oversight. D40044 made it more likely that we'll have LLVM math intrinsics rather than libcalls, so this bug was more easily exposed. As the tests/code show, we already have the complete mappings for pow/exp/log. I don't have any experience with SVML, so I don't know if anything else is missing. It's also not clear to me that we should be doing this transform in IR rather than DAG/isel, but that's a separate issue. Differential Revision: https://reviews.llvm.org/D47610 llvm-svn: 334211
*	[LSR] Check yet more intrinsic pointer operands	Daniil Fukalov	2018-06-07	1	-4/+12
	The patch fixes another assertion in isLegalUse(). Differential Revision: https://reviews.llvm.org/D47794 llvm-svn: 334209
*	[X86][SSE] Updated comment - combineVectorSignBitsTruncation handles PACKSS and PACKUS. NFCI.	Simon Pilgrim	2018-06-07	1	-1/+1
	llvm-svn: 334204
*	[RISCV] AsmParser support for the li pseudo instruction	Alex Bradbury	2018-06-07	3	-14/+168
	The implementation follows the MIPS backend and expands the pseudo instruction directly during asm parsing. As a result, only real MC instructions are emitted to the MCStreamer. The actual expansion to real instructions is similar to the expansion performed by the GNU Assembler. This patch supersedes D41949. Differential Revision: https://reviews.llvm.org/D46118 Patch by Mario Werner. llvm-svn: 334203
*	[AVR] Fix build after r334078	Alex Bradbury	2018-06-07	2	-4/+10
	r334078 added MCSubtargetInfo to fixupNeedsRelaxation and applyFixup. This patch makes the necessary adjustment for the AVR target. llvm-svn: 334202
*	[X86][SSE] Simplify combineVectorTruncationWithPACKUS. NFCI.	Simon Pilgrim	2018-06-07	1	-42/+42
	Move code only used by combineVectorTruncationWithPACKUS out of combineVectorTruncation. llvm-svn: 334201
*	[PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector	Hiroshi Inoue	2018-06-07	1	-0/+14
	BitPermutationSelector sets the Repl32 flag for bit groups which can (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies the lower 32 bits into the upper 32 bits on 64-bit PowerPC before rotation. However, enforcing a 32-bit instruction sometimes results in redundant generated code. For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only an rldimi instruction if the Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF). uint64_t func(uint64_t a, uint64_t b) { return (a & 0xFFFFFFFF) | (b << 32); } To avoid such a problem, this patch checks the potential benefit of the Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from the Repl32 flag on this group. Differential Revision: https://reviews.llvm.org/D47867 llvm-svn: 334195