Commit message, Author, Age, Files, Lines
* Remove unused variable. NFC.Benjamin Kramer2018-11-191-1/+0
| | | | llvm-svn: 347188
* [MSP430] Optimize srl/sra in case of A >> (8 + N)Anton Korobeynikov2018-11-192-2/+37
| | | | | | | | | | | There are no variable-length shifts on MSP430. Therefore "eat" 8 bits of shift via bswap & ext. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54623 llvm-svn: 347187
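The "eat 8 bits via bswap & ext" idea can be illustrated with a scalar model. This is only a sketch of the arithmetic, not the code from the patch; the helper names `bswap16`, `srl_8_plus_n` and `sra_8_plus_n` are made up for illustration, and `n` is assumed to be in the range 0..7.

```cpp
#include <cstdint>

// Swap the two bytes of a 16-bit value, as a byte-swap instruction would.
inline uint16_t bswap16(uint16_t x) {
  return static_cast<uint16_t>((x << 8) | (x >> 8));
}

// Logical shift right by 8 + n: after the swap, the old high byte sits in
// the low byte. Zero-extending that byte is the shift by 8; only n remain.
inline uint16_t srl_8_plus_n(uint16_t x, unsigned n) {
  uint16_t ext = bswap16(x) & 0x00FF;  // zero-extend the low byte
  return static_cast<uint16_t>(ext >> n);
}

// Arithmetic shift right by 8 + n: same idea, but sign-extend the byte.
// Note: the narrowing cast and the right shift of a negative value are
// implementation-defined before C++20; mainstream compilers wrap and
// shift arithmetically, which is what this sketch assumes.
inline int16_t sra_8_plus_n(int16_t x, unsigned n) {
  int16_t ext = static_cast<int8_t>(bswap16(static_cast<uint16_t>(x)) & 0x00FF);
  return static_cast<int16_t>(ext >> n);
}
```

For example, `srl_8_plus_n(0xABCD, 3)` gives the same value as `0xABCD >> 11`, while only a shift by 3 is actually performed after the swap and extend.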
* Fix disturbing warning - NFCISerge Guelton2018-11-191-1/+1
| | | | llvm-svn: 347186
* [X86] Use a pcmpgt with 0 instead of psrad 31, to fill elements with the ↵Craig Topper2018-11-193-18/+18
| | | | | | | | sign bit in v4i32 MULH lowering. The shift requires a copy to avoid clobbering a register. Comparing with 0 uses an xor to produce 0 that will be overwritten with the compare results. So still requires 2 instructions, but should be one byte shorter since it doesn't need to encode an immediate. llvm-svn: 347185
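The equivalence this change relies on (a signed "0 greater than x" compare yields the same all-ones or all-zeros lane as an arithmetic shift right by 31) can be checked with a per-lane scalar model. This is a sketch under stated assumptions: `V4i32` and both function names are hypothetical, and the model says nothing about register allocation or encoding size.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<int32_t, 4>;

// Per-lane model of an arithmetic shift right by 31 (psrad-style).
// Right-shifting a negative value is implementation-defined before C++20;
// this sketch assumes the usual arithmetic shift.
inline V4i32 sign_fill_by_shift(const V4i32 &v) {
  V4i32 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = v[i] >> 31;
  return r;
}

// Per-lane model of a signed greater-than compare of zero against the
// input (pcmpgt-style): a lane becomes all ones when 0 > x, else zero.
inline V4i32 sign_fill_by_compare(const V4i32 &v) {
  V4i32 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = (0 > v[i]) ? -1 : 0;
  return r;
}
```

Both functions return the same vector for every input, which is why the lowering can pick whichever form is cheaper to encode.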
* [LoopSimplifyCFG] Add requires: asserts after rL347183Fangrui Song2018-11-191-0/+1
| | | | llvm-svn: 347184
* [LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switchesMax Kazantsev2018-11-192-9/+361
| | | | | | | | | | | | | | | | This patch introduces infrastructure and the simplest case for constant-folding of branch and switch instructions within a loop into unconditional branches. It is useful as a cleanup for such passes as loop unswitching that sometimes produce such branches. Only the simplest case is supported in this patch: after the folding, no block should become dead or stop being part of the loop. Support for more sophisticated cases will go separately in follow-up patches. Differential Revision: https://reviews.llvm.org/D54021 Reviewed By: anna llvm-svn: 347183
* [ProfileSummary] Standardize methods and fix commentVedant Kumar2018-11-1911-35/+34
| | | | | | | | | | | | | | | | | | | | | Every Analysis pass has a get method that returns a reference to the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument or variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop::getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182
* [X86] Use compare with 0 to fill an element with sign bits when sign ↵Craig Topper2018-11-198-568/+573
| | | | | | | | extending to v2i64 pre-sse4.1 Previously we used an arithmetic shift right by 31, but that requires a copy to preserve the input. So we might as well materialize a zero and compare to it since the comparison will overwrite the register that contains the zeros. This should be one byte shorter. llvm-svn: 347181
* [X86] Remove most of the SEXTLOAD Custom setOperationAction calls under ↵Craig Topper2018-11-193-211/+118
| | | | | | | | -x86-experimental-vector-widening-legalization. Leave just the v4i8->v4i64 and v8i8->v8i64, but only enable them on pre-sse4.1 targets when 64-bit mode is enabled. In those cases we end up creating sext loads that get scalarized to code that looks better than what we get from loading into a vector register and doing a multiple step sign extend using unpacks and shifts. llvm-svn: 347180
* [PowerPC] Set the default PLT mode on OpenBSD/powerpc to Secure PLT.Brad Smith2018-11-193-4/+13
| | | | | | OpenBSD/powerpc only supports Secure PLT. llvm-svn: 347179
* Replace the UTF-8 characters in the error message.Brad Smith2018-11-182-2/+2
| | | | llvm-svn: 347178
* [X86][SSE] Add SimplifyDemandedVectorElts support for SSE packed i2fp ↵Simon Pilgrim2018-11-184-75/+109
| | | | | | conversions. llvm-svn: 347177
* [X86] Add custom type legalization for extending v4i8/v4i16->v4i64.Craig Topper2018-11-182-209/+148
| | | | | | | | Pre-SSE4.1 sext_invec for v2i64 is complicated because we don't have a v2i64 sra instruction. So instead we sign extend to i32 using unpack and sra, then copy the elements and do a v4i32 sra to fill with sign bits, then interleave the i32 sign extend and the sign bits. So really we're doing two sign extends but only using half of the v4i32 intermediate result. When the result is more than 128 bits, default type legalization would prefer to split the destination type all the way down to v2i64 with shuffles followed by v16i8/v8i16->v2i64 sext_inreg operations. This results in more instructions than necessary because we are only utilizing the lower 2 elements of the v4i32 intermediate result. Instead we can custom split a v4i8/v4i16->v4i64 sign_extend. Then we can sign extend v4i8/v4i16->v4i32 invec producing a full v4i32 result. Create the sign bit vector as a v4i32, then split and interleave with the sign bits using a punpckldq and punpckhdq. llvm-svn: 347176
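The three steps described above (extend to a full v4i32, build a v4i32 of sign bits, interleave the two) can be sketched as a scalar model. The type aliases and function names are hypothetical, lanes are assumed to be little-endian, and this is not the DAG lowering code itself.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4i8 = std::array<int8_t, 4>;
using V4i32 = std::array<int32_t, 4>;
using V4i64 = std::array<int64_t, 4>;

// Glue a low and a high 32-bit half into one 64-bit lane, which is what
// interleaving two dword vectors achieves on a little-endian target.
// The final unsigned-to-signed cast is implementation-defined before
// C++20; this sketch assumes two's-complement wraparound.
inline int64_t make_i64(int32_t lo, int32_t hi) {
  uint64_t bits = (static_cast<uint64_t>(static_cast<uint32_t>(hi)) << 32) |
                  static_cast<uint32_t>(lo);
  return static_cast<int64_t>(bits);
}

inline V4i64 sext_v4i8_to_v4i64(const V4i8 &in) {
  V4i32 ext{}, sign{};
  for (std::size_t i = 0; i < in.size(); ++i) {
    ext[i] = in[i];                 // step 1: full v4i32 sign extend
    sign[i] = ext[i] < 0 ? -1 : 0;  // step 2: v4i32 vector of sign bits
  }
  // Step 3: interleave. Lanes 0 and 1 model the low-half unpack
  // (punpckldq), lanes 2 and 3 the high-half unpack (punpckhdq).
  return {make_i64(ext[0], sign[0]), make_i64(ext[1], sign[1]),
          make_i64(ext[2], sign[2]), make_i64(ext[3], sign[3])};
}
```

All four lanes of the v4i32 intermediate are consumed here, which is the saving the commit message describes relative to splitting down to v2i64 first.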
* [X86] Add a 32-bit command line with only sse2 to vector-sext.ll and ↵Craig Topper2018-11-182-2/+2073
| | | | | | | | vector-sext.ll to show some of the scalarized load sequences without 64-bit scalar support. Some of these sequences look pretty bad since we have to copy the sign bit from a 32 bit register to a 64 bit register to finish a sign extend. llvm-svn: 347175
* Revert "Implement basic DidAttach and DidLaunch for DynamicLoaderWindowsDYLD"Zachary Turner2018-11-187-147/+4
| | | | | | | This breaks many tests on Windows, which now all fail with an error such as "Unable to read memory at address <xxxxxxxx>". llvm-svn: 347174
* [X86][SSE] Add SimplifyDemandedVectorElts support for SSE splat-vector-shifts.Simon Pilgrim2018-11-185-33/+59
| | | | | | SSE vector shifts only use the bottom 64-bits of the shift amount vector. llvm-svn: 347173
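A scalar model of a splat-amount dword shift shows why the upper elements of the shift amount vector are never demanded. This is a sketch assuming psrld-like semantics (count taken from the low 64 bits, counts above 31 produce zero); `ShiftAmount`, `V4u32` and `psrld_model` are illustrative names, not LLVM or intrinsic APIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4u32 = std::array<uint32_t, 4>;

// The 128-bit shift-amount operand, viewed as two 64-bit halves.
struct ShiftAmount {
  uint64_t lo;  // the only half the shift reads
  uint64_t hi;  // ignored by the shift
};

// Model of a logical right shift of four 32-bit lanes by one shared count.
inline V4u32 psrld_model(const V4u32 &v, const ShiftAmount &amt) {
  V4u32 r{};  // all lanes zero
  if (amt.lo > 31)
    return r;  // oversized counts clear every lane
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = v[i] >> amt.lo;
  return r;
}
```

Because `hi` never influences the result, whatever computes the upper half of the amount vector can be simplified away.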
* [X86] Disable combineToExtendVectorInReg under ↵Craig Topper2018-11-184-94/+154
| | | | | | | | | | | | | | -x86-experimental-vector-widening-legalization. Add custom type legalization for extends. If we widen illegal types instead of promoting, we should be able to rely on the type legalizer to create the vector_inreg operations for us with some caveats. This patch disables combineToExtendVectorInReg when we are using widening. I've enabled custom legalization for v8i8->v8i64 extends under avx512f since the type legalizer would want to create a vector_inreg with a v64i8 input type which isn't legal without avx512bw. So we go to v16i8 with custom code using the relaxation of rules we get from D54346. I've also enabled custom legalization of v8i64 and v16i32 operations with AVX. When the input type is 128 bits, the default splitting legalization would extend first 128->256, then do a split to two 128-bit pieces. Extend each half to 256 and then concat the result. The custom legalization I've added instead uses a 128->256 bit vector_inreg extend that only reads the lower 64-bits for the low half of the split. Then shuffles the high 64-bits to the low 64-bits and does another vector_inreg extend. llvm-svn: 347172
* [X86] Lower v16i16->v16i8 truncate using an 'and' with 255, an ↵Craig Topper2018-11-1820-814/+698
| | | | | | | | | | | | | | | | extract_subvector, and a packuswb instruction. Summary: This is an improvement over the two pshufbs and punpcklqdq we'd get otherwise. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54671 llvm-svn: 347171
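The reason the 'and' with 255 is needed before packuswb can be shown per lane: the pack saturates signed 16-bit inputs to the unsigned 8-bit range, so masking first turns the saturating pack into a plain truncate. The following is a scalar sketch with hypothetical helper names, not the lowering code.

```cpp
#include <cstdint>

// Per-lane model of packuswb: a signed 16-bit input is saturated to the
// unsigned 8-bit range [0, 255].
inline uint8_t packus_lane(int16_t x) {
  if (x < 0)
    return 0;
  if (x > 255)
    return 255;
  return static_cast<uint8_t>(x);
}

// Truncate a 16-bit lane to its low byte: mask with 255 so the value is
// already in [0, 255], which makes the saturation above a no-op.
inline uint8_t trunc_via_and_pack(uint16_t x) {
  return packus_lane(static_cast<int16_t>(x & 0x00FF));
}
```

Without the mask, `packus_lane(0x1234)` saturates to 255, whereas `trunc_via_and_pack(0x1234)` yields the low byte 0x34.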
* [DAG] add undef simplifications for select nodesSanjay Patel2018-11-185-20/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sadly, this duplicates (twice) the logic from InstSimplify. There might be some way to at least share the DAG versions of the code, but copying the folds seems to be the standard method to ensure that we don't miss these folds. Unlike in IR, we don't run DAGCombiner to fixpoint, so there's no way to ensure that we do these kinds of simplifications unless the code is repeated at node creation time and during combines. There were other tests that would become worthless with this improvement that I changed as pre-commits: rL347161 rL347164 rL347165 rL347166 rL347167 I'm not sure how to salvage the remaining tests (diffs in this patch). So the x86 tests verify that the new code is working as intended. The AMDGPU test is actually similar to my motivating case: we have some undef value that has survived to machine IR in an x86 test, and then it gets folded in some weird way, or we crash if we don't transfer the undef flag. But we would have been better off never getting to that point by doing these simplifications. This will lead back to PR32023 someday... https://bugs.llvm.org/show_bug.cgi?id=32023 llvm-svn: 347170
* Remove unused variable. NFCI.Simon Pilgrim2018-11-181-8/+9
| | | | llvm-svn: 347169
* [X86][SSE] Split IsSplatValue into GetSplatValue and IsSplatVectorSimon Pilgrim2018-11-181-36/+40
| | | | | | | | Refactor towards making this recursive (necessary for PR38243 rotation splat detection). IsSplatVector returns the original vector source of the splat and the splat index. GetSplatValue returns the scalar splatted value as an extraction from IsSplatVector. llvm-svn: 347168
* [x86] regenerate full checks; NFCSanjay Patel2018-11-181-5/+26
| | | | llvm-svn: 347167
* [SystemZ] make test immune to improvements in undef simplificationSanjay Patel2018-11-181-2/+2
| | | | llvm-svn: 347166
* [Hexagon] make tests immune to improvements in undef simplificationSanjay Patel2018-11-183-8/+8
| | | | llvm-svn: 347165
* [ARM] make test immune to improvements in undef simplificationSanjay Patel2018-11-181-2/+2
| | | | llvm-svn: 347164
* Add the abseil-duration-factory-scale check.Aaron Ballman2018-11-188-0/+483
| | | | | | | | This check removes unneeded scaling of arguments when calling Abseil Time factory functions. Patch by Hyrum Wright. llvm-svn: 347163
* [X86][SSE] Relax IsSplatValue - remove the 'variable shift' limit on subtracts.Simon Pilgrim2018-11-184-114/+46
| | | | | | Means we don't use the per-lane-shifts as much when we can cheaply use the older splat-variable-shifts. llvm-svn: 347162
* [x86] make tests immune to improvements in undef handlingSanjay Patel2018-11-182-19/+30
| | | | llvm-svn: 347161
* [SelectionDAG] simplify code; NFCSanjay Patel2018-11-181-6/+5
| | | | llvm-svn: 347160
* [X86][SSE] Add some generic masked gather codegen testsSimon Pilgrim2018-11-181-0/+1156
| | | | llvm-svn: 347159
* [X86][SSE] Use raw shuffle mask decode in ↵Simon Pilgrim2018-11-188-202/+197
| | | | | | | | SimplifyDemandedVectorEltsForTargetNode (PR39549) We were using the 'normalized' shuffle mask from resolveTargetShuffleInputs, which replaces zero/undef inputs with sentinel values. For SimplifyDemandedVectorElts we need the raw mask so we can correctly demand those 'zero' inputs that got normalized away; this requires an extra bit of logic to locally normalize undef inputs. llvm-svn: 347158
* [analyzer][NFC] Move CheckerOptInfo to CheckerRegistry.cpp, and make it localKristof Umann2018-11-184-81/+58
| | | | | | | | | CheckerOptInfo feels very much out of place in CheckerRegistration.cpp, so I moved it to CheckerRegistry.h. Differential Revision: https://reviews.llvm.org/D54397 llvm-svn: 347157
* Swap order of discovering of -ltinfo and -lterminfoKamil Rytarowski2018-11-181-1/+1
| | | | | | | | | | | | | | | | | | Summary: NetBSD ships with native curses(3) and -ltinfo is a part of ncurses. Set -lterminfo before -ltinfo, which prioritizes native curses libraries. Mixing curses and ncurses does not work well, especially in software built on top of llvm. Original patch by Ryo Onodera (NetBSD) in pkgsrc. Reviewers: labath, dim, mgorny Reviewed By: dim, mgorny Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54650 llvm-svn: 347156
* [WebAssembly] Add null streamer supportHeejin Ahn2018-11-183-0/+42
| | | | | | | | | | | | Summary: Now `llc -filetype=null` works. Reviewers: eush Subscribers: dschuff, jgravelle-google, sbc100, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54660 llvm-svn: 347155
* [WebAssembly] Add equality comparison operators for WasmEventTypeHeejin Ahn2018-11-181-0/+8
| | | | | | | | | | | | Summary: This was missing in D54096. Independent tests for this are not available here, because these are used in lld. Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54662 llvm-svn: 347154
* [analyzer][UninitializedObjectChecker] Uninit regions are only reported onceKristof Umann2018-11-184-18/+105
| | | | | | | | | | Especially with pointees, a lot of meaningless reports came from uninitialized regions that were already reported. This is fixed by storing all reported fields to the GDM. Differential Revision: https://reviews.llvm.org/D51531 llvm-svn: 347153
* cmake: z3: Remove EXACT from 4.7.1 after being compatible with 4.8.1Jan Kratochvil2018-11-181-1/+1
| | | | | | | | | | After check-in of D54391 a comment there by @mikhail.ramalho says: Since we're supporting version 4.8.1 now, the cmake file should be changed to "minimum" instead of "exact". Differential Revision: https://reviews.llvm.org/D54535 llvm-svn: 347152
* [X86] Add -x86-experimental-vector-widening-legalization check to ↵Craig Topper2018-11-181-2/+5
| | | | | | | | combineSelect and combineSetCC to cover vXi16/vXi8 promotion without BWI. I don't yet have any test cases for this, but it's the right thing to do based on log file inspection. llvm-svn: 347151
* [X86] Rename WidenMaskArithmetic->PromoteMaskArithmetic since we usually use ↵Craig Topper2018-11-181-4/+4
| | | | | | widen to refer to adding elements not making elements larger. NFC llvm-svn: 347150
* [X86] Don't use a pmaddwd for vXi32 multiply if the inputs are zero extends ↵Craig Topper2018-11-183-48/+52
| | | | | | | | from i8 or smaller without SSE4.1. Prefer to shrink the mul instead. The zero extend will require two stages of unpacks to implement. So it's better to shrink the multiply using pmullw and then extend that result back to v4i32 using a single unpack. llvm-svn: 347149
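Shrinking the multiply is safe because the product of two values zero-extended from i8 never exceeds 255 * 255 = 65025, which fits in 16 unsigned bits. A per-lane scalar sketch follows; the function names are hypothetical and only the arithmetic is modeled.

```cpp
#include <cstdint>

// Reference: multiply after zero-extending both inputs to 32 bits.
inline uint32_t mul_wide(uint8_t a, uint8_t b) {
  return static_cast<uint32_t>(a) * static_cast<uint32_t>(b);
}

// Shrunk form: do a 16-bit multiply that keeps only the low 16 bits of
// the product (as pmullw does), then zero-extend the result to 32 bits.
inline uint32_t mul_shrunk(uint8_t a, uint8_t b) {
  uint16_t lo = static_cast<uint16_t>(static_cast<uint16_t>(a) *
                                      static_cast<uint16_t>(b));
  return lo;  // implicit zero extension models the single unpack
}
```

Since no product bit is lost in the 16-bit multiply, the two functions agree on every pair of inputs.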
* tighten up a couple of assertions. hitting the BitPosition == BitWidth case ↵John Regehr2018-11-181-2/+2
| | | | | | that was previously not caught resulted in nasty corruption of APInts that (on my system at least) could not be detected using UBSan, ASan, or Valgrind. this patch does not cause any extra failures in a check-all nor does it interfere with bootstrapping. David Blaikie informally approved this change. llvm-svn: 347148
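The kind of bounds check being tightened can be sketched on a toy fixed-width bit container. This is not APInt and not the patched code: `SmallBits` and `set_one_bit` are hypothetical, and the only point is that the valid range for a bit position is strictly less than the width, so an assertion written with `<=` would let the `BitPosition == BitWidth` case through.

```cpp
#include <cassert>
#include <cstdint>

// Toy bit container of 1 to 64 bits, standing in for a real
// arbitrary-precision integer class.
class SmallBits {
  unsigned BitWidth;
  uint64_t Val = 0;

public:
  explicit SmallBits(unsigned Width) : BitWidth(Width) {
    assert(Width >= 1 && Width <= 64 && "unsupported width");
  }

  void setBit(unsigned BitPosition) {
    // Strict comparison: BitPosition == BitWidth is already out of range.
    assert(BitPosition < BitWidth && "BitPosition out of range");
    Val |= uint64_t{1} << BitPosition;
  }

  uint64_t value() const { return Val; }
};

// Convenience wrapper: a Width-bit value with only bit Pos set.
inline uint64_t set_one_bit(unsigned Width, unsigned Pos) {
  SmallBits B(Width);
  B.setBit(Pos);
  return B.value();
}
```

With assertions enabled, `set_one_bit(8, 8)` aborts instead of silently writing past the logical width, while `set_one_bit(8, 7)` returns 0x80.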
* [CorrelatedValuePropagation] Preserve debug locations (PR38178)Vedant Kumar2018-11-184-15/+34
| | | | | | | | | Fix all of the missing debug location errors in CVP found by debugify. This includes the missing-location-after-udiv-truncation case described in llvm.org/PR38178. llvm-svn: 347147
* Fix bot failure from r347145Teresa Johnson2018-11-171-8/+7
| | | | | | | | The #if check around the statistics computation gave an error about the statistic being an unused variable. Instead, guard with AreStatisticsEnabled(). llvm-svn: 347146
* [ThinLTO] Add some stats for read only variable internalizationTeresa Johnson2018-11-172-1/+24
| | | | | | | | | | | | | | | Summary: Follow up to D49362 ([ThinLTO] Internalize read only globals). Add a statistic on the number of read only variables (only counting live variables since dead variables will be dropped anyway). Reviewers: evgeny777 Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54642 llvm-svn: 347145
* [Clang] Add options -fprofile-filter-files and -fprofile-exclude-files to ↵Calixte Denizet2018-11-1710-0/+182
| | | | | | | | | | | | | | | | filter the files to instrument with gcov (after revert https://reviews.llvm.org/rL346659) Summary: The previous patch (https://reviews.llvm.org/rC346642) was reverted because of a test failure on Windows. This patch fixes the test cfe/trunk/test/CodeGen/code-coverage-filter.c. Reviewers: marco-c Reviewed By: marco-c Subscribers: cfe-commits, sylvestre.ledru Differential Revision: https://reviews.llvm.org/D54600 llvm-svn: 347144
* [X86] Add support for matching PACKUSWB from a v64i8 shuffle.Craig Topper2018-11-172-8/+8
| | | | llvm-svn: 347143
* [X86] Add test case to show missed opportunity to use PACKUSWB in v64i8 ↵Craig Topper2018-11-171-0/+47
| | | | | | shuffle lowering. llvm-svn: 347142
* Sink BuryPointer from Clang into LLVM for reuse thereDavid Blaikie2018-11-178-35/+18
| | | | llvm-svn: 347141
* Move BuryPointer from Clang to LLVM for use in other LLVM toolsDavid Blaikie2018-11-173-0/+62
| | | | | | | Specifically planning to use this in llvm-symbolizer to remove the cost of cleanup there. llvm-svn: 347140
* [X86][SSE] Add shuffle demanded elts test case for PR39549Simon Pilgrim2018-11-171-0/+22
| | | | llvm-svn: 347139