summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [GISel][NFC]: Move getOpcodeDef from the LegalizationArtifactCombiner into ↵Aditya Nandakumar2017-11-151-0/+16
| | | | | | GlobalISel/Utils for use elsewhere llvm-svn: 318350
* AMDGPU: Replace i64 add/sub loweringMatt Arsenault2017-11-155-9/+146
| | | | | | | | | | | | | | | Use VOP3 add/addc like usual. This has some tradeoffs. Inline immediates fold a little better, but other constants are worse off. SIShrinkInstructions could be made smarter to handle these cases. This allows us to avoid selecting scalar adds where we need to track the carry in scc and replace its users. This makes it easier to use the carryless VALU adds. llvm-svn: 318340
* [AArch64] Refactor the loads and stores optimizerEvandro Menezes2017-11-151-143/+143
| | | | | | | | | Move remaining inline matching of instructions of some optimizations into separate functions, like in the other optimizations. Otherwise, NFC. Differential revision: https://reviews.llvm.org/D40090 llvm-svn: 318335
* [X86] Add some explanatory comments to the ProcessorFeatures enum in Host.cpp.Craig Topper2017-11-151-1/+4
| | | | llvm-svn: 318331
* [X86] Add a return to the end of a switch to prevent an accidental ↵Craig Topper2017-11-151-0/+1
| | | | | | fallthrough in the future. llvm-svn: 318330
* [InstCombine] trunc (binop X, C) --> binop (trunc X, C')Sanjay Patel2017-11-151-4/+17
| | | | | | | | | Note that one-use and shouldChangeType() are checked ahead of the switch. Without the narrowing folds, we can produce inferior vector code as shown in PR35299: https://bugs.llvm.org/show_bug.cgi?id=35299 llvm-svn: 318323
* Use TempFile in lto caching.Rafael Espindola2017-11-152-30/+46
| | | | | | | | | | This requires a small change to TempFile: allowing a discard after a failed keep. With this the cache now handles signals and reuses a fd instead of reopening the file. llvm-svn: 318322
* [PowerPC] Implement mayBeEmittedAsTailCall for PPCSean Fertile2017-11-152-0/+39
| | | | | | | | | Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables CodeGenPrepare to duplicate returns when they might enable a tail-call. Differential Revision: https://reviews.llvm.org/D39777 llvm-svn: 318321
* [InstCombine] Salvage debug info during initial DCEReid Kleckner2017-11-151-0/+1
| | | | | | | | | | | InstCombine salvages debug info for every instruction it erases from its worklist, but it wasn't doing it during its initial DCE when populating its worklist. This fixes that. This should help improve availability of 'this' in optimized debug info when casts are necessary. llvm-svn: 318320
* [AArch64] Adjust the cost model for Exynos M1 and M2Evandro Menezes2017-11-151-17/+14
| | | | | | | Fix the modeling of loads and stores using the pre or post indexed addressing modes. llvm-svn: 318312
* [X86] Add CBW/CDQ/CDQE/CQO/CWD/CWDE to WriteALU schedule classSimon Pilgrim2017-11-152-32/+31
| | | | | | | | Some CPUs are already overriding these sign extension instructions but we should be able to use the WriteALU schedule class by default. Differential Revision: https://reviews.llvm.org/D39899 llvm-svn: 318308
* [SLP] Added more missed optimization remarksAdam Nemet2017-11-151-14/+74
| | | | | | | | | | | | | | | | | | | | Summary: Added more remarks to SLP pass, in particular "missed" optimization remarks. Also proposed several tests for new functionality. Patch by Vladimir Miloserdov! For reference you may look at: https://reviews.llvm.org/rL302811 Reviewers: anemet, fhahn Reviewed By: anemet Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits Differential Revision: https://reviews.llvm.org/D38367 llvm-svn: 318307
* [PowerPC] Split out the tailcall calling convention checks. NFC.Sean Fertile2017-11-151-11/+19
| | | | | | | | Move the calling convention checks for tail-call eligibility for the 64-bit SysV ABI into a separate function. This is so that it can be shared with 'mayBeEmittedAsTailCall' in a subsequent change. llvm-svn: 318305
* [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PMSanjay Patel2017-11-151-2/+7
| | | | | | | | | | | | | | | | | | This is a recommit of r316869 which was speculatively reverted with r317444 and subsequently shown to not be the cause of PR35210. That crash should be fixed after r318237. Original commit message: The old PM sets the options of what used to be known as "latesimplifycfg" on the instantiation after the vectorizers have run, so that's what we'redoing here. FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not set the "late" options. I'm not sure if that's intentional or not. Differential Revision: https://reviews.llvm.org/D39407 llvm-svn: 318299
* [Reassociate] simplify code; NFCISanjay Patel2017-11-151-6/+3
| | | | llvm-svn: 318298
* [AArch64][SVE] Asm: Report SVE parsing diagnostics only onceSander de Smalen2017-11-151-25/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Prevent an issue where a diagnostic is reported multiple times by bailing out with a ParseFail if an invalid SVE register element qualifier/suffix is specified, for example: <stdin>:10:18: error: invalid sve vector kind qualifier add z20.h, z2.h, z31.x ^ <stdin>:10:18: error: invalid sve vector kind qualifier add z20.h, z2.h, z31.x ... <stdin>:10:18: error: invalid sve vector kind qualifier add z20.h, z2.h, z31.x ^ Reviewers: fhahn, rengolin Reviewed By: rengolin Subscribers: aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D39894 llvm-svn: 318297
* [mips] Improve genConstMult() to work with arbitrary precisionPetar Jovanovic2017-11-151-11/+9
| | | | | | | | | | | APInt is now used instead of uint64_t in function genConstMult() allowing multiplication optimizations with constants of arbitrary length. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D38130 llvm-svn: 318296
* [ARM] Split Arm jump table branch into i12 and rs suffixed versionsMomchil Velikov2017-11-155-210/+33
| | | | | | | | | This is a refactoring/cleanup of Arm `addrmode2` operand class. The patch removes it completely. Differential Revision: https://reviews.llvm.org/D39832 llvm-svn: 318291
* [DebugInfo] Fix potential CU mismatch for SubprogramScopeDIEs.Jonas Devlieghere2017-11-152-8/+17
| | | | | | | | | | | | | | | | | In constructAbstractSubprogramScopeDIE there can be a potential mismatch between `this` and the CU of ContextDIE when a scope is shared between two DISubprograms belonging to a different CU. In that case, `this` is the CU that was specified in the IR, but the CU of ContextDIE is that of the first subprogram that was emitted. This patch fixes the mismatch by looking up the CU of ContextDIE, and switching to use that. This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212) Patch by Philip Craig! Differential revision: https://reviews.llvm.org/D39981 llvm-svn: 318289
* [Lint] Don't warn about passing alloca'd value to tail call if using byvalMikael Holmen2017-11-151-8/+17
| | | | | | | | | | | | | | | | | | | Summary: This fixes PR35241. When using byval, the data is effectively copied as part of the call anyway, so the pointer returned by the alloca will not be leaked to the callee and thus there is no reason to issue a warning. Reviewers: rnk Reviewed By: rnk Subscribers: Ka-Ka, llvm-commits Differential Revision: https://reviews.llvm.org/D40009 llvm-svn: 318279
* [X86] Redefine the 128-bit version of VPGATHERQD and VGATHERQPS to use a VK2 ↵Craig Topper2017-11-153-14/+24
| | | | | | | | | | mask instead of a VK4 mask. This allows us to remove extra extend creation during lowering and more accurately reflects the semantics of the instruction. While there add an extra output VT to X86 masked gather node to better match the isel pattern predicate. Currently we're exploiting the fact that the isel table doesn't count how many output results a node actually has if the result type of any can be inferred from the first result and the type constraints defined in tablegen. I think we might ultimately want to lower all MGATHER/MSCATTER to an X86ISD node with the extra mask result and stop relying on this hole in the isel checking. llvm-svn: 318278
* NFC Remove default argument of DataLayout::getPointerABIAlignmentFangrui Song2017-11-153-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D40005 llvm-svn: 318272
* [X86] Add getHostCPUName support for the Gemini Lake model number which also ↵Craig Topper2017-11-151-2/+3
| | | | | | uses Goldmont. llvm-svn: 318271
* [X86] Add getHostCPUName support for cannonlake.Craig Topper2017-11-151-7/+21
| | | | | | This adds an explicit model number check and fallback path to the unknown family 6 detection. llvm-svn: 318270
* [InstCombine] Simplify binops that are only used by a select and are fed by ↵Craig Topper2017-11-151-0/+40
| | | | | | | | | | | | | | | | | | | a select with the same condition. Summary: This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select. As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero? Reviewers: spatel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39999 llvm-svn: 318267
* [PowerPC] fix up in redundant compare eliminationHiroshi Inoue2017-11-151-2/+6
| | | | | | This patch fixes a potential problem in my previous commit (https://reviews.llvm.org/rL312514) by introducing an additional check. llvm-svn: 318266
* AMDGPU: Add separate definitions for DS insts without m0 useMatt Arsenault2017-11-151-154/+207
| | | | llvm-svn: 318246
* AMDGPU: Don't use MUBUF vaddr if address may overflowMatt Arsenault2017-11-156-2/+60
| | | | | | | Effectively revert r263964. Before we would not allow this if vaddr was not known to be positive. llvm-svn: 318240
* Revert r318193 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' ↵Hans Wennborg2017-11-151-318/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | elements in integer binary ops." It crashes building sqlite; see reply on the llvm-commits thread. > [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. > > Patch tries to improve vectorization of the following code: > > void add1(int * __restrict dst, const int * __restrict src) { > *dst++ = *src++; > *dst++ = *src++ + 1; > *dst++ = *src++ + 2; > *dst++ = *src++ + 3; > } > Allows to vectorize even if the very first operation is not a binary add, but just a load. > > Fixed issues related to previous commit. > > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev > > Reviewed By: ABataev, RKSimon > > Subscribers: llvm-commits, RKSimon > > Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 318239
* [LoopRotate] processLoop should return true even if it just simplified the ↵Craig Topper2017-11-151-1/+1
| | | | | | | | | | | | | | loop latch without making any other changes Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened. PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch. Fixes PR35210. Differential Revision: https://reviews.llvm.org/D40035 llvm-svn: 318237
* [asan] Prevent rematerialization of &__asan_shadow.Evgeniy Stepanov2017-11-151-12/+30
| | | | | | | | | | | | | | | | | | | | Summary: In the mode when ASan shadow base is computed as the address of an external global (__asan_shadow, currently on android/arm32 only), regalloc prefers to rematerialize this value to save register spills. Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant pool entry). This changes adds an inline asm in the function prologue to suppress this behavior. It reduces AsanTest binary size by 7%. Reviewers: pcc, vitalybuka Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40048 llvm-svn: 318235
* AMDGPU: Handle or in multi-use shl ptr combineMatt Arsenault2017-11-141-2/+2
| | | | llvm-svn: 318223
* [EntryExitInstrumenter] Placate GCC, the semicolon is redundant. NFCI.Davide Italiano2017-11-141-3/+2
| | | | llvm-svn: 318217
* [Reassociate] use dyn_cast instead of isa+cast; NFCISanjay Patel2017-11-141-9/+9
| | | | llvm-svn: 318212
* [GISel]: Rework legalization algorithm for better elimination ofAditya Nandakumar2017-11-141-101/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | artifacts along with DCE Legalization Artifacts are all those insts that are there to make the type system happy. Currently, the target needs to say all combinations of extends and truncs are legal and there's no way of verifying that post legalization, we only have *truly* legal instructions. This patch changes roughly the legalization algorithm to process all illegal insts at one go, and then process all truncs/extends that were added to satisfy the type constraints separately trying to combine trivial cases until they converge. This has the added benefit that, the target legalizerinfo can only say which truncs and extends are okay and the artifact combiner would combine away other exts and truncs. Updated legalization algorithm to roughly the following pseudo code. WorkList Insts, Artifacts; collect_all_insts_and_artifacts(Insts, Artifacts); do { for (Inst in Insts) legalizeInstrStep(Inst, Insts, Artifacts); for (Artifact in Artifacts) tryCombineArtifact(Artifact, Insts, Artifacts); } while(!Insts.empty()); Also, wrote a simple wrapper equivalent to SetVector, except for erasing, it avoids moving all elements over by one and instead just nulls them out. llvm-svn: 318210
* Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."Simon Dardis2017-11-147-1/+375
| | | | | | | | | | | | | | | | | | | | This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win buildbot. Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 llvm-svn: 318207
* Make salvageDebugInfo of casts work for dbg.declare and dbg.addrReid Kleckner2017-11-141-6/+16
| | | | | | | | | | | | | | | | | | | | | Summary: Instcombine (and probably other passes) sometimes want to change the type of an alloca. To do this, they generally create a new alloca with the desired type, create a bitcast to make the new pointer type match the old pointer type, replace all uses with the cast, and then simplify the casts. We already knew how to salvage dbg.value instructions when removing casts, but we can extend it to cover dbg.addr and dbg.declare. Fixes a debug info quality issue uncovered in Chromium in http://crbug.com/784609 Reviewers: aprantl, vsk Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40042 llvm-svn: 318203
* [CodeGen] Peel off the dominant case in switch statement in loweringRong Xu2017-11-142-2/+92
| | | | | | | | | | This patch peels off the top case in switch statement into a branch if the probability exceeds a threshold. This will help the branch prediction and avoids the extra compares when lowering into chain of branches. Differential Revision: http://reviews.llvm.org/D39262 llvm-svn: 318202
* Fix unused variable warning.Richard Smith2017-11-141-1/+0
| | | | llvm-svn: 318201
* Rename CountingFunctionInserter and use for both mcount and cygprofile ↵Hans Wennborg2017-11-1410-62/+157
| | | | | | | | | | | | | | | | | | | | | | calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195
* [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in ↵Dinar Temirbulatov2017-11-141-140/+318
| | | | | | | | | | | | | | | | | | | | | | | | | | | integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 318193
* AMDGPU: Error on stack size overflowMatt Arsenault2017-11-142-6/+12
| | | | llvm-svn: 318189
* [SystemZ] Do not crash when selecting an OR of two constantsUlrich Weigand2017-11-141-2/+4
| | | | | | | | | | | In rare cases, common code will attempt to select an OR of two constants. This confuses the logic in splitLargeImmediate, causing an internal error during isel. Fixed by simply leaving this case to common code to handle. This fixes PR34859. llvm-svn: 318187
* [AArch64] Adjust the cost model for Exynos M1 and M2Evandro Menezes2017-11-141-11/+9
| | | | | | Fix the modeling of loads and stores of registers pairs. llvm-svn: 318186
* [ARM, AArch64] Fix an assert message, Darwin isn't the only target ↵Martin Storsjo2017-11-142-2/+4
| | | | | | supporting TLS. NFC. llvm-svn: 318184
* [CodeGenPrepare] Disable div bypass when working set size is huge.Easwaran Raman2017-11-141-3/+4
| | | | | | | | | | | | | | | | | | | Summary: Bypass of slow divs based on operand values is currently disabled for -Os. Do the same when profile summary is available and the working set size of the application is huge. This is similar to how loop peeling is guarded by hasHugeWorkingSetSize. In the div bypass case, the generated extra code (and the extra branch) tendss to outweigh the benefits of the bypass. This results in noticeable performance improvement on an internal application. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39992 llvm-svn: 318179
* [SystemZ] Fix invalid codegen using RISBMux on out-of-range bitsUlrich Weigand2017-11-141-1/+9
| | | | | | | | Before using the 32-bit RISBMux set of instructions we need to verify that the input bits are actually within range of the 32-bit instruction. This fixer PR35289. llvm-svn: 318177
* Mark intrinsics operating on the whole warp as IntrInaccessibleMemOnlyArtem Belevich2017-11-142-10/+21
| | | | | | | It's needed to model the fact that they do access data from other threads in a warp and thus can't be CSE'd. llvm-svn: 318173
* CodeGen: Fix TargetLowering::LowerCallTo for sret value typeYaxun Liu2017-11-141-1/+1
| | | | | | | | | | | | | | TargetLowering::LowerCallTo assumes that sret value type corresponds to a pointer in default address space, which is incorrect, since sret value type should correspond to a pointer in alloca address space, which may not be the default address space. This causes assertion for amdgcn target in amdgiz environment. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39996 llvm-svn: 318167
* [PredicateInfo] Stable sort ValueDFS to remove non-deterministic orderingMandeep Singh Grang2017-11-141-1/+1
| | | | | | | | | | | | | | Summary: This fixes failure in Transforms/Util/PredicateInfo/testandor.ll uncovered by D39245. Reviewers: dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39630 llvm-svn: 318165
OpenPOWER on IntegriCloud