path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Revert [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC] | Florian Hahn | 2019-06-14 | 1 | -9/+9
    Reverting because it depends on r363289, which breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
    This reverts r363292 (git commit 42a3fc133d3544b5c0c032fe99c6e8a469a836c2)
    llvm-svn: 363426
* Revert [LFTR] Rename variable to minimize confusion [NFC] | Florian Hahn | 2019-06-14 | 1 | -15/+18
    Reverting because it depends on r363289, which breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
    This reverts r363293 (git commit c37be29634214fb1cb4c823840bffc31e5ebfe40)
    llvm-svn: 363425
* [PowerPC][NFC] Format comments in P9InstrResources.td | Jinsong Ji | 2019-06-14 | 1 | -80/+77
    llvm-svn: 363423
* [SimplifyCFG] NFC intended, remove GCD that was only used for powers of two | Shawn Landden | 2019-06-14 | 1 | -13/+11
    and replace with an equivalent countTrailingZeros.
    GCD is much more expensive than this, with repeated division.
    This depends on D60823
    Differential Revision: https://reviews.llvm.org/D61151
    llvm-svn: 363422
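The identity behind this change can be sketched in a few lines. This is an illustrative Python model, not the code from the patch: when both inputs are powers of two, their GCD is the smaller of the two, so it can be recovered from trailing-zero counts without any division. The helper names `count_trailing_zeros` and `gcd_of_powers_of_two` are hypothetical.

```python
from math import gcd


def count_trailing_zeros(x: int) -> int:
    """Number of trailing zero bits in a positive integer."""
    if x <= 0:
        raise ValueError("expected a positive integer")
    # x & -x isolates the lowest set bit; its bit length minus one is the count.
    return (x & -x).bit_length() - 1


def gcd_of_powers_of_two(a: int, b: int) -> int:
    """GCD of two powers of two, computed without division (hypothetical helper)."""
    return 1 << min(count_trailing_zeros(a), count_trailing_zeros(b))


# Cross-check against the general-purpose gcd for a range of powers of two.
for i in range(16):
    for j in range(16):
        assert gcd_of_powers_of_two(1 << i, 1 << j) == gcd(1 << i, 1 << j)
```

The shortcut only holds under the stated assumption that both inputs are powers of two, which is the case the commit message describes.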
* [AMDGPU] Don't constrain callees with inlinehint from inlining on MaxBB check | Valery Pykhtin | 2019-06-14 | 1 | -1/+1
    Summary: Function bodies marked inline in an OpenCL source are eliminated, but the MaxBB check may prevent inlining them, leaving undefined references.
    Reviewers: rampitec, arsenm
    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, Anastasia, t-tye, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63337
    llvm-svn: 363418
* [FPEnv] Lower STRICT_FP_EXTEND and STRICT_FP_ROUND nodes in preprocess phase of ISelLowering to mirror non-strict nodes on x86. | Kevin P. Neal | 2019-06-14 | 1 | -45/+110
    I recently discovered a bug on the x86 platform: The fp80 type was not handled well by x86 for constrained floating point nodes, as their regular counterparts are replaced by extending loads and truncating stores during the preprocess phase. Normally, platforms don't have this issue, as they don't typically attempt to perform such legalizations during instruction selection preprocessing. Before this change, strict_fp nodes survived until they were mutated to normal nodes, which happened shortly after preprocessing on other platforms. This modification lowers these nodes at the same phase while properly utilizing the chain.
    Submitted by: Drew Wock <drew.wock@sas.com>
    Reviewed by: Craig Topper, Kevin P. Neal
    Approved by: Craig Topper
    Differential Revision: https://reviews.llvm.org/D63271
    llvm-svn: 363417
* [AMDGPU] gfx1010 BoolReg definition. NFC. | Stanislav Mekhanoshin | 2019-06-14 | 2 | -0/+32
    An earlier commit added AMDGPUOperand::isBoolReg(). It turns out gcc issues a warning about an unused function since D63204 is not yet submitted. Added the NFC part of D63204 to have a use of that function and mute the warning.
    llvm-svn: 363416
* Reland: [Remarks] Refactor optimization remarks setup | Francis Visoiu Mistrih | 2019-06-14 | 5 | -43/+69
    * Add a common function to set up opt-remarks
    * Rename common options to the same names
    * Add error types to distinguish between file errors and regex errors
    llvm-svn: 363415
* GlobalISel: Avoid producing Illegal copies in RegBankSelect | Matt Arsenault | 2019-06-14 | 2 | -7/+96
    Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR.
    The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check.
    For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region.
    llvm-svn: 363410
* [CodeGenPrepare] propagate debuginfo when copying a shuffle | Sanjay Patel | 2019-06-14 | 1 | -0/+1
    llvm-svn: 363409
* [Attributor] Disable the Attributor by default and fix a comment | Johannes Doerfert | 2019-06-14 | 1 | -1/+1
    llvm-svn: 363408
* AMDGPU: Fold readlane intrinsics of constants | Matt Arsenault | 2019-06-14 | 1 | -0/+7
    I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet.
    llvm-svn: 363406
* [ARM] Add MVE horizontal accumulation instructions | Mikhail Maltsev | 2019-06-14 | 2 | -0/+318
    This is the family of vector instructions that combine all the lanes in their input vector(s), and output a value in one or two GPRs.
    Differential Revision: https://reviews.llvm.org/D62670
    llvm-svn: 363403
* Fix failing test on ARM buildbot | Eugene Leviant | 2019-06-14 | 1 | -1/+1
    r363261 caused a test failure on the 32-bit ARM buildbot because of unsigned integer overflow. This patch fixes it by changing the offset type from size_t to uint64_t.
    llvm-svn: 363393
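The overflow described here can be modelled with explicit bit masks. The sketch below is a Python illustration rather than the patched code, and the offset and size values are made up for the example. It assumes size_t is 32 bits wide on the 32-bit ARM builder, so an offset computation that fits in uint64_t wraps around when done in size_t.

```python
def add_wrapping(a: int, b: int, bits: int) -> int:
    """Add two unsigned integers, wrapping at the given width like C unsigned types."""
    return (a + b) & ((1 << bits) - 1)


# Hypothetical values chosen so that the sum exceeds 2**32.
offset = 0xFFFF_FFF0
size = 0x20

as_size_t_32 = add_wrapping(offset, size, 32)   # models size_t on a 32-bit host
as_uint64 = add_wrapping(offset, size, 64)      # models uint64_t

assert as_size_t_32 == 0x10            # wrapped: high bit lost
assert as_uint64 == 0x1_0000_0010      # correct result
```

On a 64-bit host both types are 64 bits wide, which would explain why such a bug shows up only on a 32-bit buildbot.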
* RegBankSelect: Remove checks for invalid mappings | Matt Arsenault | 2019-06-14 | 1 | -5/+2
    Avoid a check for valid and a set of redundant asserts. The place InstructionMapping is constructed asserts all of the default fields are passed anyway for an invalid mapping, so don't overcomplicate this.
    llvm-svn: 363391
* AMDGPU: Fix input chain when gluing copies to m0 | Matt Arsenault | 2019-06-14 | 1 | -2/+5
    I don't think this was causing any observable issues, but was making reading the DAG dump confusing.
    llvm-svn: 363389
* [MCA] Ignore invalid processor resource writes of zero cycles. NFCI | Andrea Di Biagio | 2019-06-14 | 1 | -2/+14
    In debug mode, the tool also raises a warning and prints out a message which helps identify the problematic MCWriteProcResEntry from the scheduling class.
    This message would have been useful to have when triaging PR42282.
    llvm-svn: 363387
* Fix not calling TargetCustom PSVs printer | Matt Arsenault | 2019-06-14 | 1 | -1/+1
    If the enum value was greater than the starting target custom value, the custom printer wasn't called.
    llvm-svn: 363386
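One plausible shape for this kind of bug is an equality test where a range test is needed. The Python sketch below is a guess at the pattern, not the actual LLVM code: the constant `TARGET_CUSTOM` and both function names are hypothetical, and the value 7 is chosen only for illustration. It assumes that every kind at or above the starting custom value belongs to the target.

```python
TARGET_CUSTOM = 7  # hypothetical first value reserved for target-defined kinds


def uses_custom_printer_buggy(kind: int) -> bool:
    """Matches only the first custom value, so later custom kinds are missed."""
    return kind == TARGET_CUSTOM


def uses_custom_printer_fixed(kind: int) -> bool:
    """Treats every value from TARGET_CUSTOM upward as target-defined."""
    return kind >= TARGET_CUSTOM


assert uses_custom_printer_buggy(TARGET_CUSTOM)
assert not uses_custom_printer_buggy(TARGET_CUSTOM + 2)   # the reported symptom
assert uses_custom_printer_fixed(TARGET_CUSTOM + 2)
assert not uses_custom_printer_fixed(TARGET_CUSTOM - 1)
```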
* AMDGPU: Refactor to prepare for manually selecting more intrinsics | Matt Arsenault | 2019-06-14 | 1 | -9/+19
    llvm-svn: 363385
* AMDGPU: Fix printing trailing whitespace after s_endpgm | Matt Arsenault | 2019-06-14 | 2 | -2/+2
    llvm-svn: 363384
* AMDGPU: Fix missing const | Matt Arsenault | 2019-06-14 | 2 | -2/+2
    llvm-svn: 363383
* [ARM] MVE VPT Block Pass | Sjoerd Meijer | 2019-06-14 | 5 | -0/+155
    Initial commit of a new pass to create vector predication blocks, called VPT blocks, that are supported by the Armv8.1-M MVE architecture.
    This is a first naive implementation. I.e., for 2 consecutive predicated instructions I1 and I2, for example, it will generate 2 VPT blocks:
        VPST
        I1
        VPST
        I2
    A more optimal implementation would obviously put instructions in the same VPT block when they are predicated on the same condition and when it is allowed to do this:
        VPTT
        I1
        I2
    We will address this optimisation with follow up patches when the groundwork is in. Creating VPT Blocks is very similar to IT Blocks, which is the reason I added this to Thumb2ITBlocks.cpp. This allows reuse of the def use analysis that we need for the more optimal implementation.
    VPT blocks cannot be nested in IT blocks, and vice versa, and so these 2 passes cannot interact with each other. Instructions allowed in VPT blocks must be MVE instructions that are marked as VPT compatible.
    Differential Revision: https://reviews.llvm.org/D63247
    llvm-svn: 363370
* [yaml2obj] - Allow setting custom Flags for implicit sections. | George Rimar | 2019-06-14 | 1 | -1/+1
    With this patch we get the ability to set any flags we want for implicit sections defined in YAML.
    Differential revision: https://reviews.llvm.org/D63136
    llvm-svn: 363367
* [SCEV] Pass NoWrapFlags when expanding an AddExpr | Sam Parker | 2019-06-14 | 1 | -1/+1
    InsertBinop now accepts NoWrapFlags, so pass them through when expanding a simple add expression.
    This is the first re-commit of the functional changes from rL362687, which was previously reverted.
    Differential Revision: https://reviews.llvm.org/D61934
    llvm-svn: 363364
* Move commentary on opcode translation for code16 mov instructions | Eric Christopher | 2019-06-14 | 1 | -2/+2
    to segment registers closer to the segment register check for when we add further optimizations.
    llvm-svn: 363355
* DebugInfo: Include enumerators in pubnames | David Blaikie | 2019-06-14 | 1 | -0/+5
    This is consistent with GCC's behavior (which is the de facto standard for pubnames). Though I find the presence of enumerators from enum classes to be a bit confusing, possibly a bug on GCC's end (since they can't be named unqualified, unlike the other names - and names nested in classes don't go in pubnames, for instance - presumably because one must name the class first & that's enough to limit the scope of the search)
    llvm-svn: 363349
* [AMDGPU] gfx1011/gfx1012 targets | Stanislav Mekhanoshin | 2019-06-14 | 9 | -1/+154
    Differential Revision: https://reviews.llvm.org/D63307
    llvm-svn: 363344
* Revert "[Remarks] Refactor optimization remarks setup" | Francis Visoiu Mistrih | 2019-06-14 | 5 | -69/+43
    This reverts commit 6e6e3af55bb97e1a4c97375c15a2b0099120c5a7.
    This breaks greendragon.
    llvm-svn: 363343
* [Coverage] Speculative fix for r363325 for an older compiler | Vedant Kumar | 2019-06-14 | 1 | -4/+4
    It looks like an older version of gcc can't figure out that it needs to move a unique_ptr while implicitly constructing an Expected object.
    llvm-svn: 363342
* [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32 | Stanislav Mekhanoshin | 2019-06-13 | 6 | -25/+69
    Differential Revision: https://reviews.llvm.org/D63301
    llvm-svn: 363339
* Use fully qualified name when printing S_CONSTANT records | Amy Huang | 2019-06-13 | 1 | -4/+5
    Summary: Previously, the fully qualified name was used only for static data members. Now it is used for all variable names, to match MSVC.
    Reviewers: rnk
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63012
    llvm-svn: 363335
* Symbolize: Remove dead code. NFCI. | Peter Collingbourne | 2019-06-13 | 2 | -12/+5
    The only caller of SymbolizableObjectFile::create passes a non-null DebugInfoContext and asserts that they do so. Move the assert into SymbolizableObjectFile::create and remove null checks.
    Differential Revision: https://reviews.llvm.org/D63298
    llvm-svn: 363334
* [GlobalISel][IRTranslator] Add debug loc with line 0 to constants emitted into the entry block. | Amara Emerson | 2019-06-13 | 1 | -3/+14
    Constants, including G_GLOBAL_VALUE, are all emitted into the entry block which lets us use the vreg def assuming it dominates all other users. However, it can cause jumpy debug behaviour since the DebugLoc attached to these MIs are from a user instruction that could be in a different block.
    Fixes PR40887.
    Differential Revision: https://reviews.llvm.org/D63286
    llvm-svn: 363331
* [X86Disassembler] Unify the EVEX and VEX code in emitContextTable. Merge the ATTR_VEXL/ATTR_EVEXL bits. NFCI | Craig Topper | 2019-06-13 | 1 | -1/+1
    Merging the two bits shrinks the context table from 16384 bytes to 8192 bytes.
    Remove the ATTRIBUTE_BITS macro and just create an enum directly. Then fix the ATTR_max define to be 8192 to reflect the table size so we stop hardcoding it separately.
    llvm-svn: 363330
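The size change follows from simple arithmetic, sketched here in Python. The sketch assumes the table holds one byte per combination of attribute bits, which is consistent with the sizes quoted but is not confirmed by the commit message; the function name is hypothetical.

```python
def context_table_size(attribute_bits: int) -> int:
    """Entries needed to index every combination of the given number of bits."""
    return 1 << attribute_bits


# 16384 == 2**14 and 8192 == 2**13, so merging two bits into one
# removes a single index bit and halves the table.
assert context_table_size(14) == 16384
assert context_table_size(13) == 8192
assert context_table_size(14) // context_table_size(13) == 2
```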
* [MachinePipeliner] Don't check boundary node in checkValidNodeOrder | Jinsong Ji | 2019-06-13 | 1 | -0/+5
    This was exposed by PowerPC target enablement.
    In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will create a dependence edge to ExitSU to model the live-out latency. This is required for vreg defs with no in-region use, and prefetches with no vreg def.
    When we build NodeOrder in Scheduler, we ignore these boundary nodes. However, when we check Succs in checkValidNodeOrder, we did not skip them, so we still assumed that all the nodes had been sorted and were in order in the Indices array. So when we call lower_bound() for ExitSU, it will return Indices.end(), causing memory issues in the following Node access.
    Differential Revision: https://reviews.llvm.org/D63282
    llvm-svn: 363329
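The failure mode can be modelled with Python's `bisect_left`, which plays the role of `lower_bound()` here. This is a simplified sketch and not the pass's code: node ids stand in for scheduling units, `EXIT_SU` is a hypothetical sentinel id, and `successor_positions` is an invented helper. A search for a node that was never sorted into the array lands one past the end, so the boundary node has to be skipped before the lookup.

```python
from bisect import bisect_left

EXIT_SU = 10**9  # hypothetical id for the boundary node, never stored in indices


def successor_positions(indices: list, successors: list) -> list:
    """Look up each successor in the sorted indices list, skipping the boundary node."""
    positions = []
    for succ in successors:
        if succ == EXIT_SU:
            # The boundary node was ignored when the order was built,
            # so searching for it would land one past the end.
            continue
        positions.append(bisect_left(indices, succ))
    return positions


indices = [1, 3, 5]
# Without the skip, the search result is the end position, which is not a valid element.
assert bisect_left(indices, EXIT_SU) == len(indices)
assert successor_positions(indices, [3, EXIT_SU, 5]) == [1, 2]
```

In Python an out-of-range index raises an exception, whereas dereferencing an end iterator in C++ is undefined behaviour, which matches the memory issues mentioned in the message.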
* [Remarks] Refactor optimization remarks setup | Francis Visoiu Mistrih | 2019-06-13 | 5 | -43/+69
    * Add a common function to set up opt-remarks
    * Rename common options to the same names
    * Add error types to distinguish between file errors and regex errors
    llvm-svn: 363328
* [Coverage] Load code coverage data from archives | Vedant Kumar | 2019-06-13 | 2 | -71/+136
    Support loading code coverage data from regular archives, thin archives, and from MachO universal binaries which contain archives.
    Testing: check-llvm, check-profile (with {A,UB}San enabled)
    rdar://51538999
    Differential Revision: https://reviews.llvm.org/D63232
    llvm-svn: 363325
* [AMDGPU] gfx1010 AMDGPUSetCCOp definition | Stanislav Mekhanoshin | 2019-06-13 | 1 | -1/+1
    It was missing from D63293 and breaks in a debug tablegen w/o this part.
    llvm-svn: 363323
* [ORC] Rename MaterializationResponsibility resolve and emit methods to | Lang Hames | 2019-06-13 | 5 | -16/+17
    notifyResolved/notifyEmitted.
    The 'notify' prefix better describes what these methods do: they update the JIT symbol states and notify any pending queries that the 'resolved' and 'emitted' states have been reached (rather than actually performing the resolution or emission themselves). Since new states are going to be introduced in the near future (to track symbol registration/initialization) it's worth changing the convention pre-emptively to avoid further confusion.
    llvm-svn: 363322
* [LangRef] Clarify poison semantics | Nikita Popov | 2019-06-13 | 1 | -0/+2
    I find the current documentation of poison somewhat confusing, mainly because its use of "undefined behavior" doesn't seem to align with our usual interpretation (of immediate UB). Especially the sentence "any instruction that has a dependence on a poison value has undefined behavior" is very confusing.
    Clarify poison semantics by:
    * Replacing the introductory paragraph with the standard rationale for having poison values.
    * Spelling out that instructions depending on poison return poison.
    * Spelling out how we go from a poison value to immediate undefined behavior and give the two examples we currently use in ValueTracking.
    * Spelling out that side effects depending on poison are UB.
    Differential Revision: https://reviews.llvm.org/D63044
    llvm-svn: 363320
* Add a clarifying comment about branching on poison | Philip Reames | 2019-06-13 | 1 | -0/+4
    I recently got this wrong (again), and I'm sure I'm not the only one. Put a comment in the logical place someone would look to "fix" the obvious "missed optimization" which arises based on the common misunderstanding. Hopefully, this will save others time. :)
    llvm-svn: 363318
* [AMDGPU] gfx1010 base changes for wave32 | Stanislav Mekhanoshin | 2019-06-13 | 12 | -25/+209
    Differential Revision: https://reviews.llvm.org/D63293
    llvm-svn: 363299
* [LFTR] Rename variable to minimize confusion [NFC] | Philip Reames | 2019-06-13 | 1 | -18/+15
    As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called an ExitCount in SCEV code, so be consistent here.
    llvm-svn: 363293
* [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC] | Philip Reames | 2019-06-13 | 1 | -9/+9
    llvm-svn: 363292
* Fix a bug w/inbounds invalidation in LFTR | Philip Reames | 2019-06-13 | 1 | -11/+83
    This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.
    The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.
    As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison; that's about undef. Unfortunately, the two are different. See Nikita's comment for a fuller explanation; he explains it well.
    (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)
    Differential Revision: https://reviews.llvm.org/D62939
    llvm-svn: 363289
* [clang][NewPM] Fix broken -O0 test from missing assumptions | Leonard Chan | 2019-06-13 | 1 | -2/+11
    Add an AssumptionCache callback to the InlineFunctionInfo used for the AlwaysInlinerPass to match codegen of the AlwaysInlinerLegacyPass to generate llvm.assume. This fixes CodeGen/builtin-movdir.c when new PM is enabled by default.
    Differential Revision: https://reviews.llvm.org/D63170
    llvm-svn: 363287
* [Codegen] Merge tail blocks with no successors after block placement | David Bolvansky | 2019-06-13 | 1 | -20/+18
    Summary: I found the following case, in which tail blocks with no successors offer merging opportunities after block placement.
    Before block placement:
        bb0:
          ...
          bne a0, 0, bb2
        bb1:
          mv a0, 1
          ret
        bb2:
          ...
        bb3:
          mv a0, 1
          ret
        bb4:
          mv a0, -1
          ret
    The conditional branch bne in bb0 is opposite to beq.
    After block placement:
        bb0:
          ...
          beq a0, 0, bb1
        bb2:
          ...
        bb4:
          mv a0, -1
          ret
        bb1:
          mv a0, 1
          ret
        bb3:
          mv a0, 1
          ret
    After block placement, a new tail merging opportunity appears: bb1 and bb3 can be merged into one block. So the conditional constraint for merging tail blocks with no successors should be removed. In my experiment for RISC-V, it decreases code size.
    Author of original patch: Jim Lin
    Reviewers: haicheng, aheejin, craig.topper, rnk, RKSimon, Jim, dmgreen
    Reviewed By: Jim, dmgreen
    Subscribers: xbolva00, dschuff, javed.absar, sbc100, jgravelle-google, aheejin, kito-cheng, dmgreen, PkmX, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D54411
    llvm-svn: 363284
* [AMDGPU] ImmArg and SourceOfDivergence for permlane/dpp | Stanislav Mekhanoshin | 2019-06-13 | 1 | -0/+5
    Added missing ImmArg and SourceOfDivergence to the crosslane intrinsics.
    Differential Revision: https://reviews.llvm.org/D63216
    llvm-svn: 363276
* [EarlyCSE] Ensure equal keys have the same hash value | Joseph Tremoulet | 2019-06-13 | 2 | -66/+120
    Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation.
    This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`.
    It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion.
    Reviewers: spatel, nikic
    Reviewed By: spatel, nikic
    Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D62644
    llvm-svn: 363274
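The invariant at stake is that two keys which compare equal must hash equal. The Python sketch below illustrates one way to keep the two in sync, by routing both through a shared canonical form. It is a toy model and not EarlyCSE's implementation: the `Select` record, `canonical`, `hash_value`, and `is_equal` are all hypothetical names, and only the `not`-stripping step from the message is modelled.

```python
from typing import NamedTuple


class Select(NamedTuple):
    """Toy model of `select (cmp pred lhs, rhs), true_val, false_val`."""
    negated: bool    # True if the compare is wrapped in `not`
    pred: str        # comparison predicate, e.g. "sgt"
    lhs: str
    rhs: str
    true_val: str
    false_val: str


def canonical(s: Select) -> Select:
    """Strip an outer `not` from the condition by swapping the select arms."""
    if s.negated:
        return Select(False, s.pred, s.lhs, s.rhs, s.false_val, s.true_val)
    return s


def hash_value(s: Select) -> int:
    # Hash the canonical form so equivalent expressions share a hash.
    return hash(canonical(s))


def is_equal(a: Select, b: Select) -> bool:
    equal = canonical(a) == canonical(b)
    # Mirrors the assertion described above: equality must imply equal hashes.
    assert not equal or hash_value(a) == hash_value(b)
    return equal


# select (not (cmp sgt X, Y)), X, Y   versus   select (cmp sgt X, Y), Y, X
negated_form = Select(True, "sgt", "X", "Y", "X", "Y")
plain_form = Select(False, "sgt", "X", "Y", "Y", "X")
assert is_equal(negated_form, plain_form)
assert hash_value(negated_form) == hash_value(plain_form)
```

Deriving both the hash and the comparison from the same canonicalization step means they cannot drift apart, which is the property the added assertion is meant to guard.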
* [X86] Use fresh MemOps when emitting VAARG64 | Simon Pilgrim | 2019-06-13 | 1 | -8/+15
    Previously it copied over MachineMemOperands verbatim which caused MOV32rm to have store flags set, and MOV32mr to have load flags set. This fixes some assertions being thrown with EXPENSIVE_CHECKS on.
    Committed on behalf of @luke (Luke Lau)
    Differential Revision: https://reviews.llvm.org/D62726
    llvm-svn: 363268