summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Use a bit of relaxed constexpr to make FeatureBitset costant intializableBenjamin Kramer2019-08-243-12/+11
| | | | | | | | | | | This requires std::intializer_list to be a literal type, which it is starting with C++14. The downside is that std::bitset is still not constexpr-friendly so this change contains a re-implementation of most of it. Shrinks clang by ~60k. llvm-svn: 369847
* [Constant] Add 'isElementWiseEqual()' methodRoman Lebedev2019-08-242-16/+17
| | | | | | | | | | | Promoting it from InstCombine's tryToReuseConstantFromSelectInComparison(). Return true if this constant and a constant 'Y' are element-wise equal. This is identical to just comparing the pointers, with the exception that for vectors, if only one of the constants has an `undef` element in some lane, the constants still match. llvm-svn: 369842
* [InstCombine] matchThreeWayIntCompare(): commutativity awarenessRoman Lebedev2019-08-241-13/+42
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: `matchThreeWayIntCompare()` looks for ``` select i1 (a == b), i32 Equal, i32 (select i1 (a < b), i32 Less, i32 Greater) ``` but both of these selects/compares can be in it's commuted form, so out of 8 variants, only the two most basic ones is handled. This fixes regression being introduced in D66232. Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66607 llvm-svn: 369841
* [InstCombine] Try to reuse constant from select in leading comparisonRoman Lebedev2019-08-242-12/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If we have e.g.: ``` %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 ``` the constants `65535` and `65536` are suspiciously close. We could perform a transformation to deduplicate them: ``` Name: ult %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 => %t.inv = icmp ugt i32 %x, 65535 %r = select i1 %t.inv, i32 65535, i32 %y ``` https://rise4fun.com/Alive/avb While this may seem esoteric, this should certainly be good for vectors (less constant pool usage) and for opt-for-size - need to have only one constant. But the real fun part here is that it allows further transformation, in particular it finishes cleaning up the `clamp` folding, see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`. We start with e.g. ``` %dont_need_to_clamp_positive = icmp sle i32 %X, 32767 %dont_need_to_clamp_negative = icmp sge i32 %X, -32768 %clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767 %dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative %R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit ``` without this patch we currently produce ``` %1 = icmp slt i32 %X, 32768 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1, i32 %3, i32 32767 ``` which isn't really a `clamp` - both comparisons are performed on the original value, this patch changes it into ``` %1.inv = icmp sgt i32 %X, 32767 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1.inv, i32 32767, i32 %3 ``` and then the magic happens! Some further transform finishes polishing it and we finally get: ``` %t1 = icmp sgt i32 %X, -32768 %t2 = select i1 %t1, i32 %X, i32 -32768 %t3 = icmp slt i32 %t2, 32767 %R = select i1 %t3, i32 %t2, i32 32767 ``` which is beautiful and just what we want. Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization: https://rise4fun.com/Alive/THl Proofs for the fold itself: https://rise4fun.com/Alive/THl Reviewers: spatel, dmgreen, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66232 llvm-svn: 369840
* [X86] Add an assert to mark more code that needs to be removed when the ↵Craig Topper2019-08-241-1/+4
| | | | | | vector widening legalization switch is removed again. llvm-svn: 369837
* [LoopFusion] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off buildFangrui Song2019-08-241-4/+4
| | | | llvm-svn: 369836
* [GlobalISel] Introduce a G_DYN_STACKALLOC opcode to represent dynamic allocas.Amara Emerson2019-08-241-0/+21
| | | | | | | | | This just adds the opcode and verifier, it will be used to replace existing dynamic alloca handling in a subsequent patch. Differential Revision: https://reviews.llvm.org/D66677 llvm-svn: 369833
* [LLVM][NFC] Removing unused functionsGuillaume Chatelet2019-08-232-7/+1
| | | | | | | | | | | | | | Summary: Removes a not so useful function from DataLayout and cleans up Support/MathExtras.h Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66691 llvm-svn: 369824
* [AMDGPU] Check for immediate SrcC in mfma in AsmParserStanislav Mekhanoshin2019-08-231-0/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D66674 llvm-svn: 369819
* [AMDGPU] w/a for gfx908 mfma SrcC literal HW bugStanislav Mekhanoshin2019-08-231-3/+9
| | | | | | | | gfx908 ignores an mfma if SrcC is a literal. Differential Revision: https://reviews.llvm.org/D66670 llvm-svn: 369818
* [AMDGPU] w/a for gfx908 mfma SrcC literal HW bugStanislav Mekhanoshin2019-08-236-5/+30
| | | | | | | | gfx908 ignores an mfma if SrcC is a literal. Differential Revision: https://reviews.llvm.org/D66670 llvm-svn: 369816
* hwasan: Fix use of uninitialized memory.Peter Collingbourne2019-08-231-11/+12
| | | | | | | Reported by e.g. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/23071/steps/build%20with%20ninja/logs/stdio llvm-svn: 369815
* [LLVM][NFC] remove unused fieldsGuillaume Chatelet2019-08-231-2/+0
| | | | | | | | | | | | | | | | | | | Summary: Here is the commit introducing the fields https://github.com/llvm/llvm-project/commit/cf6749e4c091 It dates back from 2006 and was used by AArch64 backend. There is no more reference to these fields in the whole codebase so I think it's fine. Reviewers: courbet Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66683 llvm-svn: 369810
* [ORC] Remove query dependencies when symbols are resolved.Lang Hames2019-08-231-0/+1
| | | | | | | | | | If the dependencies are not removed then a late failure (one symbol covered by the query failing after others have already been resolved) can result in an attempt to detach the query from already finalized symbol, resulting in an assert/crash. This patch fixes the issue by removing query dependencies in JITDylib::resolve for symbols that meet the required state. llvm-svn: 369809
* [ORC] Fix a FIXME: Propagate errors to dependencies.Lang Hames2019-08-235-100/+339
| | | | | | | | | | | | | When symbols are failed (via MaterializationResponsibility::failMaterialization) any symbols depending on them will now be moved to an error state. Attempting to resolve or emit a symbol in the error state (via the notifyResolved or notifyEmitted methods on MaterializationResponsibility) will result in an error. If notifyResolved or notifyEmitted return an error due to failure of a dependence then the caller should log or discard the error and call failMaterialization to propagate the failure to any queries waiting on the symbols being resolved/emitted (plus their dependencies). llvm-svn: 369808
* [AArch64][GlobalISel] Import XRO load/store patterns instead of custom selectionJessica Paquette2019-08-232-66/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using custom C++ in `earlySelect` for loads and stores, just import the patterns. Remove `earlySelectLoad`, since we can just import the work it's doing. Some minor changes to how `ComplexRendererFns` are returned for the XRO addressing modes. If you add immediates in two steps, sometimes they are not imported properly and you only end up with one immediate. I'm not sure if this is intentional. - Update load-addressing-modes.mir to include the instructions we can now import. - Add a similar test, store-addressing-modes.mir to show which store opcodes we currently import, and show that we can pull in shifts etc. - Update arm64-fastisel-gep-promote-before-add.ll to use FastISel instead of GISel. This test failed with GISel because GISel folds the gep into the load. The test checks that FastISel doesn't fold non-pointer-width adds into loads. GISel on the other hand, produces a G_CONSTANT of -128 for the add, and then a G_GEP, which must be pointer-width. Note that we don't get STRBRoX right now. It seems like the importer can't handle `FPR8Op:{ *:[Untyped] }:$Rt` source operands. So, those are not currently supported. Differential Revision: https://reviews.llvm.org/D66679 llvm-svn: 369806
* [GlobalISel] Legalizer: Retry combining illegal artifacts as long as there ↵Volkan Keles2019-08-231-3/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | new artifacts Summary: Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s possible to combine them after processing the rest of the instruction because the legalization is likely to generate more artifacts that allow ArtifactCombiner to combine away them. Instead, move illegal artifacts to another list called RetryList and wait until all of the instruction in InstList are legalized. After that, check if there is any new artifacts and try to combine them again if that’s the case. If not, abort. The idea is similar to D59339, but the approach is a bit different. This patch fixes the issue described above, but the legalizer still may be unable to handle some cases depending on when to legalize artifacts. So, in the long run, we probably need a different legalization strategy that handles this dependency in a better way. Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette Reviewed By: dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65894 llvm-svn: 369805
* [Attributor] Manifest alignment in load and store instructionsJohannes Doerfert2019-08-231-0/+50
| | | | | | | | | | | | | | | | Summary: We can now manifest alignment information in load/store instructions if the pointer is known to have a better alignment. Reviewers: uenoku, sstefan1, lebedev.ri Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66567 llvm-svn: 369804
* Do a sweep of symbol internalization. NFC.Benjamin Kramer2019-08-2316-33/+39
| | | | llvm-svn: 369803
* [X86] Move a transform out of combineConcatVectorOps so we don't prematurely ↵Craig Topper2019-08-231-9/+13
| | | | | | | | | | | | turn CONCAT_VECTORS into INSERT_SUBVECTORS. CONCAT_VECTORS and INSERT_SUBVECTORS can both call combineConcatVectorOps, but we shouldn't produce INSERT_SUBVECTORS from there. We should keep CONCAT_VECTORS until vector legalization. Noticed while looking at the madd_quad_reduction test from madd.ll llvm-svn: 369802
* [SampleFDO] Add ExtBinary format to support extension of binary profile.Wei Mi2019-08-232-29/+258
| | | | | | | | | | | | This is a patch split from https://reviews.llvm.org/D66374. It tries to add a new format of profile called ExtBinary. The format adds a section header table to the profile and organize the profile in sections, so the future extension like adding a new section or extending an existing section will be easier while keeping backward compatiblity feasible. Differential Revision: https://reviews.llvm.org/D66513 llvm-svn: 369798
* [PowerPC] Expand v1i128 sminRoland Froese2019-08-231-4/+12
| | | | | | | | | The smin opcode and friends for v1i128 are incorrectly marked as legal for PPC. Change them to expand. Differential Revision: https://reviews.llvm.org/D64960 llvm-svn: 369797
* Fix a bug in just submitted rL369789Philip Reames2019-08-231-1/+4
| | | | | | | | Started implementing the vector case and realized the scalar case hadn't handled the GEP producing a different type than the base correctly. It's entertaining seeing what slips through review when we're focused on the 'hard' parts. :( Also adding an extra vector test as it happened to be in workspace and wasn't worth separating. llvm-svn: 369795
* RegScavenger: Use RegisterMatt Arsenault2019-08-231-17/+17
| | | | llvm-svn: 369794
* [X86] Mark VPDPWSSD and VPDPWSSDS as commutable. Add stack folding tests.Craig Topper2019-08-232-10/+33
| | | | llvm-svn: 369792
* Revert r369233.Manoj Gupta2019-08-231-14/+10
| | | | | | | | This breaks building of some projects like libfuse and alsa-lib that now fail when linking. Error details in PR43092. llvm-svn: 369790
* [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, nullPhilip Reames2019-08-231-0/+21
| | | | | | | | | | This generalizes the isGEPKnownNonNull rule from ValueTracking to apply when we do not know if the base is non-null, and thus need to replace one condition with another. The core notion is that since an inbounds GEP can only form null if the base pointer is null and the offset is zero. However, if the offset is non-zero, the the "inbounds" marker makes the result poison. Thus, we're free to ignore the case where the offset is non-zero. Similarly, there's no case under which a non-null base can result in a null result without generating poison. Differential Revision: https://reviews.llvm.org/D66608 llvm-svn: 369789
* [BasicAA] Use dereferenceability to reason about aliasingJohannes Doerfert2019-08-231-4/+26
| | | | | | | | | | | | | | | | | | | | | Summary: We already use the fact that an object with known size X does not alias another objection of size Y > X before. With this commit, we use dereferenceability information to determine a lower bound for Y and not only rely on the user provided query size. The result for @global_and_deref_arg_2() and @local_and_deref_ret_2() in test/Analysis/BasicAA/dereferenceable.ll improved with this patch. Reviewers: asbirlea, chandlerc, hfinkel, sanjoy Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66157 llvm-svn: 369786
* [Attributor] Manifest constant return valuesJohannes Doerfert2019-08-231-1/+25
| | | | | | | | | | | | | | | | Summary: If the unique return value is a constant we now replace call uses with that constant. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66551 llvm-svn: 369785
* [Attributor] Deal with shrinking dereferenceability in a loopJohannes Doerfert2019-08-231-6/+17
| | | | | | | | | | | | | | | | | | Summary: If we have a loop in which the dereferenceability of a pointer decreases we did slowly decrease it iteration by iteration, leading to a timeout. With this patch we detect such circular reasoning and indicate a fixpoint early. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66558 llvm-svn: 369784
* Allow Compiler.h to be included in C files and fix fallthrough warningsNathan Huckleberry2019-08-231-3/+4
| | | | | | | | | | | | | | | | | | | | | Summary: Since clang does not support comment style fallthrough annotations these should be switched to macros defined in Compiler.h. This requires some fixing to Compiler.h. Original patch: https://reviews.llvm.org/D66487 Reviewers: nickdesaulniers, aaron.ballman, xbolva00, rsmith Reviewed By: nickdesaulniers, aaron.ballman, rsmith Subscribers: rsmith, sfertile, ormris, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66609 llvm-svn: 369782
* [SelectionDAG][X86] Enable iX SimplifyDemandedBits to vXi1 ↵Craig Topper2019-08-232-3/+21
| | | | | | | | | | | | | | | | SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression Patch showing the effect of enabling bool vector oversimplification. Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify: insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i Preventing the removal of the AND to clear the upper bits of result Differential Revision: https://reviews.llvm.org/D53022 llvm-svn: 369780
* [DebugInfo] Remove invalidated locations during LiveDebugValuesJeremy Morse2019-08-231-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | LiveDebugValues gives variable locations to blocks, but it should also take away. There are various circumstances where a variable location is known until a loop backedge with a different location is detected. In those circumstances, where there's no agreement on the variable location, it should be undef / removed, otherwise we end up picking a location that's valid on some loop iterations but not others. However, LiveDebugValues doesn't currently do this, see the new testcase attached. Without this patch, the location of !3 is assumed to be %bar through the loop. Once it's added to the In-Locations list, it's never removed, even though the later dbg.value(0... of !3 makes the location un-knowable. This patch checks during block-location-joining to see whether any previously-present locations have been removed in a predecessor. If they have, the live-ins have changed, and the block needs reprocessing. Similarly, in transferTerminator, assign rather than |= the Out-Locations after processing a block, as we may have deleted some previously valid locations. This will mean that LiveDebugValues performs more propagation -- but that's necessary for it being correct. Differential Revision: https://reviews.llvm.org/D66599 llvm-svn: 369778
* [SLP] use range-for loops, fix formatting; NFCSanjay Patel2019-08-231-32/+32
| | | | | | | These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369776
* [Reassoc] Small fix to support unary FNeg in NegateValue(...)Cameron McInally2019-08-231-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D66612 llvm-svn: 369772
* [Attributor][Fix] Deal with "growing" dereferenceabilityJohannes Doerfert2019-08-231-0/+6
| | | | | | | | | | | | | | | | | | Summary: If we have a negative inbounds offset dereferenceabily "grows". However, until we do not handle the overflow that can occur in the dereferenceable bytes and the problem with loops, we simply do not grow the state. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66557 llvm-svn: 369771
* [Attributor][NFCI] Avoid lookups when resolving returned valuesJohannes Doerfert2019-08-231-3/+16
| | | | | | | | | | If the number of potentially returned values not change since the last traversal we do not need to visit the returned values again. This works as we only add values to the returned values set now. Differential Revision: https://reviews.llvm.org/D66484 llvm-svn: 369770
* [SLP] fix formatting; NFCSanjay Patel2019-08-231-4/+3
| | | | | | | These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369769
* [Attributor] FIX: Treat new attributes as changed onesJohannes Doerfert2019-08-231-5/+5
| | | | | | | | | | | | | | | | | | | | | Summary: When we have new attributes and we end the fixpoint iteration because the iteration limit is reached, we need to treat the new ones as if they changed in the last iteration, as they might have. This adds a test for which we should not derive anything regardless of the iteration limit, e.g., if we abort there should not be any attributes manifested in the IR. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66549 llvm-svn: 369768
* [Attributor][NFCI] Try to avoid potential non-deterministic behaviorJohannes Doerfert2019-08-231-12/+10
| | | | | | | This commit replaces sets with set vectors in an effort to make the behavior of the Attributor deterministic. llvm-svn: 369767
* [ThinLTO] Fix handling of weak interposable symbolsTeresa Johnson2019-08-233-38/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Keep aliasees alive if their alias is live, otherwise we end up with an alias to a declaration, which is invalid. This can happen when the aliasee is weak and non-prevailing. This fix exposed the fact that we were then attempting to internalize the weak symbol, which was not exported as it was not prevailing. We should not internalize interposable symbols in general, unless this is the prevailing copy, since it can lead to incorrect inlining and other optimizations. Most of the changes in this patch are due to the restructuring required to pass down the prevailing callback. Finally, while implementing the test cases, I found that in the case of a weak aliasee that is still marked not live because its alias isn't live, after dropping the definition we incorrectly marked the declaration with weak linkage when resolving prevailing symbols in the module. This was due to some special case handling for symbols marked WeakLinkage in the summary located before instead of after a subsequent check for the symbol being a declaration. It turns out that we don't actually need this special case handling any more (looking back at the history, when that was added the code was structured quite differently) - we will correctly mark with weak linkage further below when the definition hasn't been dropped. Fixes PR42542. Reviewers: pcc Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66264 llvm-svn: 369766
* [MustExec] Add a generic "must-be-executed-context" explorerJohannes Doerfert2019-08-232-0/+119
| | | | | | | | | | | | | | | | Given an instruction I, the MustBeExecutedContextExplorer allows to easily traverse instructions that are guaranteed to be executed whenever I is. For now, these instruction have to be statically "after" I, in the same or different basic blocks. This patch also adds a pass which prints the must-be-executed-context for each instruction in a module. It is used to test the MustBeExecutedContextExplorer, for now on the examples given in the class comment of the MustBeExecutedIterator. Differential Revision: https://reviews.llvm.org/D65186 llvm-svn: 369765
* [mips] Reduce number of instructions used for loading a global symbol's valueSimon Atanasyan2019-08-231-5/+13
| | | | | | | | | | | | | | | | | | | | | | Now `lw/sw $reg, sym+offset` pseudo instructions for global symbol `sym` are lowering into the following three instructions. ``` lw $reg, %got(symbol)($gp) addiu $reg, $reg, offset lw/sw $reg, 0($reg) ``` It's possible to reduce the number of instructions by taking the offset in account in the final `lw/sw` command. This patch implements that optimization. ``` lw $reg, %got(symbol)($gp) lw/sw $reg, offset($reg) ``` Differential Revision: https://reviews.llvm.org/D66553 llvm-svn: 369756
* [mips] Do not include offset into `%got` expression for global symbolsSimon Atanasyan2019-08-231-14/+18
| | | | | | | | | | | | | | | | | | | | | | | | Now pseudo instruction `la $6, symbol+8($6)` is expanding into the following chain of commands: ``` lw $1, %got(symbol+8)($gp) addiu $1, $1, 8 addu $6, $1, $6 ``` This is incorrect. When a linker handles the `R_MIPS_GOT16` relocation, it does not expect to get any addend and breaks on assertion. Otherwise it has to create new GOT entry for each unique "sym + offset" pair. Offset for a global symbol should be added to result of loading GOT entry by a separate `add` command. The patch fixes the problem by stripping off an offset from the expression passed to the `%got`. That's interesting that even current code inserts a separate `add` command. Differential Revision: https://reviews.llvm.org/D66552 llvm-svn: 369755
* Use VT::getHalfNumVectorElementsVT helpers in a few places. NFCI.Simon Pilgrim2019-08-232-10/+6
| | | | llvm-svn: 369751
* [X86][BtVer2] Add a read-advance to every implicit register use of ↵Andrea Di Biagio2019-08-231-4/+10
| | | | | | | | | | | | CMPXCHG8B/16B. This is a follow up of r369642. This patch assigns a ReadAfterLd to every implicit register use of instruction CMPXCHG8B and instruction CMPXCHG16B. Perf micro-benchmarks show that implicit registers are read after 3cy from the start of execution. llvm-svn: 369750
* [X86][BtVer2] Fix latency of ALU RMW instructions.Andrea Di Biagio2019-08-231-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Excluding ADC/SBB and the bit-test instructions (BTR/BTS/BTC), the observed latency of all other RMW integer arithmetic/logic instructions is 6cy and not 5cy. Example (ADD): ``` addb $0, (%rsp) # Latency: 6cy addb $7, (%rsp) # Latency: 6cy addb %sil, (%rsp) # Latency: 6cy addw $0, (%rsp) # Latency: 6cy addw $511, (%rsp) # Latency: 6cy addw %si, (%rsp) # Latency: 6cy addl $0, (%rsp) # Latency: 6cy addl $511, (%rsp) # Latency: 6cy addl %esi, (%rsp) # Latency: 6cy addq $0, (%rsp) # Latency: 6cy addq $511, (%rsp) # Latency: 6cy addq %rsi, (%rsp) # Latency: 6cy ``` The same latency profile applies to SUB/AND/OR/XOR/INC/DEC. The observed latency of ADC/SBB is 7-8cy. So we need a different write to model those. Latency of BTS/BTR/BTC is not fixed by this patch (they are much slower than what the model for btver2 currently reports). Differential Revision: https://reviews.llvm.org/D66636 llvm-svn: 369748
* [llvm-dlltool] Make sure to strip decorations from ExtName for renamed exportsMartin Storsjo2019-08-231-0/+1
| | | | | | | | | | | | ExtName should not be decorated, just like Name. This avoids double decoration on symbols in import libraries that use = for renaming functions. (Weak aliases, which use ==, worked fine with respect to decoration.) Differential Revision: https://reviews.llvm.org/D66617 llvm-svn: 369747
* [DAGCombine] GetNegatedExpression - add FMA\FMAD supportSimon Pilgrim2019-08-231-1/+52
| | | | | | | | If the accumulator and either of the multiply operands are negatable then we can we negate the entire expression. Differential Revision: https://reviews.llvm.org/D63141 llvm-svn: 369746
* [AMDGPU] gfx10 atomic optimizer changes.Jay Foad2019-08-233-57/+152
| | | | | | | | | | | | | | | | Summary: Add support for gfx10, where all DPP operations are confined to work within a single row of 16 lanes, and wave32. Reviewers: arsenm, sheredom, critson, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65644 llvm-svn: 369745
OpenPOWER on IntegriCloud