summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Adds support for writing the .bss section for XCOFF object files.Sean Fertile2019-08-201-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target object writer. Also adds a class to represent the top-level sections, which we materialize in the ObjectWriter. executePostLayoutBinding will map all csects into the appropriate container depending on its storage mapping class, and map all symbols into their containing csect. Once all symbols have been processed we - Assign addresses and symbol table indices. - Calaculte section sizes. - Build the section header table. - Assign the sections raw-pointer value for non-virtual sections. Since the .bss section is virtual, writing the header table is enough to add support. Writing of a sections raw data, or of any relocations is not included in this patch. Testing is done by dumping the section header table, but it needs to be extended to include dumping the symbol table once readobj support for dumping auxiallary entries lands. Differential Revision: https://reviews.llvm.org/D65159 llvm-svn: 369454
* [GlobalISel] Handle multiple registers in dbg.value intrinsicAditya Nandakumar2019-08-201-6/+7
| | | | | | | | | | | | | | | | https://reviews.llvm.org/D66077 The value passed into dbg.value may relate to multiple registers, each of which need a DBG_VALUE. This fix calls MIRBuilder.buildDirectDbgValue for each register. Without this, IR passed in from flang-compiler/flang may fail an assertion in getOrCreateVReg. Patch by : peterwaller-arm. llvm-svn: 369403
* [CodeGen] Add a pass to do block predication on SSA machine IR.Thomas Raoux2019-08-202-49/+285
| | | | | | | | | | | | | For targets requiring aggressive scheduling and/or software pipeline we need to apply predication before preRA scheduling. This adds a pass re-using the early if-cvt infrastructure but generating predicated instructions instead of speculatively executing instructions. It allows doing if conversion on blocks containing instructions with side-effects. The pass re-use the target hook from postRA if-conversion to let the target decide on the heuristic to apply. Differential Revision: https://reviews.llvm.org/D66190 llvm-svn: 369395
* [AsmPrinter] Remove const qualifier from EmitBasicBlockStart.Karl-Johan Karlsson2019-08-201-1/+1
| | | | | | | | | | | | | | | | Overriders may want to modify state in it. AMDGPU wants to, but has to make its members mutable in order to do so. Besides, EmitBasicBlockEnd is not const, so why should Start be? Patch by Bevin Hansson. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D66341 llvm-svn: 369325
* Fixed placement of llvm.global_dtors on Windows.Vyacheslav Zakharin2019-08-191-1/+2
| | | | | | Differential revision: https://reviews.llvm.org/D66373 llvm-svn: 369299
* [CGP] Remove ModifiedDT from the makeBitReverse loopCraig Topper2019-08-191-1/+0
| | | | | | | | | | I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283
* [TargetLowering] x s% C == 0 fold: vector divisor with INT_MIN handlingRoman Lebedev2019-08-191-13/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The general fold is only valid for positive divisors. Which effectively means, it is invalid for `INT_MIN` divisors, and we currently bailout if we see them. But that is too strict, we can just fix-up the results. For that, let's do a second computation 'in parallel': ``` Name: srem -> and Pre: isPowerOf2(C) %o = srem i8 %X, C %r = icmp eq %o, 0 => %n = and i8 %X, C-1 %r = icmp eq %n, 0 ``` https://rise4fun.com/Alive/Sup And then just blend results: if the divisor was `INT_MIN`, pick the value we got via bit-test, else pick the value from general fold. There's interesting observation - `ISD::ROTR` is set to `LegalizeAction::Expand` before AVX512, so we should not treat `INT_MIN` divisor as even; and as it can be seen while `@test_srem_odd_even_one` improves on all run-lines, `@test_srem_odd_even_INT_MIN` only improves for AVX512. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66300 llvm-svn: 369268
* [PeepholeOptimizer] Don't assume bitcast def always has inputJinsong Ji2019-08-191-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If we have a MI marked with bitcast bits, but without input operands, PeepholeOptimizer might crash with assert. eg: If we apply the changes in PPCInstrVSX.td as in this patch: [(set v4i32:$XT, (bitconvert (v16i8 immAllOnesV)))]>; We will get assert in PeepholeOptimizer. ``` llvm-lit llvm-project/llvm/test/CodeGen/PowerPC/build-vector-tests.ll -v llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:417: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. ``` The fix is to abort if we found out of bound access. Reviewers: qcolombet, MatzeB, hfinkel, arsenm Reviewed By: qcolombet Subscribers: wdng, arsenm, steven.zhang, wuzish, nemanjai, hiraditya, kbarton, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65542 llvm-svn: 369261
* [DebugInfo] Allow bundled calls in the MIR's call site infoDavid Stenberg2019-08-192-4/+5
| | | | | | | | | | | | | | | | | | Summary: Extend the MIR parser and writer so that the call site information can refer to calls that are bundled. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: aprantl Subscribers: arsenm, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D66145 llvm-svn: 369256
* [DebugInfo] Make postra sinking of DBG_VALUEs subregister-safeJeremy Morse2019-08-191-9/+28
| | | | | | | | | | | | | | | | | | | | | | | | Currently the machine instruction sinker identifies DBG_VALUE insts that also need to sink by comparing register numbers. Unfortunately this isn't safe, because (after register allocation) a DBG_VALUE may read a register that aliases what's being sunk. To fix this, identify the DBG_VALUEs that need to sink by recording & examining their register units. Register units gives us the following guarantee: "Two registers overlap if and only if they have a common register unit" [MCRegisterInfo.h] Thus we can always identify aliasing DBG_VALUEs if the set of register units read by the DBG_VALUE, and the register units of the instruction being sunk, intersect. (MachineSink already uses classes like "LiveRegUnits" for determining sinking validity anyway). The test added checks for super and subregister DBG_VALUE reads of a sunk copy being sunk as well. Differential Revision: https://reviews.llvm.org/D58191 llvm-svn: 369247
* [TargetLowering] Teach computeRegisterProperties to only widen v3i16/v3f16 ↵Craig Topper2019-08-181-11/+23
| | | | | | | | | | | | | | | | | | | | | vectors to the next power of 2 type if that's legal. These were recently made simple types. This restores their behavior back to something like their EVT legalization. We might be able to fix the code in type legalization where the assert was failing, but I didn't investigate too much as I had already looked at the computeRegisterProperties code during the review for v3i16/v3f16. Most of the test changes restore the X86 codegen back to what it looked like before the recent change. The test case in vec_setcc.ll and is a reduced version of the reproducer from the fuzzer. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=16490 llvm-svn: 369205
* [SelectionDAG] Add a node creation debug message to getMachineNode.Craig Topper2019-08-181-0/+1
| | | | llvm-svn: 369204
* [CodeGen] Do the Simple Early Return in block-placement pass to optimize the ↵Kang Zhang2019-08-171-0/+40
| | | | | | | | | | | | | | | | | | blocks Summary: Fix a bug of preducessors. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 369191
* [CodeGenPrepare] Fix use-after-freeSanjay Patel2019-08-161-1/+2
| | | | | | | | | | | | | | | | | | | If OptimizeExtractBits() encountered a shift instruction with no operands at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168
* Escape % in printf format string.Evgeniy Stepanov2019-08-161-1/+1
| | | | | | Fixes branch-relax-block-size.mir on the ASan builder. llvm-svn: 369138
* [AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask.Amara Emerson2019-08-161-1/+17
| | | | | | | | Again, it's weird that these are allowed. Since lowering support was added in r368709 we started crashing on compiling the neon intrinsics test in the test suite. This fixes the lowering to fold the 1 elt src/mask case into copies. llvm-svn: 369135
* [CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimizationGuozhi Wei2019-08-161-2/+4
| | | | | | | | In function Analysis.cpp:isInTailCallPosition, instructions between call and ret are checked to see if they block tail call optimization. If an instruction is an intrinsic call, only llvm.lifetime_end is allowed and other intrinsic functions block tail call. When compiling tcmalloc, we found llvm.assume between a hot function call and ret, it blocks the optimization. But llvm.assume doesn't generate instructions, it should not block tail call. Differential Revision: https://reviews.llvm.org/D66096 llvm-svn: 369125
* Revert [CodeGen] Do the Simple Early Return in block-placement pass to ↵Florian Hahn2019-08-161-37/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimize the blocks This reverts r368997 (git commit 2a903c0b679bae1919f9fc01f78e4bc6cff2add0) It looks like this commit adds invalid predecessors to MBBs. The example below fails the verifier after MachineBlockPlacement (run llc -verify-machineinstrs): @global.4 = external constant i8* declare i32 @zot(...) define i16* @snork.67() personality i8* bitcast (i32 (...)* @zot to i8*) { bb: invoke void undef() to label %bb5 unwind label %bb4 bb4: ; preds = %bb %tmp = landingpad { i8*, i32 } catch i8* null unreachable bb5: ; preds = %bb %tmp6 = load i32, i32* null, align 4 %tmp7 = icmp eq i32 %tmp6, 0 br i1 %tmp7, label %bb14, label %bb8 bb8: ; preds = %bb11, %bb5 invoke void undef() to label %bb9 unwind label %bb11 bb9: ; preds = %bb8 %tmp10 = invoke i16* undef() to label %bb14 unwind label %bb11 bb11: ; preds = %bb9, %bb8 %tmp12 = landingpad { i8*, i32 } cleanup catch i8* bitcast (i8** @global.4 to i8*) %tmp13 = icmp ult i64 undef, undef br i1 %tmp13, label %bb8, label %bb14 bb14: ; preds = %bb11, %bb9, %bb5 %tmp15 = phi i16* [ null, %bb5 ], [ null, %bb11 ], [ %tmp10, %bb9 ] ret i16* %tmp15 } llvm-svn: 369104
* [DAGCombiner] Add simple folds for SMULFIX/UMULFIX/SMULFIXSATBjorn Pettersson2019-08-161-0/+27
| | | | | | | | | | | | | | | | | | | | | | Summary: Add the following DAGCombiner folds for mulfix being one of SMULFIX/UMULFIX/SMULFIXSAT: (mulfix x, undef, scale) -> 0 (mulfix x, 0, scale) -> 0 Also added canonicalization of constants to RHS. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66052 llvm-svn: 369103
* [DebugInfo] Handle complex expressions with spills in LiveDebugValuesJeremy Morse2019-08-161-31/+27
| | | | | | | | | | | | | | | | | | | | | In r369026 we disabled spill-recognition in LiveDebugValues for anything that has a complex expression. This is because it's hard to recover the complex expression once the spill location is baked into it. This patch re-enables spill-recognition and slightly adjusts the DBG_VALUE insts that LiveDebugValues tracks: instead of tracking the last DBG_VALUE for a variable, it tracks the last _unspilt_ DBG_VALUE. The spill-restore code is then able to access and copy the original complex expression; but the rest of LiveDebugValues has to be aware of the slight semantic shift, and produce a new spilt location if a spilt location is propagated between blocks. The test added produces an incorrect variable location (see FIXME), which will be the subject of future work. Differential Revision: https://reviews.llvm.org/D65368 llvm-svn: 369092
* [GlobalISel] CSEMIRBuilder: Add support for G_GEPVolkan Keles2019-08-153-19/+9
| | | | | | | | | | | | | | | | | | Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070
* Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVMDaniel Sanders2019-08-1565-315/+316
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
* MVT: Add v3i16/v3f16 vectorsMatt Arsenault2019-08-152-1/+6
| | | | | | | | | | | | AMDGPU has some buffer intrinsics which theoretically could use this. Some of the generated tables include the 3 and 4 element vector versions of these rounded to 64-bits, which is ambiguous. Add these to help the table disambiguate these. Assertion change is for the path odd sized vectors now take for R600. v3i16 is widened to v4i16, which then needs to be promoted to v4i32. llvm-svn: 369038
* [NFC] Add a couple of dump routines for RegisterPressure helper classesPhilip Reames2019-08-151-0/+16
| | | | llvm-svn: 369037
* [DebugInfo] Avoid crash from dropped fragments in LiveDebugValuesJeremy Morse2019-08-151-2/+15
| | | | | | | | | | | | | | | | | | | This patch avoids a crash caused by DW_OP_LLVM_fragments being dropped from DIExpressions by LiveDebugValues spill-restore code. The appearance of a previously unseen fragment configuration confuses LDV, as documented in PR42773, and reproduced by the test function this patch adds (Crashes on a x86_64 debug build). To avoid this, on spill restore, we now use fragment information from the spilt-location-expression. In addition, when spilling, we now don't spill any DBG_VALUE with a complex expression, as it can't be safely restored and will definitely lead to an incorrect variable location. The discussion of this is in D65368. Differential Revision: https://reviews.llvm.org/D66284 llvm-svn: 369026
* [llvm] Migrate llvm::make_unique to std::make_uniqueJonas Devlieghere2019-08-1532-87/+87
| | | | | | | | Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
* Remove SmallBitVector.h include. NFCI.Simon Pilgrim2019-08-151-1/+0
| | | | | | SmallBitVector/BitVector types aren't used at all in the cpp file. llvm-svn: 369008
* Remove BitVector.h include. NFCI.Simon Pilgrim2019-08-151-1/+0
| | | | | | BitVector type isn't used at all in the cpp file. llvm-svn: 369007
* [DAGCombine] MergeConsecutiveStores - fix cppcheck/MSVC extension warning. NFCI.Simon Pilgrim2019-08-151-1/+1
| | | | | | | | Set the StartIdx type to size_t so that it matches the StoreNodes SmallVector size() and index types. Silences the MSVC analyzer warning that unsigned increment might overflow before exceeding size_t on 64-bit targets - this isn't likely to happen but it means we use consistent types and reduces the warning "noise" a little. llvm-svn: 368998
* [CodeGen] Do the Simple Early Return in block-placement pass to optimize the ↵Kang Zhang2019-08-151-0/+37
| | | | | | | | | | | | | | | | | | blocks Summary: This patch has trigger a bug of r368339, and the r368339 has been reverted, So upstream this patch again. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368997
* [SDAG][x86] check for relaxed math when matching an FP reductionSanjay Patel2019-08-151-2/+15
| | | | | | | | | | | | | | | | If the last step in an FP add reduction allows reassociation and doesn't care about -0.0, then we are free to recognize that computation as a reduction that may reorder the intermediate steps. This is requested directly by PR42705: https://bugs.llvm.org/show_bug.cgi?id=42705 and solves PR42947 (if horizontal math instructions are actually faster than the alternative): https://bugs.llvm.org/show_bug.cgi?id=42947 Differential Revision: https://reviews.llvm.org/D66236 llvm-svn: 368995
* Add ptrmask intrinsicFlorian Hahn2019-08-151-0/+11
| | | | | | | | | | | | | | | | | | | This patch adds a ptrmask intrinsic which allows masking out bits of a pointer that must be zero when accessing it, because of ABI alignment requirements or a restriction of the meaningful bits of a pointer through the data layout. This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged pointers) and allows us to not lose information about the underlying object. Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune Reviewed by: sanjoy, jdoerfert Differential Revision: https://reviews.llvm.org/D59065 llvm-svn: 368986
* [SelectionDAGBuilder] Teach gather/scatter getUniformBase to look through ↵Craig Topper2019-08-141-2/+7
| | | | | | vector zeroinitializer indices in addition to scalar zeroes. llvm-svn: 368926
* [SDAG] move variable closer to use; NFCSanjay Patel2019-08-141-1/+1
| | | | llvm-svn: 368905
* [DebugInfo] Consider debug label scope has an extra lexical block fileTaewook Oh2019-08-142-3/+7
| | | | | | | | | | | | | | Summary: There are places where a case that debug label scope has an extra lexical block file is not considered properly. The modified test won't pass without this patch. Reviewers: aprantl, HsiangKai Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66187 llvm-svn: 368891
* [DebugInfo] MCP: collect and update DBG_VALUEs encountered in local blockJeremy Morse2019-08-141-12/+26
| | | | | | | | | | | | | | | | | | | | MCP currently uses changeDebugValuesDefReg / collectDebugValues to find debug users of a register, however those functions assume that all DBG_VALUEs immediately follow the specified instruction, which isn't necessarily true. This is going to become very often untrue when we turn off CodeGenPrepare::placeDbgValues. Instead of calling changeDebugValuesDefReg on an instruction to change its debug users, in this patch we instead collect DBG_VALUEs of copies as we iterate over insns, and update the debug users of copies that are made dead. This isn't a non-functional change, because MCP will now update DBG_VALUEs that aren't immediately after a copy, but refer to the same register. I've hijacked the regression test for PR38773 to test for this new behaviour, an entirely new test seemed overkill. Differential Revision: https://reviews.llvm.org/D56265 llvm-svn: 368835
* [AsmPrinter] Delete redundant .type foo, @function when emitting an ifuncFangrui Song2019-08-141-5/+4
| | | | | | | | | | | In MCAsmStreamer: .type foo,@function # <--- this is redundant .type foo,@gnu_indirect_function In MCELFStreamer, the latter STT_GNU_IFUNC overrides STT_FUNC. llvm-svn: 368823
* [GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong ↵Aditya Nandakumar2019-08-141-1/+1
| | | | | | | | source index https://reviews.llvm.org/D66182 llvm-svn: 368781
* [GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sourcesAditya Nandakumar2019-08-131-5/+10
| | | | | | https://reviews.llvm.org/D66171 llvm-svn: 368753
* GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUESMatt Arsenault2019-08-131-0/+62
| | | | | | Odd sized vectors aren't handled yet. llvm-svn: 368713
* GlobalISel: Implement lower for G_SHUFFLE_VECTORMatt Arsenault2019-08-131-0/+40
| | | | llvm-svn: 368709
* GlobalISel: Add more verifier checks for G_SHUFFLE_VECTORMatt Arsenault2019-08-131-1/+35
| | | | llvm-svn: 368705
* GlobalISel: Change representation of shuffle masksMatt Arsenault2019-08-137-2/+86
| | | | | | | | | | | | | | | | | | Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704
* [CodeGen][SelectionDAG] More efficient code for X % C == 0 (SREM case)Roman Lebedev2019-08-131-5/+221
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. One huge caveat: this signed case is only valid for positive divisors. While we can freely negate negative divisors, we can't negate `INT_MIN`, so for now if `INT_MIN` is encountered, we bailout. As a follow-up, it should be possible to handle that more gracefully via extra `and`+`setcc`+`select`. This passes llvm's test-suite, and from cursory(!) cross-examination the folds (the assembly) match those of GCC, and manual checking via alive did not reveal any issues (other than the `INT_MIN` case) Reviewers: RKSimon, spatel, hermord, craig.topper, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, thakis, javed.absar, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65366 llvm-svn: 368702
* [TargetLowering][NFC] prepareUREMEqFold(): fixup commentRoman Lebedev2019-08-131-1/+1
| | | | | | | | The comment initially matched the code, but the code was incorrect and was fixed after the initial revert back back when it was introduced, but the comment was never updated. llvm-svn: 368701
* Revert r368276 "[TargetLowering] SimplifyDemandedBits - call ↵Hans Wennborg2019-08-131-11/+0
| | | | | | | | | | | | | | | | | | | | | | SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" This introduced a false positive MemorySanitizer warning about use of uninitialized memory in a vectorized crc function in Chromium. That suggests maybe something is not right with this transformation. See https://crbug.com/992853#c7 for a reproducer. This also reverts the follow-up commits r368307 and r368308 which depended on this. > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. > > In particular this helps remove some unnecessary scalar->vector->scalar patterns. > > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. > > Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368660
* [GlobalISel] Make the InstructionSelector instance non-const, allowing state ↵Amara Emerson2019-08-131-2/+3
| | | | | | | | | | | | | | | | to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652
* [GlobalISel]: Add KnownBits for G_XORAditya Nandakumar2019-08-131-0/+13
| | | | | | https://reviews.llvm.org/D66119 llvm-svn: 368648
* Eliminate implicit Register->unsigned conversions in VirtRegMap. NFCDaniel Sanders2019-08-133-35/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This was mostly an experiment to assess the feasibility of completely eliminating a problematic implicit conversion case in D61321 in advance of landing that* but it also happens to align with the goal of propagating the use of Register/MCRegister instead of unsigned so I believe it makes sense to commit it. The overall process for eliminating the implicit conversions from Register/MCRegister -> unsigned was to: 1. Add an explicit conversion to support genuinely required conversions to unsigned. For example, using them as an index for IndexedMap. Sadly it's not possible to have an explicit and implicit conversion to the same type and only deprecate the implicit one so I called the explicit conversion get(). 2. Temporarily annotate the implicit conversion to unsigned with LLVM_ATTRIBUTE_DEPRECATED to make them visible 3. Eliminate implicit conversions by propagating Register/MCRegister/ explicit-conversions appropriately 4. Remove the deprecation added in 2. * My conclusion is that it isn't feasible as there's too much code to update in one go. Depends on D65678 Reviewers: arsenm Subscribers: MatzeB, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65685 llvm-svn: 368643
* [GISel]: Fix a bug in KnownBits where we should have been using SizeInBitsAditya Nandakumar2019-08-121-1/+1
| | | | | | | | | https://reviews.llvm.org/D66039 We were using getIndexSize instead of getIndexSizeInBits(). Added test case for G_PTRTOINT and G_INTTOPTR. llvm-svn: 368618
OpenPOWER on IntegriCloud