summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert r335562 and 335563 "[X86] Redefine avx512 packed fpclass intrinsics ↵Craig Topper2018-06-262-49/+19
| | | | | | | | to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction." These were supposed to have been squashed to a single commit. llvm-svn: 335566
* [ORC] Add a symbolAliases function to the Core APIs.Lang Hames2018-06-261-9/+80
| | | | | | symbolAliases can be used to define symbol aliases within a VSO. llvm-svn: 335565
* fooCraig Topper2018-06-262-19/+49
| | | | llvm-svn: 335562
* [ThinLTO] Compute GUID directly from GV when building per-module indexTeresa Johnson2018-06-262-8/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: I discovered when writing the summary parsing support that the per-module index builder and writer are computing the GUID from the value name alone (ignoring the linkage type). This was ok since those GUID were not emitted in the bitcode, and there are never multiple conflicting names in a single module. However, I don't see a reason for making the GUID computation different for the per-module case. It also makes things simpler on the parsing side to have the GUID computation consistent. So this patch changes the summary analysis phase and the per-module summary writer to compute the GUID using the facility on the GlobalValue. Reviewers: pcc, dexonsmith Subscribers: llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D47844 llvm-svn: 335560
* Add a warning if someone attempts to add extra section flags to sectionsEric Christopher2018-06-251-16/+33
| | | | | | with well defined semantics like .rodata. llvm-svn: 335558
* [APInt] Add helpers for rounding u/sdivs.Tim Shen2018-06-251-0/+46
| | | | | | | | | | Reviewers: sanjoy, craig.topper Subscribers: jlebar, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D48498 llvm-svn: 335557
* [PM/LoopUnswitch] Teach the new unswitch to handle nontrivialChandler Carruth2018-06-251-156/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unswitching of switches. This works much like trivial unswitching of switches in that it reliably moves the switch out of the loop. Here we potentially clone the entire loop into each successor of the switch and re-point the cases at these clones. Due to the complexity of actually doing nontrivial unswitching, this patch doesn't create a dedicated routine for handling switches -- it would duplicate far too much code. Instead, it generalizes the existing routine to handle both branches and switches as it largely reduces to looping in a few places instead of doing something once. This actually improves the results in some cases with branches due to being much more careful about how dead regions of code are managed. With branches, because exactly one clone is created and there are exactly two edges considered, somewhat sloppy handling of the dead regions of code was sufficient in most cases. But with switches, there are much more complicated patterns of dead code and so I've had to move to a more robust model generally. We still do as much pruning of the dead code early as possible because that allows us to avoid even cloning the code. This also surfaced another problem with nontrivial unswitching before which is that we weren't as precise in reconstructing loops as we could have been. This seems to have been mostly harmless, but resulted in pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we have to get this *right*, and everything benefits from it. While the testing may seem a bit light here because we only have two real cases with actual switches, they do a surprisingly good job of exercising numerous edge cases. Also, because we share the logic with branches, most of the changes in this patch are reasonably well covered by existing tests. The new unswitch now has all of the same fundamental power as the old one with the exception of the single unsound case of *partial* switch unswitching -- that really is just loop specialization and not unswitching at all. It doesn't fit into the canonicalization model in any way. We can add a loop specialization pass that runs late based on profile data if important test cases ever come up here. Differential Revision: https://reviews.llvm.org/D47683 llvm-svn: 335553
* [InstCombine] cleanup udiv folds; NFCISanjay Patel2018-06-251-30/+20
| | | | | | | | | This removes a "UDivFoldAction" in favor of a simple constant matcher. In theory, the existing code could do more matching, but I don't see any evidence or need for it. I've left a TODO about using ValueTracking in case we see any regressions. llvm-svn: 335545
* [Instrumentation] Remove unused includeBenjamin Kramer2018-06-251-1/+0
| | | | | | It's also a layering violation. llvm-svn: 335528
* [InstCombine] fold sdiv with sext bool divisorSanjay Patel2018-06-251-4/+7
| | | | llvm-svn: 335527
* Revert r335513: [SCEVExp] Advance found insertion pointFlorian Hahn2018-06-251-4/+3
| | | | llvm-svn: 335522
* [LoopIdiomRecognize] Fix a couple places where it appears we were ↵Craig Topper2018-06-251-6/+5
| | | | | | unintenionally making copies of DebugLoc. llvm-svn: 335521
* [X86] Simplify intrinsic table binary search to not require a temporary struct.Craig Topper2018-06-251-10/+9
| | | | | | std::lower_bound doesn't require the thing to search for to be the same type as the table entries. We just need to define an appropriate comparison function that can take an table entry and an intrinsic number. llvm-svn: 335518
* [X86] Add comment about the sorting of the memory folding tables added in ↵Craig Topper2018-06-251-0/+15
| | | | | | r335501. llvm-svn: 335517
* [PowerPC] Fix incorrectly encoded wait instructionLei Huang2018-06-251-1/+1
| | | | | | | | Encoding for the wait instruction was wrong. Fix according to ISA 3.0. Differential Revision: https://reviews.llvm.org/D48550 llvm-svn: 335514
* [SCEVExp] Advance found insertion point until we find a non-dbg instruction.Florian Hahn2018-06-251-3/+4
| | | | | | | | | | | | | This avoids creating unnecessary casts if the IP used to be a dbg info intrinsic. Fixes PR37727. Reviewers: vsk, aprantl, sanjoy, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D47874 llvm-svn: 335513
* [InstSimplify] fold div/rem of zexted boolSanjay Patel2018-06-251-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I was looking at an unrelated fold and noticed that we don't have this simplification (because the other fold would break existing tests). Name: zext udiv %z = zext i1 %x to i32 %r = udiv i32 %y, %z => %r = %y Name: zext urem %z = zext i1 %x to i32 %r = urem i32 %y, %z => %r = 0 Name: zext sdiv %z = zext i1 %x to i32 %r = sdiv i32 %y, %z => %r = %y Name: zext srem %z = zext i1 %x to i32 %r = srem i32 %y, %z => %r = 0 https://rise4fun.com/Alive/LZ9 llvm-svn: 335512
* Handle NetBSD specific path in findDebugBinary()Kamil Rytarowski2018-06-251-0/+5
| | | | | | | | | | | | | | | | | | | | Summary: The NetBSD Operating System installs debuginfo files into /usr/libdata/debug, rather than other path like in some other popular distribution. This change makes llvm-symbolizer functional with the basesystem executables. Reviewers: joerg, vitalybuka Reviewed By: vitalybuka Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D48525 llvm-svn: 335511
* Re-land r335297 "[X86] Implement more of x86-64 large and medium PIC code ↵Reid Kleckner2018-06-256-30/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | models" The large code model allows code and data segments to exceed 2GB, which means that some symbol references may require a displacement that cannot be encoded as a displacement from RIP. The large PIC model even relaxes the assumption that the GOT itself is within 2GB of all code. Therefore, we need a special code sequence to materialize it: .LtmpN: leaq .LtmpN(%rip), %rbx movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch addq %rax, %rbx # GOT base reg From that, non-local references go through the GOT base register instead of being PC-relative loads. Local references typically use GOTOFF symbols, like this: movq extern_gv@GOT(%rbx), %rax movq local_gv@GOTOFF(%rbx), %rax All calls end up being indirect: movabsq $local_fn@GOTOFF, %rax addq %rbx, %rax callq *%rax The medium code model retains the assumption that the code segment is less than 2GB, so calls are once again direct, and the RIP-relative loads can be used to access the GOT. Materializing the GOT is easy: leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg DSO local data accesses will use it: movq local_gv@GOTOFF(%rbx), %rax Non-local data accesses will use RIP-relative addressing, which means we may not always need to materialize the GOT base: movq extern_gv@GOTPCREL(%rip), %rax Direct calls are basically the same as they are in the small code model: They use direct, PC-relative addressing, and the PLT is used for calls to non-local functions. This patch adds reasonably comprehensive testing of LEA, but there are lots of interesting folding opportunities that are unimplemented. I restricted the MCJIT/eh-lg-pic.ll test to Linux, since the large PIC code model is not implemented for MachO yet. Differential Revision: https://reviews.llvm.org/D47211 llvm-svn: 335508
* [X86] Sort the static memory folding tables by reg opcode. Remove the ↵Craig Topper2018-06-252-5419/+5328
| | | | | | | | | | | | reg->mem DenseMaps in favor of binary search. With the static tables sorted we can binary search them directly for reg->mem lookups. This removes 6 DenseMaps that had to be created when X86InstrInfo is constructed. We still have one Mem->Reg DenseMap for the reverse direction. This is created just as before by walking the reg->mem arrays to populate it. Differential Revision: https://reviews.llvm.org/D48527 llvm-svn: 335501
* [X86] Allow base and index for gather instructions to appear in other order ↵Craig Topper2018-06-251-0/+11
| | | | | | for Intel syntax. llvm-svn: 335500
* [SelectionDAG] Remove debug locations from ConstantSD(FP)NodesVedant Kumar2018-06-251-2/+2
| | | | | | | | | | | | | | | | | | This removes debug locations from ConstantSDNode and ConstantSDFPNode. When this kind of node is materialized we no longer create a line table entry which jumps back to the constant's first point of use. This makes single-stepping behavior smoother, and it matches the model used by IR, where Constants have no locations. See this thread for more context: http://lists.llvm.org/pipermail/llvm-dev/2018-June/124164.html I'd like to handle constant BuildVectorSDNodes and to try to eliminate passing SDLocs to SelectionDAG::getConstant*() in follow-up commits. Differential Revision: https://reviews.llvm.org/D48468 llvm-svn: 335497
* Add Triple::isMIPS()/isMIPS32()/isMIPS64(). NFCAlexander Richardson2018-06-2510-26/+13
| | | | | | | | | | | | | | There are quite a few if statements that enumerate all these cases. It gets even worse in our fork of LLVM where we also have a Triple::cheri (which is mips64 + CHERI instructions) and we had to update all if statements that check for Triple::mips64 to also handle Triple::cheri. This patch helps to reduce our diff to upstream and should also make some checks more readable. Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D48548 llvm-svn: 335493
* AMDGPU/GlobalISel: Add support for llvm.amdgcn.kernarg.segment.ptrMatt Arsenault2018-06-253-1/+29
| | | | | | | | | Note a normal select test is not currently possible because this relies on input registers tracked in SIMachineFunctionInfo which are not currently serializable in MIR, but this does work end-to-end from the IR. llvm-svn: 335490
* StackSlotColoring: Decide colors per stack IDMatt Arsenault2018-06-251-22/+50
| | | | | | | | | | | | | | | | I thought I fixed this in r308673, but that fix was very broken. The assumption that any frame index can be used in place of another was more widespread than I realized. Even when stack slot sharing was disabled, this was still replacing frame index uses with a different ID with a different stack slot. Really fix this by doing the coloring per-stack ID, so all of the coloring logically done in a separate namespace. This is a lot simpler than trying to figure out how to change the color if the stack ID is different. llvm-svn: 335488
* AMDGPU: Remove commented out codeMatt Arsenault2018-06-251-2/+0
| | | | llvm-svn: 335486
* AMDGPU/GlobalISel: Fix G_IMPLICIT_DEF for pointersMatt Arsenault2018-06-251-1/+5
| | | | llvm-svn: 335485
* [SampleFDO] Add an option to turn on/off warning about samples unused.Wei Mi2018-06-251-0/+8
| | | | | | | | | | | | | | | | | | If a function has sample to use, but cannot use them because of no debug information, currently a warning will be issued to inform the missing opportunity. This warning assumes the binary generating the profile and the binary using the profile are similar enough. It is not always the case. Sometimes even if the binaries are not quite similar, we may still get some benefit by using sampleFDO. In those cases, we may still want to apply sampleFDO but not want to see a lot of such warnings pop up. The patch adds an option for the warning. Differential Revision: https://reviews.llvm.org/D48510 llvm-svn: 335484
* [DA] Delinearise AddRecs if we can prove they don't wrapDavid Green2018-06-251-2/+21
| | | | | | | | | | | We can prove that some delinearized subscripts do not wrap around to become negative by the fact that they are from inbound geps of load/store locations. This helps improve the delinearisation in cases where we can't prove that they are non-negative from SCEV alone. Differential Revision: https://reviews.llvm.org/D48481 llvm-svn: 335481
* AMDGPU: Respect align argument parameterMatt Arsenault2018-06-252-10/+19
| | | | | | | | | | This should avoid relying on the pointee type to get the alignment, particularly since pointee types are supposed to be removed at some point. Also fixes not getting the alignment for unsized types. llvm-svn: 335478
* SafepointIRVerifier should ignore dead blocks and dead edgesArtur Pilipenko2018-06-251-28/+189
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not only should SafepointIRVerifier ignore unreachable blocks (as suggested in https://reviews.llvm.org/D47011) but it also has to ignore dead blocks. In @test2 (see the new tests): br i1 true, label %right, label %left left: ... right: ... merge: %val = phi i8 addrspace(1)* [ ..., %left ], [ ..., %right ] use %val both left and right branches are reachable. If they collide then SafepointIRVerifier reports an error. Because of the foldable branch condition GVN finds the left branch dead and removes the phi node entry that merges values from right and left. Then the use comes from the right branch. This results in no collision. So, SafepointIRVerifier ends up in different results depending on either GVN is run or not. To solve this issue this patch adds Dead Block detection to SafepointIRVerifier which can ignore dead blocks while validating IR. The Dead Block detection algorithm is taken from GVN but modified to not split critical edges. That is needed to keep CFG unchanged by SafepointIRVerifier. Patch by Yevgeny Rouban. Reviewed By: anna, apilipenko, DaniilSuchkov Differential Revision: https://reviews.llvm.org/D47441 llvm-svn: 335473
* Improve handling of COPY instructions with identical value numbersKrzysztof Parzyszek2018-06-251-28/+126
| | | | | | | | Testcases provided by Tim Renouf. Differential Revision: https://reviews.llvm.org/D48102 llvm-svn: 335472
* Revert change 335077 "[InlineSpiller] Fix a crash due to lack of forward ↵Artur Pilipenko2018-06-251-26/+0
| | | | | | | | | | progress from remat specifically for STATEPOINT" This change caused widespread assertion failures in our downstream testing: lib/CodeGen/LiveInterval.cpp:409: bool llvm::LiveRange::overlapsFrom(const llvm::LiveRange&, llvm::LiveRange::const_iterator) const: Assertion `!empty() && "empty range"' failed. llvm-svn: 335462
* Use APInt[] bit access to avoid "32-bit shift implicitly converted to 64 ↵Simon Pilgrim2018-06-251-1/+1
| | | | | | bits" MSVC warning (again). NFCI. llvm-svn: 335457
* Use APInt[] bit access to avoid "32-bit shift implicitly converted to 64 ↵Simon Pilgrim2018-06-251-1/+1
| | | | | | bits" MSVC warning. NFCI. llvm-svn: 335454
* Fix -Wparentheses gcc warning. NFCI.Simon Pilgrim2018-06-251-1/+1
| | | | llvm-svn: 335451
* [X86] Block commuting operand 1 of FMA*_Int instructions in ↵Craig Topper2018-06-252-88/+47
| | | | | | | | | | | | findThreeSrcCommutedOpIndices. Remove uncommutable returns from getThreeSrcCommuteCase/getFMA3OpcodeToCommuteOperands. We should be blocking the operand while we are in the routine that tries to find commutable operand indices. Doing it later means we might have missed out on another valid set of operands we could have commuted. The intrinsic case was the only case that could really prevent commuting in getFMA3OpcodeToCommuteOperands. All the other cases in getThreeSrcCommuteCase were not reachable conditions as they were protected by findThreeSrcCommutedOpIndices. With that abort case pushed earlier, we can remove all the abort checks and replace with asserts. llvm-svn: 335446
* [MSSA] Add domination number verifier; NFCGeorge Burgess IV2018-06-251-0/+39
| | | | | | | | | | | It's easy for domination numbers to get out-of-date, and this is no more costly than any of the other verifiers we already have, so it seems nice to have. A stage3 build with this Works On My Machine, so this hasn't caught any bugs... yet. :) llvm-svn: 335444
* [WebAssembly] Add WebAssemblyException information analysisHeejin Ahn2018-06-255-0/+370
| | | | | | | | | | | | | | | | | Summary: A WebAssemblyException object contains BBs that belong to a 'catch' part of the try-catch-end structure. Because CFGSort requires all the BBs within a catch part to be sorted together as it does for loops, this pass calculates the nesting structure of catch part of exceptions in a function. Now this assumes the use of Windows EH instructions. Reviewers: dschuff, majnemer Subscribers: jfb, mgorny, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D44134 llvm-svn: 335439
* [WebAssembly] Add WebAssemblyLateEHPrepare passHeejin Ahn2018-06-255-93/+388
| | | | | | | | | | | | | | | | Summary: Add WebAssemblyLateEHPrepare pass that does several small jobs for exception handling. This runs before CFGSort, and is different from WasmEHPrepare pass that runs before ISel, even though the names are similar. Reviewers: dschuff, majnemer Subscribers: sbc100, jgravelle-google, sunfish, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D46803 llvm-svn: 335438
* [X86] Simplify some code by using isOneConstant. NFCCraig Topper2018-06-251-2/+1
| | | | llvm-svn: 335437
* [X86] Remove the changes to combineScalarToVector made in r335037.Craig Topper2018-06-251-26/+3
| | | | | | They appear to be untested other than the test case for p37879.ll and I believe we should be using SimplifyDemandedElts here to handle these cases. llvm-svn: 335436
* [X86] Reduce the number of patterns needed for masked scalar ceil/floor isel.Craig Topper2018-06-251-35/+10
| | | | | | The scalar to vector on the mask register should not be part of the patterns. llvm-svn: 335435
* [mips][ias] Enable IAS by default for OpenBSD / FreeBSD mips64/mips64el.Brad Smith2018-06-241-0/+5
| | | | | | | | Reviewers: atanasyan Differential Review: https://reviews.llvm.org/D31557 llvm-svn: 335434
* [DAGCombiner] eliminate setcc bool math when input is low-bit of some valueSanjay Patel2018-06-241-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch has the same motivating example as D48466: define void @foo(i64 %x, i32 %c.0282.in, i32 %d.0280, i32* %ptr0, i32* %ptr1) { %c.0282 = and i32 %c.0282.in, 268435455 %a16 = lshr i64 32508, %x %a17 = and i64 %a16, 1 %tobool = icmp eq i64 %a17, 0 %. = select i1 %tobool, i32 1, i32 2 %.286 = select i1 %tobool, i32 27, i32 26 %shr97 = lshr i32 %c.0282, %. %shl98 = shl i32 %c.0282.in, %.286 %or99 = or i32 %shr97, %shl98 %shr100 = lshr i32 %d.0280, %. %shl101 = shl i32 %d.0280, %.286 %or102 = or i32 %shr100, %shl101 store i32 %or99, i32* %ptr0 store i32 %or102, i32* %ptr1 ret void } ...but I'm trying to kill the setcc bool math sooner rather than later. By matching a larger pattern that includes both the low-bit mask and the trailing add/sub, we can create a universally good fold because we always eliminate the condition code intermediate value. Here are Alive proofs for these (currently instcombine folds the 'add' variants, but misses the 'sub' patterns): https://rise4fun.com/Alive/Gsyp Name: sub of zext cmp mask %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %z = zext i1 %c to i32 %r = sub i32 C1, %z => %optional_cast = zext i8 %a to i32 %r = add i32 %optional_cast, C1-1 Name: add of zext cmp mask %a = and i32 %x, 1 %c = icmp eq i32 %a, 0 %z = zext i1 %c to i8 %r = add i8 %z, C1 => %optional_cast = trunc i32 %a to i8 %r = sub i8 C1+1, %optional_cast All of the tests look like improvements or neutral to me. But it is possible that x86 test+set+bitop is better than what we now show here. I suspect we could do better by adding another fold for the 'sub' variants. We start with select-of-constant in IR in the larger motivating test, so that's why I included tests with selects. Proofs for those variants: https://rise4fun.com/Alive/Bx1 Name: true const is bigger Pre: C2 == (C1 + 1) %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %r = select i1 %c, i64 C2, i64 C1 => %z = zext i8 %a to i64 %r = sub i64 C2, %z Name: false const is bigger Pre: C2 == (C1 + 1) %a = and i8 %x, 1 %c = icmp eq i8 %a, 0 %r = select i1 %c, i64 C1, i64 C2 => %z = zext i8 %a to i64 %r = add i64 C1, %z Differential Revision: https://reviews.llvm.org/D48466 llvm-svn: 335433
* [X86] Regroup some isel patterns. NFCCraig Topper2018-06-241-32/+22
| | | | | | For some reason the 64-bit patterns were separated from their 8/16/32-bit friends, but only for add/sub/mul. For and/or/xor they were together. llvm-svn: 335429
* [X86] Rename VFPCLASSSS and VFPCLASSSD internal instruction names to include ↵Craig Topper2018-06-243-14/+14
| | | | | | a Z to match other EVEX instructions. llvm-svn: 335428
* Add OpenBSD support to the Threading codeBrad Smith2018-06-231-3/+5
| | | | llvm-svn: 335426
* ADT: Use EBO to shrink SmallVector size 1Duncan P. N. Exon Smith2018-06-231-0/+4
| | | | | | | SmallVectorStorage is empty when its size is 1; use inheritance so that the empty base class optimization kicks in. llvm-svn: 335421
* [TableGen] Use WithColor for printing errors/warningsJonas Devlieghere2018-06-231-6/+3
| | | | | | Use the WithColor helper from support to print errors and warnings. llvm-svn: 335415
OpenPOWER on IntegriCloud