summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [CodeGen] Add dependency printerEvandro Menezes2017-07-121-35/+56
| | | | | | | | Add SDep printer to make debugging sessions more productive. Differential revision: https://reviews.llvm.org/D35144 llvm-svn: 307799
* Add element atomic memmove intrinsicDaniel Neilson2017-07-122-0/+65
| | | | | | | | | | | | | | Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size. Reviewers: eli.friedman, reames, mkazantsev, skatkov Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34884 llvm-svn: 307796
* Enhance synchscope representationKonstantin Zhuravlyov2017-07-1110-41/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
* [CodeGen] Rename DEBUG_TYPE to match passnamesEvandro Menezes2017-07-112-2/+2
| | | | | | | | | Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were absent from https://reviews.llvm.org/rL303921. Differential revision: https://reviews.llvm.org/D35231 llvm-svn: 307719
* Revert Revert [MBP] do not rotate loop if it creates extra branchSerguei Katkov2017-07-111-1/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a second attempt to land this patch. The first one resulted in a crash of clang sanitizer buildbot. The fix is here and regression test is added. This is a last fix for the corner case of PR32214. Actually this is not really corner case in general. We should not do a loop rotation if we create an additional branch due to it. Consider the case where we have a loop chain H, M, B, C , where H is header with viable fallthrough from pre-header and exit from the loop M - some middle block B - backedge to Header but with exit from the loop also. C - some cold block of the loop. Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch. Let's compute the change in number of branches: +1 branch from pre-header to header -1 branch from header to exit +1 branch from header to middle block if there is such -1 branch from cold bock to header if there is one So if C is not a predecessor of H then we introduce extra branch. This change actually prohibits rotation of the loop if both true Best Exit has next element in chain as successor. Last element in chain is not a predecessor of first element of chain. Reviewers: iteratee, xur, sammccall, chandlerc Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34745 llvm-svn: 307631
* [CGP] Relax a bit restriction for optimizeMemoryInst to extend scopeSerguei Katkov2017-07-111-2/+5
| | | | | | | | | | | | | | | | | | CodeGenPrepare::optimizeMemoryInst contains a check that we do nothing if all instructions combining the address for memory instruction is in the same block as memory instruction itself. However if any of these instruction are placed after memory instruction then address calculation will not be folded to memory instruction. The added test case shows an example. Reviewers: loladiro, spatel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34862 llvm-svn: 307628
* Revert "[DAG] Improve Aliasing of operations to static alloca"Matthias Braun2017-07-101-14/+6
| | | | | | | | | Reverting as it breaks tramp3d-v4 in the llvm test-suite. I added some comments to https://reviews.llvm.org/D33345 about it. This reverts commit r307546. llvm-svn: 307589
* Add DAG argument to canMergeStoresTo NFC.Nirav Dave2017-07-101-7/+9
| | | | llvm-svn: 307583
* [DAG] Improve Aliasing of operations to static allocaNirav Dave2017-07-101-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 307546
* fix typos in comments and error messages; NFCHiroshi Inoue2017-07-103-3/+3
| | | | llvm-svn: 307533
* [X86] Relax an assertion when legalizing vector types.Davide Italiano2017-07-091-0/+4
| | | | | | | | | | | | | | WidenVSELECTAndMask can fold (and it folds in this case) so we get a BUILD_VECTOR of constants as mask. convertMask() seems to work fine when the input is a vector of constants, and we still need to call it to extend/add elements at the end. but the current code just asserts on anything but a SETCC or AND/OR/XOR of 2xSETCC. This change was discussed briefly with Simon Pilgrim, who also suggests we might consider dropping this assertion in the future. Fixes PR33715. llvm-svn: 307508
* Handle ConstantExpr correctly in SelectionDAGBuilderSimon Pilgrim2017-07-092-8/+18
| | | | | | | | | | | | This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue. Fixes PR#33094. Submitted on behalf of @Praetonus (Benoit Vey) Differential Revision: https://reviews.llvm.org/D34538 llvm-svn: 307502
* [FastISel] fix a fallback diagnostic.Igor Breger2017-07-091-1/+2
| | | | | | | | | | | | | | Summary: FastISel was marked as failed in case instruction selection succeeded. Reviewers: qcolombet, zvi, rovka, ab Reviewed By: zvi Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D34438 llvm-svn: 307489
* fix trivial typos; NFCHiroshi Inoue2017-07-091-4/+4
| | | | | | sucessor -> successor llvm-svn: 307488
* [DAGCombiner] use local variable to shorten code; NFCISanjay Patel2017-07-071-36/+31
| | | | llvm-svn: 307429
* [RegAllocFast] Don't insert kill flags of super-register for partial killQuentin Colombet2017-07-071-2/+9
| | | | | | | | | | | | | | | | | When reusing a register for a new definition, the fast register allocator used to insert a kill flag at the previous last use of that register to inform later passes that this register is free between the redef and the last use. However, this may be wrong when subregisters are involved. Indeed, a partially redef would have trigger a kill of the full super register, potentially wrongly marking all the other subregisters as free. Given we don't track which lanes are still live, we cannot set the kill flag in such case. Note: This bug has been latent for about 7 years (r104056). llvmg.org/PR33677 llvm-svn: 307428
* [RegAllocFast] Add the proper initialize method to use the .mir infrastructureQuentin Colombet2017-07-072-0/+3
| | | | | | NFC llvm-svn: 307427
* RegisterScavenging: Fix PR33687Matthias Braun2017-07-071-2/+9
| | | | | | | | | When scavenging for a use in instruction MI, we will reload after that instruction and hence cannot spill uses/defs of this instruction. This fixes http://llvm.org/PR33687 llvm-svn: 307352
* LiveRegUnits: Rename accumulateBackward()->accumulate()Matthias Braun2017-07-072-2/+2
| | | | | | | | | Contrary to the stepForward()/stepBackward() method accumulate() doesn't have a direction as defs, uses and clobbers all have the same effect. Also improve the documentation comment. llvm-svn: 307351
* [MachineVerifier] Add check that tied physregs aren't different.Mikael Holmen2017-07-061-0/+8
| | | | | | | | | | | | | | | | Summary: Added MachineVerifier code to check register ties more thoroughly, especially so that physical registers that are tied are the same. This may help e.g. when creating MIR files. Original patch by Jesper Antonsson Reviewers: stoklund, sanjoy, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D34394 llvm-svn: 307259
* [RegisterCoalescer] Fix for SubRange join unreachableDavid Stuttard2017-07-061-0/+28
| | | | | | | | | | | | | | | | Summary: During remat, some subranges might end up having invalid segments which caused problems for later coalescing. Added in a check to remove segments that are invalidated as part of the remat. See http://llvm.org/PR33524 Subscribers: MatzeB, qcolombet Differential Revision: https://reviews.llvm.org/D34391 llvm-svn: 307247
* [ARM] GlobalISel: Legalize G_FCMP for s32Diana Picus2017-07-061-0/+2
| | | | | | | | | | | | | | | | | | | | | This covers both hard and soft float. Hard float is easy, since it's just Legal. Soft float is more involved, because there are several different ways to handle it based on the predicate: one and ueq need not only one, but two libcalls to get a result. Furthermore, we have large differences between the values returned by the AEABI and GNU functions. AEABI functions return a nice 1 or 0 representing true and respectively false. GNU functions generally return a value that needs to be compared against 0 (e.g. for ogt, the value returned by the libcall is > 0 for true). We could introduce redundant comparisons for AEABI as well, but they don't seem easy to remove afterwards, so we do different processing based on whether or not the result really needs to be compared against something (and just truncate if it doesn't). llvm-svn: 307243
* Fix libcall expansion creating DAG nodes with invalid type post type ↵Vadim Chugunov2017-07-052-12/+25
| | | | | | | | | | | | legalization. If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values. Patch by Jatin Bhateja and Eli Friedman. Differential Revision: https://reviews.llvm.org/D34240 llvm-svn: 307207
* {DAGCombiner] Fold (rot x, 0) -> xSimon Pilgrim2017-07-051-0/+4
| | | | llvm-svn: 307184
* [DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions ↵Andrew Zhogin2017-07-051-0/+19
| | | | | | | | | | into one with combined shift operand. For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize. Differential revision: https://reviews.llvm.org/D12833 llvm-svn: 307179
* [globalisel][tablegen] Finish fixing compile-time regressions by merging the ↵Daniel Sanders2017-07-051-149/+0
| | | | | | | | | | | | | | | | | | | | | | | matcher and emitter state machines. Summary: Also, made a few minor tweaks to shave off a little more cumulative memory consumption: * All rules share a single NewMIs instead of constructing their own. Only one will end up using it. * Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent GIM_RecordInsn from changing MIs[0]. Depends on D33764 Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33766 llvm-svn: 307159
* [GlobalISel] Refactor Legalizer helpers for libcallsDiana Picus2017-07-051-16/+20
| | | | | | | | | | We used to have a helper that replaced an instruction with a libcall. That turns out to be too aggressive, since sometimes we need to replace the instruction with at least two libcalls. Therefore, change our existing helper to only create the libcall and leave the instruction removal as a separate step. Also rename the helper accordingly. llvm-svn: 307149
* [MachineIRBuilder] Fix formatting. NFC.Diana Picus2017-07-051-1/+1
| | | | llvm-svn: 307144
* [MachineIRBuilder] Add buildOr helper. NFC.Diana Picus2017-07-051-0/+4
| | | | | | This isn't used anywhere yet, but I need it for a future commit. llvm-svn: 307141
* [GlobalIsel] allow x86_fp80 values to be dumped.Igor Breger2017-07-051-0/+8
| | | | | | | | | | | | | | | | Summary: Otherwise the fallback path fails with an assertion on x86_64 targets, when "x86_fp80" is encountered. Reviewers: t.p.northover, zvi, guyblank Reviewed By: zvi Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34975 llvm-svn: 307140
* [MachineIRBuilder] Add buildBinaryOp helper. NFCDiana Picus2017-07-051-29/+11
| | | | | | | Add a helper for building simple binary ops like add, mul, sub, and. This can be used in the future for quickly adding support for or, xor. llvm-svn: 307139
* [globalisel][tablegen] Fix an unused variable warning in release builds ↵Daniel Sanders2017-07-051-1/+1
| | | | | | after r307133 llvm-svn: 307138
* [globalisel][tablegen] Added instruction emission to the state-machine-based ↵Daniel Sanders2017-07-051-0/+153
| | | | | | | | | | | | | | | | | | | | | | | matcher. Summary: This further improves the compile-time regressions that will be caused by a re-commit of r303259. Also added included preliminary work in preparation for the multi-insn emitter since I needed to change the relevant part of the API for this patch anyway. Depends on D33758 Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33764 llvm-svn: 307133
* Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffsetNirav Dave2017-07-052-45/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Relanding after rewriting undef.ll test to avoid host-dependant endianness. As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks. Tests of note: * test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation. Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D34472 llvm-svn: 307114
* fix trivial typos in comments; NFCHiroshi Inoue2017-07-041-1/+1
| | | | llvm-svn: 307094
* [DAGCombiner] Intermediate variables in visitRotate promoted to the ↵Andrew Zhogin2017-07-041-6/+9
| | | | | | function's begin. NFC precommit for D12833. llvm-svn: 307091
* [FastISel][SelectionDAG]Teach fastISel about GC intrinsicsAnna Thomas2017-07-041-1/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: We are crashing in LLC at O0 when gc intrinsics are present in the block. The reason being FastISel performs basic block ISel by modifying GC.relocates to be the first instruction in the block. This can cause us to visit the GC relocate before it's corresponding GC.statepoint is visited, which is incorrect. When we lower the statepoint, we record the base and derived pointers, along with the gc.relocates. After this we can visit the gc.relocate. This patch avoids fastISel from incorrectly creating the block with gc.relocate as the first instruction. Reviewers: qcolombet, skatkov, qikon, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34421 llvm-svn: 307084
* [globalisel][tablegen] Partially fix compile-time regressions by converting ↵Daniel Sanders2017-07-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | matcher to state-machine(s) Summary: Replace the matcher if-statements for each rule with a state-machine. This significantly reduces compile time, memory allocations, and cumulative memory allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is recommitted. The following patches will expand on this further to fully fix the regressions. Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33758 llvm-svn: 307079
* [DAG] Fixed predicate for determining when two frame indicesNirav Dave2017-07-041-5/+5
| | | | | | addresses are comparable. NFCI. llvm-svn: 307055
* [legalize-types] Clean up softening machinery.Anton Yartsev2017-07-044-42/+89
| | | | | | | | The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265. Differential Revision: https://reviews.llvm.org/D31946 llvm-svn: 307053
* DAGCombine: Combine BUILD_VECTOR to TRUNCATEZvi Rackover2017-07-031-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series. Example: v8i32 V = ... v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6)) --> v4i32 truncate (bitcast V to v4i64) Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html. Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena Reviewed By: delena Subscribers: guyblank, delena, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34077 llvm-svn: 307036
* fix trivial typos in comments; NFCHiroshi Inoue2017-07-031-1/+1
| | | | llvm-svn: 307004
* [SelectionDAGBuilder] Use EVT::getVectorVT instead of MVT::getVectorVT to ↵Craig Topper2017-07-011-1/+1
| | | | | | prevent a crash if the type isn't a simple VT. llvm-svn: 306950
* [RegisterCoalescer] Account for instructions deleted by ↵Sameer AbuAsal2017-06-301-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | removePartialredunduncy and in WorkList Summary: removePartialRedundency optimization introduces a state in the RegisterCoalescer where an instruction pointed to in the WorkList is deleted from the MBB and then removed from the ErasedList. This patch updates the ErasedList to be used globally by not erasing erased Instructions from it to solve the problem. The patch also accounts for the case where an Instruction was previously deleted and the same memory was reused by BuildMI to create a new instruction. Reviewers: kparzysz, qcolombet Reviewed By: qcolombet Subscribers: MatzeB, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D34902 llvm-svn: 306915
* [ORE] Add diagnostics hotness thresholdBrian Gesiak2017-06-301-0/+8
| | | | | | | | | | | | | | | | | | | | Summary: Add an option to prevent diagnostics that do not meet a minimum hotness threshold from being output. When generating optimization remarks for large codebases with a ton of cold code paths, this option can be used to limit the optimization remark output at a reasonable size. Discussion of this change can be read here: http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html Reviewers: anemet, davidxl, hfinkel Reviewed By: anemet Subscribers: qcolombet, javed.absar, fhahn, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34867 llvm-svn: 306912
* [codeview] Use the first valid source location at the top of every MBBReid Kleckner2017-06-302-5/+18
| | | | | | | | | | | | | | | | If the instructions at the beginning of the block have no location, we're better off using the location of the first instruction in the current basic block. At the very least, that instruction post-dominates this one, whereas if we don't emit a .cv_loc directive, we end up using the potentially invalid location that falls through from the previous block. We could probably do better here by emitting some kind of ".cv_loc end" directive that stops the line table entry of the previous .cv_loc directive from bleeding out of its basic block. This would improve the line table when an entire MBB has no valid location info. llvm-svn: 306889
* GlobalISel: add G_IMPLICIT_DEF instruction.Tim Northover2017-06-303-1/+17
| | | | | | | | | It looks like there are two target-independent but not GISel instructions that need legalization, IMPLICIT_DEF and PHI. These are already anomalies since their operands have important LLTs attached, so to make things more uniform it seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF. llvm-svn: 306875
* Drop the LLVM mangler escape when printing the IR name in assembly commentsReid Kleckner2017-06-301-1/+3
| | | | | | | I'm tired of seeing this: .globl "?Test@@YAXXZ" # -- Begin function ^A?Test@@YAXXZ llvm-svn: 306855
* [ORE] Unify spelling as "diagnostics hotness"Brian Gesiak2017-06-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: To enable profile hotness information in diagnostics output, Clang takes the option `-fdiagnostics-show-hotness` -- that's "diagnostics", with an "s" at the end. Clang also defines `CodeGenOptions::DiagnosticsWithHotness`. LLVM, on the other hand, defines `LLVMContext::getDiagnosticHotnessRequested` -- that's "diagnostic", not "diagnostics". It's a small difference, but it's confusing, typo-inducing, and frustrating. Add a new method with the spelling "diagnostics", and "deprecate" the old spelling. Reviewers: anemet, davidxl Reviewed By: anemet Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D34864 llvm-svn: 306848
* Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"Nirav Dave2017-06-302-20/+45
| | | | | | | This reverts commit r306819 which appears be exposing underlying issues in a stage1 ppc64be build llvm-svn: 306820
OpenPOWER on IntegriCloud