summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Irreducible loop metadata for more accurate block frequency under PGO.Hiroshi Yamauchi2017-11-022-0/+14
| | | | | | | | | | | | | | | | | | | | | | | Summary: Currently the block frequency analysis is an approximation for irreducible loops. The new irreducible loop metadata is used to annotate the irreducible loop headers with their header weights based on the PGO profile (currently this is approximated to be evenly weighted) and to help improve the accuracy of the block frequency analysis for irreducible loops. This patch is a basic support for this. Reviewers: davidxl Reviewed By: davidxl Subscribers: mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D39028 llvm-svn: 317278
* Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."Clement Courbet2017-11-022-8/+712
| | | | | | | | | undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage This reverts commit eea333c33fa73ad225ef28607795984829f65688. llvm-svn: 317213
* [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.Clement Courbet2017-11-022-712/+8
| | | | | | | | | | | | | | | | | Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around. See the discussion in PR33325 for more details. Reviewers: spatel Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D39456 llvm-svn: 317211
* [X86] Fix bug in legalize vector types - Split large loadsAyman Musa2017-11-021-1/+1
| | | | | | | | | | When splitting a large load to smaller legally-typed loads, the last load should be padded to reach the size of the previous one so a CONCAT_VECTORS node could reunite them again. The code currently pads the last load to reach the size of the first load (instead of the previous). Differential Revision: https://reviews.llvm.org/D38495 Change-Id: Ib60b55ed26ce901fabf68108daf52683fbd5013f llvm-svn: 317206
* [AsmPrinterDwarf] Add support for .cfi_restore directiveFrancis Visoiu Mistrih2017-11-025-0/+18
| | | | | | | | | | | | | | As of today we only use .cfi_offset to specify the offset of a CSR, but we never use .cfi_restore when the CSR is restored. If we want to perform a more advanced type of shrink-wrapping, we need to use .cfi_restore in order to switch the CFI state between blocks. This patch only aims at adding support for the directive. Differential Revision: https://reviews.llvm.org/D36114 llvm-svn: 317199
* Revert "Correct dwarf unwind information in function epilogue for X86"Petar Jovanovic2017-11-017-420/+14
| | | | | | | This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf buildbot failure (build #15606). llvm-svn: 317136
* Correct dwarf unwind information in function epilogue for X86Petar Jovanovic2017-11-017-14/+420
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D35844 llvm-svn: 317100
* [SelectionDAG] computeKnownBits - use ashrInPlace on known bits of ISD::SRA ↵Simon Pilgrim2017-11-011-11/+3
| | | | | | input. NFCI. llvm-svn: 317087
* [DAGCombiner] Fix typos in comments. NFCCraig Topper2017-11-011-2/+2
| | | | llvm-svn: 317072
* [codeview] Merge file checksum entries for DIFiles with the same absolute pathReid Kleckner2017-10-312-4/+5
| | | | | | | | Change the map key from DIFile* to the absolute path string. Computing the absolute path isn't expensive because we already have a map that caches the full path keyed on DIFile*. llvm-svn: 317041
* [CGP] Fix the detection of trivial case for addressing modeSerguei Katkov2017-10-311-10/+9
| | | | | | | The address can be presented as a bitcast of baseReg. In this case it is still trivial but OriginalValue != baseReg. llvm-svn: 316980
* [CGP] Fix crash on i96 bit multiplyPhilip Reames2017-10-301-1/+1
| | | | | | | | Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725 If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like. llvm-svn: 316967
* Fix unused variable warnings. NFCI.Simon Pilgrim2017-10-301-3/+0
| | | | llvm-svn: 316964
* [SelectionDAG] Tidyup computeKnownBits extension/truncation cases. NFCI.Simon Pilgrim2017-10-301-17/+4
| | | | | | We don't need to extend/truncate the Known structure before calling computeKnownBits - it will reset at the start of the function. llvm-svn: 316962
* Create instruction classes for identifying any atomicity of memory ↵Daniel Neilson2017-10-301-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | intrinsic. (NFC) Summary: For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html This patch fleshes out the instruction class hierarchy with respect to atomic and non-atomic memory intrinsics. With this change, the relevant part of the class hierarchy becomes: IntrinsicInst -> MemIntrinsicBase (methods-only class) -> MemIntrinsic (non-atomic intrinsics) -> MemSetInst -> MemTransferInst -> MemCpyInst -> MemMoveInst -> AtomicMemIntrinsic (atomic intrinsics) -> AtomicMemSetInst -> AtomicMemTransferInst -> AtomicMemCpyInst -> AtomicMemMoveInst -> AnyMemIntrinsic (both atomicities) -> AnyMemSetInst -> AnyMemTransferInst -> AnyMemCpyInst -> AnyMemMoveInst This involves some class renaming: ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst A script for doing this renaming in downstream trees is included below. An example of where the Any* classes should be used in LLVM is when reasoning about the effects of an instruction (ex: aliasing). --- Script for renaming AtomicMem* classes: PREFIXES="[<,([:space:]]" CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst" SUFFIXES="[;)>,[:space:]]" REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})" REGEX2="visitElementUnorderedAtomic(${CLASSES})" FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq ) SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g" SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g" for f in $FILES; do echo "Processing: $f" sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f done Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev Reviewed By: sanjoy Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38419 llvm-svn: 316950
* [SelectionDAG] Add VSELECT demanded elts support to computeKnownBits Simon Pilgrim2017-10-301-4/+4
| | | | llvm-svn: 316947
* [SelectionDAG] Add VSELECT support to computeKnownBits Simon Pilgrim2017-10-301-0/+1
| | | | llvm-svn: 316944
* [SelectionDAG] Add SELECT demanded elts support to ComputeNumSignBitsSimon Pilgrim2017-10-301-4/+5
| | | | llvm-svn: 316933
* [MC] Split out register def/use idx calls to make debugging simpler. NFCI.Simon Pilgrim2017-10-301-3/+4
| | | | llvm-svn: 316927
* [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).Clement Courbet2017-10-301-25/+26
| | | | | | | | | | | | - Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For examples, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905
* [GlobalISel|ARM] : Allow legalizing G_FSUBJaved Absar2017-10-301-0/+4
| | | | | | | | Adding support for VSUB. Reviewed by: @rovka Differential Revision: https://reviews.llvm.org/D39261 llvm-svn: 316902
* [SelectionDAG] Add SEXT/AND/XOR/Or demanded elts support to ComputeNumSignBitsSimon Pilgrim2017-10-291-7/+11
| | | | llvm-svn: 316875
* [SelectionDAG] Add SRA/SHL demanded elts support to ComputeNumSignBitsSimon Pilgrim2017-10-291-3/+29
| | | | | | Introduce a isConstOrDemandedConstSplat helper function that can recognise a constant splat build vector for at least the demanded elts we care about. llvm-svn: 316866
* [SelectionDAG] Add support for INSERT_SUBVECTOR to computeKnownBitsSimon Pilgrim2017-10-281-0/+34
| | | | llvm-svn: 316847
* [SelectionDAG] Support 'bit preserving' floating points bitcasts on ↵Simon Pilgrim2017-10-281-7/+15
| | | | | | | | | | | | computeKnownBits/ComputeNumSignBits For cases where we know the floating point representations match the bitcasted integer equivalent, allow bitcasting to these types. This is especially useful for the X86 floating point compare results which return all/zero bits but as a floating point type. Differential Revision: https://reviews.llvm.org/D39289 llvm-svn: 316831
* Add a few missing headers for modularization/IWYU/etcDavid Blaikie2017-10-271-1/+1
| | | | | | | Several cases where class definitions are required for DenseMap pointer traits handling. llvm-svn: 316803
* [DAGCombine] Don't combine sext with extload if sextload is not supported ↵Guozhi Wei2017-10-271-1/+5
| | | | | | | | | | | | | | | and extload has multi users In function DAGCombiner::visitSIGN_EXTEND_INREG, sext can be combined with extload even if sextload is not supported by target, then if sext is the only user of extload, there is no big difference, no harm no benefit. if extload has more than one user, the combined sextload may block extload from combining with other zext, causes extra zext instructions generated. As demonstrated by the attached test case. This patch add the constraint that when sextload is not supported by target, sext can only be combined with extload if it is the only user of extload. Differential Revision: https://reviews.llvm.org/D39108 llvm-svn: 316802
* [CodeGen] Fix -Wunused-private-field warning on lld-x86_64-darwin13.Clement Courbet2017-10-271-2/+0
| | | | llvm-svn: 316765
* [CodeGen][ExpandMemCmp][NFC] Simplify load sequence generation.Clement Courbet2017-10-271-40/+33
| | | | llvm-svn: 316763
* DAG: Fold fma (fneg x), K, y -> fma x, -K, yMatt Arsenault2017-10-271-0/+8
| | | | llvm-svn: 316753
* Add subclass data to the FoldingSetNode for MemIntrinsicSDNodes.Sean Fertile2017-10-271-0/+2
| | | | | | | | | | | Not having the subclass data on an MemIntrinsicSDNodes means it was possible to try to fold 2 nodes with the same operands but differing MMO flags. This would trip an assertion when trying to refine the alignment between the 2 MachineMemOperands. Differential Revision: https://reviews.llvm.org/D38898 llvm-svn: 316737
* Revert "[CGP] Merge empty case blocks if no extra moves are added."Balaram Makam2017-10-271-36/+11
| | | | | | This reverts commit r316711. The domtree isn't getting updated correctly. llvm-svn: 316721
* [CGP] Merge empty case blocks if no extra moves are added.Balaram Makam2017-10-261-11/+36
| | | | | | | | | | | | | | | | Summary: Currently we skip merging when extra moves may be added in the header of switch instead of the case block, if the case block is used as an incoming block of a PHI. If all the incoming values of the PHIs are non-constants and the destination block is dominated by the switch block then extra moves are likely not added by ISel, so there is no need to skip merging in this case. Reviewers: efriedma, junbuml, davidxl, hfinkel, qcolombet Reviewed By: efriedma Subscribers: dberlin, kuhar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37343 llvm-svn: 316711
* [MachineModuleInfoImpls] Replace qsort with array_pod_sortMandeep Singh Grang2017-10-261-10/+4
| | | | | | | | | | | | | | | | Summary: This seems to be the only place in llvm we directly call qsort. We can replace this with a call to array_pod_sort. Also minor cleanup of the sorting function. Reviewers: bkramer, Eugene.Zelenko, rafael Reviewed By: bkramer Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D39214 llvm-svn: 316671
* Tidy up CountingFunctionInserter a little. NFC.Hans Wennborg2017-10-261-8/+4
| | | | | | | Use StringRef for CountingFunctionName, remove erroneous comment copied from InstructionNamer, and drop some trailing whitespace. llvm-svn: 316644
* Make the combiner check if shifts are legal before creating themAditya Nandakumar2017-10-251-2/+3
| | | | | | | | | | | | Summary: Make sure shifts are legal/specified by the legalizerinfo before creating it Reviewers: qcolombet, dsanders, rovka, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39264 llvm-svn: 316602
* Re-land "[CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads ↵Clement Courbet2017-10-251-195/+227
| | | | | | | | | | | | | (1)" Compute the actual decomposition only after deciding whether to expand of not. Else, it's easy to make the compiler OOM with: `memcpy(dst, src, 0xffffffffffffffff);`, which typically happens if someone mistakenly passes a negative value. Add a test. This reverts commit f8fc02fbd4ab33383c010d33675acf9763d0bd44. llvm-svn: 316567
* [MachineScheduler] Minor refactoring.Jonas Paulsson2017-10-251-13/+18
| | | | | | | | | | | | | | | | Duplicated code found in three places put into a new static function: /// Given a Count of resource usage and a Latency value, return true if a /// SchedBoundary becomes resource limited. static bool checkResourceLimit(unsigned LFactor, unsigned Count, unsigned Latency) { return (int)(Count - (Latency * LFactor)) > (int)LFactor; } Review: Florian Hahn, Matthias Braun https://reviews.llvm.org/D39235 llvm-svn: 316560
* DAG: Fix creating select with wrong condition typeMatt Arsenault2017-10-251-1/+10
| | | | | | | | | | | | | | | | | | | This code added in r297930 assumed that it could create a select with a condition type that is just an integer bitcast of the selected type. For AMDGPU any vselect is going to be scalarized (although the vector types are legal), and all select conditions must be i1 (the same as getSetCCResultType). This logic doesn't really make sense to me, but there's never really been a consistent policy in what the select condition mask type is supposed to be. Try to extend the logic for skipping the transform for condition types that aren't setccs. It doesn't seem quite right to me though, but checking conditions that seem more sensible (like whether the vselect is going to be expanded) doesn't work since this seems to depend on that also. llvm-svn: 316554
* Implement salavageDebugInfo functionality for SelectionDAG.Adrian Prantl2017-10-242-0/+35
| | | | | | | | | | | | | | Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds a hook that can be invoked before an SDNode that is associated with an SDDbgValue is erased to capture the effect of the deleted node in a DIExpression. The motivating example is an SDDebugValue attached to an ADD operation that gets folded into a LOAD+OFFSET operation. rdar://problem/32121503 llvm-svn: 316525
* Revert "[CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads (1)"Martin Bohme2017-10-241-201/+196
| | | | | | | | This reverts commit r316417, which causes internal compiles to OOM. I don't unfortunately have a self-contained test case but will follow up with courbet. llvm-svn: 316497
* Use range-based for loop. NFCAdrian Prantl2017-10-241-5/+2
| | | | llvm-svn: 316496
* Use range-based-for. NFCAdrian Prantl2017-10-241-6/+5
| | | | llvm-svn: 316485
* MIR: Print the register class or bank in vreg defsJustin Bogner2017-10-241-12/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. llvm-svn: 316479
* Doxygenify comments.Adrian Prantl2017-10-241-26/+25
| | | | llvm-svn: 316466
* [SelectionDAG] Add VSELECT support to ComputeNumSignBitsSimon Pilgrim2017-10-241-0/+1
| | | | llvm-svn: 316457
* [CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads (1)Clement Courbet2017-10-241-196/+201
| | | | | | | | | | | | | | | | | | | | | | Refactor ExpandMemcmp: - Stop duplicating the logic for computation of the sequence of loads to generate (thsi was done in three different places), this is now done only once in MemCmpExpansion::MemCmpExpansion(). - Add a FIXME to expose a bug with the computation of the number of loads when not all sizes are loadable. For example, on X86-32 + SSE, possible loads are {16,4,2,1} bytes. The current code considers that all loads starting at MaxLoadSize are possible. This is not an issue right now as vector loads are not enabled, so I'm not fixing the issue here to keep the change as small as possible. I'm going to address this in a subsequent revision, where I enable vector loads. See https://bugs.llvm.org/show_bug.cgi?id=34887 Differential Revision: https://reviews.llvm.org/D38498 llvm-svn: 316417
* [MC] Adding code padding for performance stability - infrastructure. NFC.Omer Paparo Bivas2017-10-241-4/+29
| | | | | | | | | | | | | | | | | Infrastructure designed for padding code with nop instructions in key places such that preformance improvement will be achieved. The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known. This patch by itself in a NFC. Future patches will make use of this infrastructure to implement required policies for code padding. Reviewers: aaboud zvi craig.topper gadi.haber Differential revision: https://reviews.llvm.org/D34393 Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e llvm-svn: 316413
* [MachineOutliner] Add optimisation remarks for successful outliningJessica Paquette2017-10-231-36/+75
| | | | | | | | | | | | | | | | | | | This commit adds optimisation remarks for outlining which fire when a function is successfully outlined. To do this, OutlinedFunctions must now contain references to their Candidates. Since the Candidates must still be sorted and worked on separately, this is done by working on everything in terms of shared_ptrs to Candidates. This is good; it means that we can easily move everything to outlining in terms of the OutlinedFunctions rather than the individual Candidates. This is far more intuitive than what's currently there! (Remarks are output when a function is created for some group of Candidates. In a later commit, all of the outlining logic should be rewritten so that we loop over OutlinedFunctions rather than over Candidates.) llvm-svn: 316396
* Fix buildbot breakageGeorge Burgess IV2017-10-231-0/+1
| | | | | | SP is only used in an assert. Caused by r316374. llvm-svn: 316377
OpenPOWER on IntegriCloud