summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* LegalizeDAG: Implement PROMOTE for ISD::BITREVERSETom Stellard2016-10-131-1/+2
| | | | | | | | | | | | | | | Summary: This operation is promoted the same way was ISD::BSWAP. This will prevent a regression in test/Target/AMDGOU/bitreverse.ll when i16 support is implemented. Reviewers: bogner, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25202 llvm-svn: 284163
* [safestack] Reapply r283248 after moving X86-targeted SafeStack tests intoDavid L Kreitzer2016-10-131-7/+6
| | | | | | | | | | | | the X86 subdirectory. Original commit message: Requires a valid TargetMachine to be passed to the SafeStack pass. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24896 llvm-svn: 284161
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-10-132-121/+272
| | | | | | | | | UseAA is enabled." This reverts commit r284151 which appears to be triggering a LTO failures on Hexagon llvm-svn: 284157
* [RAGreedy] Empty live-ranges always succeed in last chance recoloring.Quentin Colombet2016-10-131-1/+12
| | | | | | | | | | | Relax the constraint for empty live-ranges while doing last chance recoloring. Indeed, those live-ranges do not need an actual color to be fond for the recoloring to work. Empty live-range may happen as a result of splitting/spilling. Unfortunately no test case for in-tree targets. llvm-svn: 284152
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-10-132-272/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151
* [DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), ↵Simon Pilgrim2016-10-131-7/+6
| | | | | | Y) style combines llvm-svn: 284122
* [DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A foldingSimon Pilgrim2016-10-131-5/+5
| | | | llvm-svn: 284117
* [DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalizationSimon Pilgrim2016-10-131-1/+12
| | | | | | Improves commutation potential llvm-svn: 284113
* Handle lane masks in LivePhysRegs when adding live-insKrzysztof Parzyszek2016-10-121-5/+12
| | | | | | Differential Revision: https://reviews.llvm.org/D25533 llvm-svn: 284076
* Create llvm.addressofreturnaddress intrinsicAlbert Gutowski2016-10-124-2/+14
| | | | | | | | | | | | Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25293 llvm-svn: 284061
* [MIRParser] Parse lane masks for register live-insKrzysztof Parzyszek2016-10-124-24/+64
| | | | | | Differential Revision: https://reviews.llvm.org/D25530 llvm-svn: 284052
* Do not remove implicit defs in BranchFolderKrzysztof Parzyszek2016-10-122-55/+0
| | | | | | | | | | | Branch folder removes implicit defs if they are the only non-branching instructions in a block, and the branches do not use the defined registers. The problem is that in some cases these implicit defs are required for the liveness information to be correct. Differential Revision: https://reviews.llvm.org/D25478 llvm-svn: 284036
* BranchRelaxation: Unique live ins when creating blockMatt Arsenault2016-10-121-0/+1
| | | | llvm-svn: 284018
* [DAGCombiner] Update most ADD combines to support general vector combinesSimon Pilgrim2016-10-121-12/+54
| | | | | | | | Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors. Differential Revision: https://reviews.llvm.org/D25374 llvm-svn: 284015
* [DAGCombiner] Do not remove the load of stored values when optimizations are ↵Konstantin Zhuravlyov2016-10-121-1/+2
| | | | | | | | | | | | | | | | | | | | disabled This combiner breaks debug experience and should not be run when optimizations are disabled. For example: int main() { int j = 0; j += 2; if (j == 2) return 0; return 5; } When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function. Differential Revision: https://reviews.llvm.org/D19268 llvm-svn: 284014
* [DAG] Fix crash in build_vector -> vector_shuffle combineMichael Kuperstein2016-10-111-0/+5
| | | | | | | | Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider. llvm-svn: 283953
* MIRParser: allow types on registers with a RegBank.Tim Northover2016-10-111-1/+2
| | | | | | This fixes some GlobalISel regression tests. llvm-svn: 283936
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-113-41/+329
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934
* Silence -Wunused-but-set-variable warningArnold Schwaighofer2016-10-111-0/+1
| | | | llvm-svn: 283927
* [DAG] add fold for masked negated sign-extended boolSanjay Patel2016-10-111-5/+11
| | | | | | | This enhances the fold added with: https://reviews.llvm.org/rL283900 llvm-svn: 283905
* [DAG] add fold for masked negated extended boolSanjay Patel2016-10-111-2/+15
| | | | | | | | | | | | | | The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction. An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext. llvm-svn: 283900
* [DAG] simplify logic; NFCSanjay Patel2016-10-111-8/+6
| | | | llvm-svn: 283885
* [DAG] hoist DL(N) and fix formatting; NFCSanjay Patel2016-10-111-25/+32
| | | | llvm-svn: 283884
* [DAG] fix formatting; NFCSanjay Patel2016-10-111-72/+68
| | | | llvm-svn: 283878
* Fix formatting in findRegisterUseOperandIdx. NFC.Fraser Cormack2016-10-111-7/+5
| | | | llvm-svn: 283860
* Revert "Codegen: Tail-duplicate during placement."Daniel Jasper2016-10-113-330/+41
| | | | | | | | | This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857
* Fix warning; NFCMatthias Braun2016-10-111-2/+2
| | | | llvm-svn: 283851
* MIRParser: generic register operands with typesMatthias Braun2016-10-112-2/+3
| | | | | | This should fix the fallout of r283848. llvm-svn: 283850
* MIRParser: Rewrite register info initialization; mostly NFCMatthias Braun2016-10-114-108/+179
| | | | | | | | | | | | | | | | | | | | | | This changes MachineRegisterInfo to be initializes after parsing all instructions. This is in preparation for upcoming commits that allow the register class specification on the operand or deduce them from the MCInstrDesc. This commit removes the unused feature of having nonsequential register numbers. This was confusing anyway as the vreg numbers would be different after parsing when you had "holes" in your numbering. This patch also introduces the concept of an incomplete virtual register. An incomplete virtual register may be used during .mir parsing to construct MachineOperands without knowing the exact register class (or register bank) yet. NFC except for some error messages. Differential Revision: https://reviews.llvm.org/D22397 llvm-svn: 283848
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-113-41/+330
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842
* [RegAllocGreedy] Attempt to split unspillable live intervalsDylan McKay2016-10-112-6/+11
| | | | | | | | | | | | | | | | | | | | | | | Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 283838
* GlobalISel: select G_GLOBAL_VALUE uses on AArch64.Tim Northover2016-10-101-2/+2
| | | | llvm-svn: 283809
* [SelectionDAGBuilder] Support llvm.flt.rounds on targets where i32 is not legalHal Finkel2016-10-102-0/+15
| | | | | | | | | | | Add integer expansion for FLT_ROUNDS_ for targets where i32 is not a legal type. Patch by Edward Jones, thanks! Differential Revision: https://reviews.llvm.org/D24459 llvm-svn: 283797
* DAG: Setting Masked-Expand-Load as a variant of Masked-Load nodeElena Demikhovsky2016-10-092-7/+7
| | | | | | | | | | Masked-expand-load node represents load operation that loads a variable amount of elements from memory according to amount of "true" bits in the mask and expands the loaded elements according to their position in the mask vector. Right now, the node is used in intrinsics for VEXPAND* instructions. The work is done towards implementation of masked.expandload and masked.compressstore intrinsics. Differential Revision: https://reviews.llvm.org/D25322 llvm-svn: 283694
* Target: Remove unused entities.Peter Collingbourne2016-10-092-30/+0
| | | | llvm-svn: 283690
* Turn cl::values() (for enum) from a vararg function to using C++ variadic ↵Mehdi Amini2016-10-086-16/+9
| | | | | | | | | | | | | | | template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-083-326/+41
| | | | | | This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-073-41/+326
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619
* swifterror: Don't compute swifterror vregs during instruction selectionArnold Schwaighofer2016-10-074-152/+184
| | | | | | | | | | | | | | | | | | | | | | | | | The code used llvm basic block predecessors to decided where to insert phi nodes. Instruction selection can and will liberally insert new machine basic block predecessors. There is not a guaranteed one-to-one mapping from pred. llvm basic blocks and machine basic blocks. Therefore the current approach does not work as it assumes we can mark predecessor machine basic block as needing a copy, and needs to know the set of all predecessor machine basic blocks to decide when to insert phis. Instead of computing the swifterror vregs as we select instructions, propagate them at the end of instruction selection when the MBB CFG is complete. When an instruction needs a swifterror vreg and we don't know the value yet, generate a new vreg and remember this "upward exposed" use, and reconcile this at the end of instruction selection. This will only happen if the target supports promoting swifterror parameters to registers and the swifterror attribute is used. rdar://28300923 llvm-svn: 283617
* [DAG] clean up foldSelectOfConstants(); NFCISanjay Patel2016-10-071-19/+14
| | | | | | | Rename variables, simplify logic. Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too. llvm-svn: 283613
* [DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFCSanjay Patel2016-10-071-16/+31
| | | | | | We're missing at least 3 other similar folds based on what we have in InstCombine. llvm-svn: 283596
* Invoke add-discriminator at -g0 -fsample-profileDehao Chen2016-10-072-2/+3
| | | | | | | | | | | | Summary: -fsample-profile needs discriminator, which will not be added if built with -g0. This patch makes sure the discriminator is added for sample-profile at -g0. A followup patch will be send out to update clang tests. Reviewers: davidxl, dblaikie, echristo, dnovillo Subscribers: mehdi_amini, probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D25132 llvm-svn: 283565
* Only track physical registers in LivePhysRegsKrzysztof Parzyszek2016-10-071-3/+3
| | | | llvm-svn: 283561
* Delete some dead code in SelectionDAG (NFC)Vedant Kumar2016-10-063-59/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D24435 llvm-svn: 283505
* Preserve the debug location when CodeGenPrepare sinks a compare instruction ↵Wolfgang Pieb2016-10-061-0/+2
| | | | | | | | | | | | into the basic block of a user. Patch by Andrea DiBiagio. Differential Revision: https://reviews.llvm.org/D24632 llvm-svn: 283500
* Handle *_EXTEND_VECTOR_INREG during Integer LegalizationPirama Arumuga Nainar2016-10-062-0/+25
| | | | | | | | | | | | | | | | Summary: These nodes need legalization for 3-element vectors. This commit handles the legalization and adds tests for zext and sext. This fixes PR30614. Reviewers: RKSimon, srhines Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25268 llvm-svn: 283496
* [DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputsMichael Kuperstein2016-10-061-135/+191
| | | | | | | | | | | | | | | | This generalizes the build_vector -> vector_shuffle combine to support any number of inputs. The idea is to create a binary tree of shuffles, where the first layer performs pairwise shuffles of the input vectors placing each input element into the correct lane, and the rest of the tree blends these shuffles together. This doesn't try to be smart and create any sort of "optimal" shuffles. The assumption is that even a "poor" shuffle sequence is better than extracting and inserting the elements one by one. Differential Revision: https://reviews.llvm.org/D24683 llvm-svn: 283480
* BranchRelaxation: Support expanding unconditional branchesMatt Arsenault2016-10-061-5/+84
| | | | | | | AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464
* BranchRelaxation: Account for function alignmentMatt Arsenault2016-10-061-9/+18
| | | | llvm-svn: 283462
* Move AArch64BranchRelaxation to generic codeMatt Arsenault2016-10-063-0/+408
| | | | llvm-svn: 283459
OpenPOWER on IntegriCloud