summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Utils/BypassSlowDivision.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [BypassSlowDivision][CodeGenPrepare] avoid crashing on unused code (PR43514)Sanjay Patel2019-10-011-2/+6
| | | | | | https://bugs.llvm.org/show_bug.cgi?id=43514 llvm-svn: 373394
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [BypassSlowDivision] Teach bypass slow division not to interfere with div by ↵Craig Topper2018-08-211-0/+9
| | | | | | | | | | | | | | constant where constants have been constant hoisted, but not moved from their basic block DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block. it will be hoisted, but not leave the block. Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all. This fixes the case from PR38649. Differential Revision: https://reviews.llvm.org/D51000 llvm-svn: 340303
* Move Analysis/Utils/Local.h back to TransformsDavid Blaikie2018-06-041-1/+1
| | | | | | | | | | Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-1/+1
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* Fix a couple of layering violations in TransformsDavid Blaikie2018-03-211-1/+1
| | | | | | | | | | | | | Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165
* [BypassSlowDivision] Improve our handling of divisions by constantsSanjoy Das2017-12-041-7/+13
| | | | | | | | | | | | | | | | | | | (This reapplies r314253. r314253 was reverted on r314482 because of a correctness regression on P100, but that regression was identified to be something else.) Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow . This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 319677
* [Transforms] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-10-171-11/+24
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 316034
* Revert "[BypassSlowDivision] Improve our handling of divisions by constants"Sanjoy Das2017-09-291-13/+7
| | | | | | | This reverts commit r314253. It causes a miscompile on P100 in an internal benchmark. Reverting while I investigate. llvm-svn: 314482
* [BypassSlowDivision] Improve our handling of divisions by constantsSanjoy Das2017-09-261-7/+13
| | | | | | | | | | | | | | | Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow . This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 314253
* [BypassSlowDivision] move map helper code to header; NFCSanjay Patel2017-08-241-34/+2
| | | | | | | | We can reuse this code with other div/rem transforms as shown in: https://reviews.llvm.org/D31037 https://bugs.llvm.org/show_bug.cgi?id=31028 llvm-svn: 311661
* [KnownBits] Add bit counting methods to KnownBits struct and use them where ↵Craig Topper2017-05-121-2/+2
| | | | | | | | | | | | possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925
* [ValueTracking] Introduce a KnownBits struct to wrap the two APInts for ↵Craig Topper2017-04-261-4/+5
| | | | | | | | | | | | | | | | computeKnownBits This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit. Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch. I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases. Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with. Differential Revision: https://reviews.llvm.org/D32376 llvm-svn: 301432
* [BypassSlowDivision] Do not bypass division of hash-like valuesNikolai Bozhenov2017-04-021-12/+81
| | | | | | | | | | | | | | | | | Disable bypassing if one of the operands looks like a hash value. Slow division often occurs in hashtable implementations and fast division is never taken there because a hash value is extremely unlikely to have enough upper bits set to zero. A value is considered to be hash-like if it is produced by 1) XOR operation 2) Multiplication by a constant wider than the shorter type 3) PHI node with all incoming values being hash-like Differential Revision: https://reviews.llvm.org/D28200 llvm-svn: 299329
* [BypassSlowDivision] Use ValueTracking to simplify run-time checksNikolai Bozhenov2017-03-021-29/+108
| | | | | | | | | | | | | | | | | | | | | | | | ValueTracking is used for more thorough analysis of operands. Based on the analysis, either run-time checks can be simplified (e.g. check only one operand instead of two) or the transformation can be avoided. For example, it is quite often the case that a divisor is promoted from a shorter type and run-time checks for it are redundant. With additional compile-time analysis of values, two special cases naturally arise and are addressed by the patch: 1) Both operands are known to be short enough. Then, the long division can be simply replaced with a short one without CFG modification. 2) If a division is unsigned and the dividend is known to be short then the long division is not needed at all. Because if the divisor is too big for short division then the quotient is obviously zero (and the remainder is equal to the dividend). Actually, the division is not needed when (divisor > dividend). Differential Revision: https://reviews.llvm.org/D29897 llvm-svn: 296832
* [BypassSlowDivision] Refactor fast division insertion logic (NFC)Nikolai Bozhenov2017-03-021-160/+218
| | | | | | | | | | The most important goal of the patch is to break large insertFastDiv function into separate pieces, so that later a different fast insertion logic can be implemented using some of these pieces. Differential Revision: https://reviews.llvm.org/D29896 llvm-svn: 296828
* [BypassSlowDivision] Handle division by constant numerators better.Justin Lebar2016-11-161-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We don't do BypassSlowDivision when the denominator is a constant, but we do do it when the numerator is a constant. This patch makes two related changes to BypassSlowDivision when the numerator is a constant: * If the numerator is too large to fit into the bypass width, don't bypass slow division (because we'll never run the smaller-width code). * If we bypass slow division where the numerator is a constant, don't OR together the numerator and denominator when determining whether both operands fit within the bypass width. We need to check only the denominator. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D26699 llvm-svn: 287062
* [BypassSlowDivision] Simplify partially-tautological if statement.Justin Lebar2016-11-161-4/+3
| | | | | | if (A || (B && A)) --> if (A). llvm-svn: 287061
* Don't leave unused divs/rems sitting around in BypassSlowDivision.Justin Lebar2016-10-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This "pass" eagerly creates div and rem instructions even when only one is needed -- it relies on a later pass (machine DCE?) to clean them up. This is problematic not just from a cleanliness perspective (this pass is running during CodeGenPrepare, so should leave the IR in a better state), but it also creates a problem for instruction selection. If we always have a div+rem, isel will always select a divrem instruction (if possible), even when a single div or rem would do. Specifically, in NVPTX, we want to compute rem from the output of div, if available. But if a div is not available, we want to leave the rem alone. This transformation is overeager if div is always available. Because this code runs as part of CodeGenPrepare, it's nontrivial to write a test for this change. But this will effectively be tested by a later patch which adds the aforementioned change to NVPTX isel. Reviewers: tra Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26088 llvm-svn: 285460
* Don't claim the udiv created in BypassSlowDivision is exact.Justin Lebar2016-10-281-2/+1
| | | | | | | | | | | | | | | | | | | Summary: In BypassSlowDivision's short-dividend path, we would create e.g. udiv exact i32 %a, %b "exact" here means that we are asserting that %a is a multiple of %b. But we have no reason to believe this must be true -- this is just a bug, as far as I can tell. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D26097 llvm-svn: 285459
* Clarify that the bypassSlowDivision optimization operates on a single BB [v2]Eric Christopher2016-01-041-56/+44
| | | | | | | | | | | | | | | | Update some comments to be more explicit. Change bypassSlowDivision and the functions it calls so that they take BasicBlock*s and Instruction*s, rather than Function::iterator&s and BasicBlock::iterator&s. Change the APIs so that the caller is responsible for updating the iterator, rather than the callee. This makes control flow much easier to follow. Patch by Justin Lebar! llvm-svn: 256789
* TransformUtils: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-131-3/+3
| | | | | | | | | | | Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-2/+2
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-2/+2
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* [C++] Use 'nullptr'. Transforms edition.Craig Topper2014-04-251-2/+2
| | | | llvm-svn: 207196
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | | | | | | definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
* Bypass Slow DividesPreston Gurd2013-03-041-2/+2
| | | | | | | | | | | | | * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-3/+3
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-3/+3
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* This patch corrects commit 165126 by using an integer bit width instead of Preston Gurd2012-10-041-9/+8
| | | | | | | | a pointer to a type, in order to remove the uses of getGlobalContext(). Patch by Tyler Nowicki. llvm-svn: 165255
* This Patch corrects a problem whereby the optimization to use a faster dividePreston Gurd2012-10-031-5/+15
| | | | | | | | | | | | | | instruction (for Intel Atom) was not being done by Clang, because the type context used by Clang is not the default context. It fixes the problem by getting the global context types for each div/rem instruction in order to compare them against the types in the BypassTypeMap. Tests for this will be done as a separate patch to Clang. Patch by Tyler Nowicki. llvm-svn: 165126
* Stylistic and 80-col fixesEvan Cheng2012-09-141-1/+1
| | | | llvm-svn: 163940
* Move bypassSlowDivision into the llvm namespace.Benjamin Kramer2012-09-101-4/+6
| | | | llvm-svn: 163503
* BypassSlowDivision: Assign to reference, don't copy the object.Jakub Staszak2012-09-041-2/+2
| | | | llvm-svn: 163179
* Fix my previous patch (r163164). It does now what it is supposed to do:Jakub Staszak2012-09-041-1/+0
| | | | | | Doesn't set MadeChange to TRUE if BypassSlowDivision doesn't change anything. llvm-svn: 163165
* Return false if BypassSlowDivision doesn't change anything.Jakub Staszak2012-09-041-33/+34
| | | | | | | | | | Also a few minor changes: - use pre-inc instead of post-inc - use isa instead of dyn_cast - 80 col - trailing spaces llvm-svn: 163164
* Generic Bypass Slow DivPreston Gurd2012-09-041-0/+251
- CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presents of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150
OpenPOWER on IntegriCloud