summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [BasicAA] Use PhiValuesAnalysis if available when handling phi aliasJohn Brawn2018-07-301-25/+67
| | | | | | | | | | | | | | | | By using PhiValuesAnalysis we can get all the values reachable from a phi, so we can be more precise instead of giving up when a phi has phi operands. We can't make BaseicAA directly use PhiValuesAnalysis though, as the user of BasicAA may modify the function in ways that PhiValuesAnalysis can't cope with. For this optional usage to work correctly BasicAAWrapperPass now needs to be not marked as CFG-only (i.e. it is now invalidated even when CFG is preserved) due to how the legacy pass manager handles dependent passes being invalidated, namely the depending pass still has a pointer to the now-dead dependent pass. Differential Revision: https://reviews.llvm.org/D44564 llvm-svn: 338242
* [GVNHoist] Re-enable GVNHoist by defaultAlexandros Lamprineas2018-07-302-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | My initial motivation for this came from https://reviews.llvm.org/D48122, where it was pointed out that my change didn't fit well in SimplifyCFG and therefore using GVNHoist was a better way to go. GVNHoist has been disabled for a while as there was a list of bugs related to it. I have fixed the following bugs: https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149) https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674) https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680) The next two bugs no longer occur, and it's unclear which commit fixed them: https://bugs.llvm.org/show_bug.cgi?id=36635 https://bugs.llvm.org/show_bug.cgi?id=37791 I investigated this one and proved to be unrelated to GVNHoist, but a genuine bug in NewGvn: https://bugs.llvm.org/show_bug.cgi?id=37660 To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM. Merging this change now in order to make it to the LLVM 7.0.0 branch. Differential Revision: https://reviews.llvm.org/D49858 llvm-svn: 338240
* [MachineOutliner][X86] Use TAILJMPd64 instead of JMP_1 for TailCall constructionFrancis Visoiu Mistrih2018-07-301-1/+1
| | | | | | | | | | | | | | | | | | The machine verifier asserts with: Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542. It calls analyzeBranch which tries to call getMBB if the opcode is JMP_1, but in this case we do: JMP_1 @OUTLINED_FUNCTION I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used with brtarget8. Differential Revision: https://reviews.llvm.org/D49299 llvm-svn: 338237
* Revert "[X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'."Dean Michael Berris2018-07-301-7/+1
| | | | | | This reverts commit r338204. llvm-svn: 338236
* AMDGPU: Force skip over s_sendmsg and exp instructionsNicolai Haehnle2018-07-303-20/+35
| | | | | | | | | | | | | | | | | | | | | Summary: These instructions interact with hardware blocks outside the shader core, and they can have "scalar" side effects even when EXEC = 0. We don't want these scalar side effects to occur when all lanes want to skip these instructions, so always add the execz skip branch instruction for basic blocks that contain them. Also ensure that we skip scalar stores / atomics, though we don't code-gen those yet. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48431 Change-Id: Ieaeb58352e2789ffd64745603c14970c60819d44 llvm-svn: 338235
* [ARM] Fix over-alignment in arguments that are HA of 128-bit vectorsPetr Pavlu2018-07-301-5/+6
| | | | | | | | | | | | | | | | | | | | Code in `CC_ARM_AAPCS_Custom_Aggregate()` is responsible for handling homogeneous aggregates for `CC_ARM_AAPCS_VFP`. When an aggregate ends up fully on stack, the function tries to pack all resulting items of the aggregate as tightly as possible according to AAPCS. Once the first item was laid out, the alignment used for consecutive items was the size of one item. This logic went wrong for 128-bit vectors because their alignment is normally only 64 bits, and so could result in inserting unexpected padding between the first and second element. The patch fixes the problem by updating the alignment with the item size only if this results in reducing it. Differential Revision: https://reviews.llvm.org/D49720 llvm-svn: 338233
* [RegisterScavenger] Fix debug printKarl-Johan Karlsson2018-07-301-1/+2
| | | | llvm-svn: 338231
* [NFC] Prepare GuardWidening for widening of cond branchesMax Kazantsev2018-07-301-27/+61
| | | | llvm-svn: 338229
* Try to fix build.Zachary Turner2018-07-301-5/+5
| | | | llvm-svn: 338227
* [MS Demangler] Demangle symbols in function scopes.Zachary Turner2018-07-302-12/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are a couple of issues you run into when you start getting into more complex names, especially with regards to function local statics. When you've got something like: int x() { static int n = 0; return n; } Then this needs to demangle to something like int `int __cdecl x()'::`1'::n The nested mangled symbols (e.g. `int __cdecl x()` in the above example) also share state with regards to back-referencing, so we need to be able to re-use the demangler in the middle of demangling a symbol while sharing back-ref state. To make matters more complicated, there are a lot of ambiguities when demangling a symbol's qualified name, because a function local scope pattern (usually something like `?1??name?`) looks suspiciously like many other possible things that can occur, such as `?1` meaning the second back-ref and disambiguating these cases is rather interesting. The `?1?` in a local scope pattern is actually a special case of the more general pattern of `? + <encoded number> + ?`, where "encoded number" can itself have embedded `@` symbols, which is a common delimeter in mangled names. So we have to take care during the disambiguation, which is the reason for the overly complicated `isLocalScopePattern` function in this patch. I've added some pretty obnoxious tests to exercise all of this, which exposed several other problems related to back-referencing, so those are fixed here as well. Finally, I've uncommented some tests that were previously marked as `FIXME`, since now these work. Differential Revision: https://reviews.llvm.org/D49965 llvm-svn: 338226
* [DAGCombiner] Remove unnecessary calls to AddToWorklist.Craig Topper2018-07-291-46/+8
| | | | | | | | The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found Through this operand check. These means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes. I've removed the most obvious cases here. There are probably more than can be removed. llvm-svn: 338222
* [InstCombine] try to fold 'add+sub' to 'not+add'Sanjay Patel2018-07-291-0/+8
| | | | | | | | | | | | | These are reassociated versions of the same pattern and similar transforms as in rL338200 and rL338118. The motivation is identical to those commits: Patterns with add/sub combos can be improved using 'not' ops. This is better for analysis and may lead to follow-on transforms because 'xor' and 'add' are commutative/associative. It can also help codegen. llvm-svn: 338221
* [MS Demangler] NFC - Remove state from Demangler class.Zachary Turner2018-07-291-152/+152
| | | | | | | | | | | We need to be able to initiate a nested demangling from inside of an "outer" demangling. These need to be able to share some state, such as back-references. As a result, we can't store things like the output stream or the mangled name in the Demangler class, since each demangling will have different values. So remove this state and pass it through the necessary methods. llvm-svn: 338219
* [InstSimplify] fold funnel shifts with 0-shift amountSanjay Patel2018-07-291-0/+13
| | | | llvm-svn: 338218
* [dsymutil] Simplify temporary file handling.Jonas Devlieghere2018-07-292-4/+8
| | | | | | | | | | | Dsymutil's update functionality was broken on Windows because we tried to rename a file while we're holding open handles to that file. TempFile provides a solution for this through its keep(Twine) method. This patch changes dsymutil to make use of that functionality. Differential revision: https://reviews.llvm.org/D49860 llvm-svn: 338216
* [InstSimplify] refactor intrinsic simplifications; NFCISanjay Patel2018-07-291-134/+116
| | | | llvm-svn: 338215
* revert r338206 because the test does not passSanjay Patel2018-07-291-23/+9
| | | | | | | Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214
* [AVR] Re-enable expansion of ADDE/ADDC/SUBE/SUBC in ISelDylan McKay2018-07-291-0/+7
| | | | | | | | | This was disabled in r333748, which broke four tests. In the future, these need to be updated to UADDO/ADDCARRY or USUBO/SUBCARRY. llvm-svn: 338212
* [AArch64][SVE] Asm: Support for WHILE(LE|LO|LS|LT) instructions.Sander de Smalen2018-07-292-0/+45
| | | | | | | | | | | | | | | | | | | | The WHILE instructions generate a predicate that is true while the comparison of the first scalar operand (incremented for each predicate element) with the second scalar operand is true and false thereafter. WHILELE While incrementing signed scalar less than or equal to scalar WHILELO While incrementing unsigned scalar lower than scalar WHILELS While incrementing unsigned scalar lower than or same as scalar WHILELT While incrementing signed scalar less than scalar e.g. whilele p0.s, x0, x1 generates predicate p0 (for 32bit elements) by incrementing (signed) x0 and comparing that vector to splat(x1). llvm-svn: 338211
* [AArch64][SVE] Asm: Instructions to perform serialized operations.Sander de Smalen2018-07-292-0/+63
| | | | | | | | | | | | The instructions added in this patch permit active elements within a vector to be processed sequentially without unpacking the vector. PFIRST Set the first active element to true. PNEXT Find next active element in predicate. CTERMEQ Compare and terminate loop when equal. CTERMNE Compare and terminate loop when not equal. llvm-svn: 338210
* [MS Demangler] Refactor some of the name parsing code.Zachary Turner2018-07-281-181/+246
| | | | | | | | | | | | | | There are some very subtle differences between how one should parse symbol names and type names. They differ with respect to back-referencing, the set of legal values that can appear as the unqualified portion, and various other aspects. By separating the parsing code into separate paths, we can remove a lot of ambiguity during the demangling process, which is necessary for demangling more complicated things like function local statics, nested classes, and lambdas. llvm-svn: 338207
* Fix crash on inline asm with 64bit matching input in 32bit GPRThomas Preud'homme2018-07-281-9/+23
| | | | | | | | | Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206
* [SelectionDAG] Pass std::vector by reference instead of by pointer to ↵Craig Topper2018-07-282-18/+14
| | | | | | | | | | BuildSDIV/BuildUDIV. This removes the need for an assert to ensure the pointer isn't null. Years ago we had ifs the checked the pointer was non-null before very access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here. llvm-svn: 338205
* [X86] Correct the immediate cost for 'add/sub i64 %x, 0x80000000'.Craig Topper2018-07-281-1/+7
| | | | | | X86 normally requires immediates to be a signed 32-bit value which would exclude i64 0x80000000. But for add/sub we can negate the constant and use the opposite instruction. llvm-svn: 338204
* [X86] Use alignTo and divideCeil to make some code more readable. NFCCraig Topper2018-07-281-3/+3
| | | | llvm-svn: 338203
* [InstCombine] try to fold 'sub' to 'not'Sanjay Patel2018-07-281-1/+7
| | | | | | | | | | | https://rise4fun.com/Alive/jDd Patterns with add/sub combos can be improved using 'not' ops. This is better for analysis and may lead to follow-on transforms because 'xor' and 'add' are commutative/associative. It can also help codegen. llvm-svn: 338200
* [AArch64][SVE] Asm: Support for PFALSE and PTEST instructions.Sander de Smalen2018-07-282-0/+45
| | | | | | | | This patch adds PFALSE (unconditionally sets all elements of the predicate to false) and PTEST (set the status flags for the predicate). llvm-svn: 338198
* AMDGPU: Stop wasting argument registers with v3i32/v3f32Matt Arsenault2018-07-282-0/+59
| | | | | | | | | | SelectionDAGBuilder widens v3i32/v3f32 arguments to to v4i32/v4f32 which consume an additional register. In addition to wasting argument space, this produces extra instructions since now it appears the 4th vector component has a meaningful value to most combines. llvm-svn: 338197
* [AArch64][SVE] Asm: Data-dependent loop predicate partitioning instructions.Sander de Smalen2018-07-282-0/+100
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for instructions that partition a predicate based on data-dependent termination conditions in a loop. BRKA Break after the first true condition BRKAS Break after the first true condition, setting condition flags BRKB Break before the first true condition BRKBS Break before the first true condition, setting condition flags BRKPA Break after the first true condition, propagating from the previous partition BRKPAS Break after the first true condition, propagating from the previous partition, setting condition flags BRKPB Break before the first true condition, propagating from the previous partition BRKPBS Break before the first true condition, propagating from the previous partition, setting condition flags BRKN Propagate break to next partition BKRNS Propagate break to next partition, setting condition flags llvm-svn: 338196
* DAG: Add calling convention argument to calling convention funcsMatt Arsenault2018-07-2819-124/+159
| | | | | | | | This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194
* AMDGPU: Stop trying to extend arguments for cloverMatt Arsenault2018-07-282-31/+1
| | | | | | | This was trying to replace i8/i16 arguments with i32, which was broken and no longer necessary. llvm-svn: 338193
* [GlobalOpt] Test array indices inside structs for out-of-bounds accessesDavid Green2018-07-281-71/+47
| | | | | | | | | | | | | | | | | | We now, from clang, can turn arrays of static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0}; into structs of the form @g_data = internal global <{ [8 x i16], [8 x i16] }> ... GlobalOpt will incorrectly SROA it, not realising that the access to the first element may overflow into the second. This fixes it by checking geps more thoroughly. I believe this makes the globalsra-partial.ll test case invalid as the %i value could be out of bounds. I've re-purposed it as a negative test for this case. Differential Revision: https://reviews.llvm.org/D49816 llvm-svn: 338192
* [InstCombine] Fold Select with AND/OR conditionDavid Bolvansky2018-07-281-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fold ``` %A = icmp ne i8 %X, %V1 %B = icmp ne i8 %X, %V2 %C = or i1 %A, %B %D = select i1 %C, i8 %X, i8 %V1 ret i8 %D => ret i8 %X Fixes https://bugs.llvm.org/show_bug.cgi?id=38334 Proof: https://rise4fun.com/Alive/plI8 Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D49919 llvm-svn: 338191
* [demangler] Fix an oss-fuzz bug from r338138Erik Pilkington2018-07-281-0/+8
| | | | | | | | Stack overflow on invalid. While collapsing references, we were skipping over a cycle check in ForwardTemplateReference leading to a stack overflow. This commit fixes the problem by duplicating the cycle check in ReferenceType. llvm-svn: 338190
* [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)Craig Topper2018-07-281-0/+6
| | | | | | | | This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
* [SimpleLoopUnswitch] Fix DT updates for trivial branch unswitching.Alina Sbirlea2018-07-281-2/+4
| | | | | | | | | | | | | | | Summary: Fixing 2 issues with the DT update in trivial branch switching, though I don't have a case where DT update fails. 1. After splitting ParentBB->UnswitchedBB edge, new edges become: ParentBB->LoopExitBB->UnswitchedBB, so remove ParentBB->LoopExitBB edge. 2. AFAIU, for multiple CFG changes, DT should be updated using batch updates, vs consecutive addEdge and removeEdge calls. Reviewers: chandlerc, kuhar Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D49925 llvm-svn: 338180
* Revert "[WebAssembly] Added default stack-only instruction mode for MC."Wouter van Oortmerssen2018-07-275-475/+252
| | | | | | | This reverts commit d3c9af4179eae7793d1487d652e2d4e23844555f. (SVN revision 338164) llvm-svn: 338176
* [Support] Remove unnecessary MemoryBuffer::anchor (where the destructor ↵Fangrui Song2018-07-271-2/+1
| | | | | | serves as the key function) llvm-svn: 338175
* [X86] Add support expanding multiplies by constant where the constant is ↵Craig Topper2018-07-271-18/+31
| | | | | | | | -3/-5/-9 multplied by a power of 2. These can be replaced with an LEA, a shift, and a negate. This seems to match what gcc and icc would do. llvm-svn: 338174
* [InstrProf] Don't register __llvm_profile_runtime_userReid Kleckner2018-07-271-1/+1
| | | | | | | | Refactor some FileCheck prefixes while I'm at it. Fixes PR38340 llvm-svn: 338172
* [WebAssembly] Added default stack-only instruction mode for MC.Wouter van Oortmerssen2018-07-275-252/+475
| | | | | | | | | | | | | | | | | | | | | | | Summary: Moved Explicit Locals pass to last. Made that pass obligatory. Made it convert from register to stack based instructions, and removed the registers. Fixes to related code that was expecting register based instructions. Added the correct testing flag to all tests, depending on what the format they were expecting so far. Translated one test to stack format as example: reg-stackify-stack.ll tested: llvm-lit -v `find test -name WebAssembly` unittests/MC/* Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D49160 llvm-svn: 338164
* Recommit "Enable MachineOutliner by default under -Oz for AArch64"Jessica Paquette2018-07-273-0/+9
| | | | | | | | | | | | | | | | | | Fixed the ASAN failure from before in r338148, so recommiting. This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338160
* [MachineOutliner] Exit getOutliningCandidateInfo when we erase all candidatesJessica Paquette2018-07-272-1/+11
| | | | | | | | | | There was a missing check for if a candidate list was entirely deleted. This adds that check. This fixes an asan failure caused by running test/CodeGen/AArch64/addsub_ext.ll with the MachineOutliner enabled. llvm-svn: 338148
* [ARM] Add new target feature to fuse literal generationEvandro Menezes2018-07-273-19/+55
| | | | | | | | | | This feature enables the fusion of such operations on Cortex A57 and Cortex A72, as recommended in their Software Optimisation Guides, sections 4.14 and 4.11, respectively. Differential revision: https://reviews.llvm.org/D49563 llvm-svn: 338147
* [demangler] Support for reference collapsingErik Pilkington2018-07-271-46/+56
| | | | | | llvm.org/PR38323 llvm-svn: 338138
* Revert "Enable MachineOutliner by default under -Oz for AArch64"Jessica Paquette2018-07-273-9/+0
| | | | | | | | | | It failed an Asan test on a bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio Fixing that before recommitting. llvm-svn: 338136
* bpf: add missing RegState to notify MachineInstr verifier necessary register ↵Yonghong Song2018-07-271-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | usage Errors like the following are reported by: https://urldefense.proofpoint.com/v2/url?u=http-3A__lab.llvm.org-3A8011_builders_llvm-2Dclang-2Dx86-5F64-2Dexpensive-2Dchecks-2Dwin_builds_11261&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=929oWPCf7Bf2qQnir4GBtowB8ZAlIRWsAdTfRkDaK-g&s=9k-wbEUVpUm474hhzsmAO29VXVvbxJPWD9RTgCD71fQ&e= *** Bad machine code: Explicit definition marked as use *** - function: cal_align1 - basic block: %bb.0 entry (0x47edd98) - instruction: LDB $r3, $r2, 0 - operand 0: $r3 This is because RegState info was missing for ScratchReg inside expandMEMCPY. This caused incomplete register usage information to MachineInstr verifier which then would complain as there could be potential code-gen issue if the complained MachineInstr is used in place where register usage information matters even though the memcpy expanding is not in such case as it happens at the last stage of IR optimization pipeline. We should always specify those register usage information which compiler couldn't deduct automatically whenever we add a hardware register manually. Reported-by: Builder llvm-clang-x86_64-expensive-checks-win Build #11261 Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 338134
* Enable MachineOutliner by default under -Oz for AArch64Jessica Paquette2018-07-273-0/+9
| | | | | | | | | | | | | | | | This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338133
* [DAGCombiner] fold 'not' with signbit mathSanjay Patel2018-07-271-0/+45
| | | | | | | | | | | | | | | | | | | This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132
* [Support] Use unsigned char for xxHash 64-bitFangrui Song2018-07-271-3/+3
| | | | | | Before, the last 3 bytes were char-signedness dependent. llvm-svn: 338128
OpenPOWER on IntegriCloud