summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert 374373: [Codegen] Alter the default promotion for saturating adds and ↵David Green2019-10-114-215/+397
| | | | | | | | | subs This commit is not extending the promoted integers as it should. Reverting whilst I look into the details. llvm-svn: 374592
* [TableGen] Fix a bug that MCSchedClassDesc is interfered between different ↵QingShan Zhang2019-10-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SchedModel Assume that, ModelA has scheduling resource for InstA and ModelB has scheduling resource for InstB. This is what the llvm::MCSchedClassDesc looks like: llvm::MCSchedClassDesc ModelASchedClasses[] = { ... InstA, 0, ... InstB, -1,... }; llvm::MCSchedClassDesc ModelBSchedClasses[] = { ... InstA, -1,... InstB, 0,... }; The -1 means invalid num of macro ops, while it is valid if it is >=0. This is what we look like now: llvm::MCSchedClassDesc ModelASchedClasses[] = { ... InstA, 0, ... InstB, 0,... }; llvm::MCSchedClassDesc ModelBSchedClasses[] = { ... InstA, 0,... InstB, 0,... }; And compiler hit the assertion here because the SCDesc is valid now for both InstA and InstB. Differential Revision: https://reviews.llvm.org/D67950 llvm-svn: 374524
* [Codegen] Alter the default promotion for saturating adds and subsDavid Green2019-10-104-397/+215
| | | | | | | | | | | | | | | | | The default promotion for the add_sat/sub_sat nodes currently does: 1. ANY_EXTEND iN to iM 2. SHL by M-N 3. [US][ADD|SUB]SAT 4. L/ASHR by M-N If the promoted add_sat or sub_sat node is not legal, this can produce code that effectively does a lot of shifting (and requiring large constants to be materialised) just to use the overflow flag. It is simpler to just do the saturation manually, using the higher bitwidth addition and a min/max against the saturating bounds. That is what this patch attempts to do. Differential Revision: https://reviews.llvm.org/D68643 llvm-svn: 374373
* [IfCvt][ARM] Optimise diamond if-conversion for code sizeOliver Stannard2019-10-101-0/+559
| | | | | | | | | | | | | | | | | Currently, the heuristics the if-conversion pass uses for diamond if-conversion are based on execution time, with no consideration for code size. This adds a new set of heuristics to be used when optimising for code size. This is mostly target-independent, because the if-conversion pass can see the code size of the instructions which it is removing. For thumb, there are a few passes (insertion of IT instructions, selection of narrow branches, and selection of CBZ instructions) which are run after if conversion and affect these heuristics, so I've added target hooks to better predict the code-size effect of a proposed if-conversion. Differential revision: https://reviews.llvm.org/D67350 llvm-svn: 374301
* Add and adjust saturating tests. NFCDavid Green2019-10-094-0/+1418
| | | | | | | This adds some extra testing to the existing [su][add/sub]_sat X86 and AArch64 tests and adds equivalent tests for ARM. llvm-svn: 374169
* (Re)generate various tests. NFCAmaury Sechet2019-10-081-19/+38
| | | | llvm-svn: 374074
* [DebugInfo][If-Converter] Update call site info during the optimizationNikola Prica2019-10-081-0/+7
| | | | | | | | | | | | | | During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 llvm-svn: 374068
* [ARM] Generate vcmp instead of vcmpeKristof Beyls2019-10-0814-156/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp instead of vcmpe instructions by default, i.e. not be producing an Invalid Operation exception when either arguments in a floating point compare are quiet NaNs. In the future, after constrained floating point intrinsics for floating point compare have been introduced, vcmpe instructions probably should be produced for those intrinsics - depending on the exact semantics they'll be defined to have. This patch logically consists of the following parts: - Revert http://llvm.org/viewvc/llvm-project?rev=294945&view=rev and http://llvm.org/viewvc/llvm-project?rev=294968&view=rev, which implemented fine-tuning for when to produce vcmpe (i.e. not do it for equality comparisons). The complexity introduced by those patches isn't needed anymore if we just always produce vcmp instead. Maybe these patches need to be reintroduced again once support is needed to map potential LLVM-IR constrained floating point compare intrinsics to the ARM instruction set. - Simply select vcmp, instead of vcmpe, see simple changes in lib/Target/ARM/ARMInstrVFP.td - Adapt lots of tests that tested for vcmpe (instead of vcmp). For all of these test, the intent of what is tested for isn't related to whether the vcmp should produce an Invalid Operation exception or not. Fixes PR43374. Differential Revision: https://reviews.llvm.org/D68463 llvm-svn: 374025
* ARM-Darwin: keep the frame register reserved even if not updated.Tim Northover2019-10-041-0/+15
| | | | | | | | Darwin platforms need the frame register to always point at a valid record even if it's not updated in a leaf function. Backtraces are more important than one extra GPR. llvm-svn: 373738
* Add a missing pass in ARM O3 pipelineJakub Kuderski2019-10-011-0/+1
| | | | llvm-svn: 373382
* [Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in ↵Jakub Kuderski2019-10-011-0/+1
| | | | | | MachineLICM llvm-svn: 373378
* [TargetLowering] Simplify expansion of S{ADD,SUB}ORoger Ferrer Ibanez2019-09-301-117/+76
| | | | | | | | | | ISD::SADDO uses the suggested sequence described in the section §2.4 of the RISCV Spec v2.2. ISD::SSUBO uses the dual approach but checking for (non-zero) positive. Differential Revision: https://reviews.llvm.org/D47927 llvm-svn: 373187
* [ARM][CGP] Allow signext argumentsSam Parker2019-09-303-10/+60
| | | | | | | | | | | | As we perform a zext on any arguments used in the promoted tree, it doesn't matter if they're marked as signext. The only permitted user(s) in the tree which would interpret the sign bits are signed icmps. For these instructions, their promoted operands are truncated before the icmp uses them. Differential Revision: https://reviews.llvm.org/D68019 llvm-svn: 373186
* [test] Change llvm-readobj --arm-attributes to --arch-specific after r373125Fangrui Song2019-09-305-15/+15
| | | | llvm-svn: 373179
* [ARM] Cortex-M4 schedule additionsDavid Green2019-09-293-30/+29
| | | | | | | | | | | | | | | | | | | This is an attempt to fill in some of the missing instructions from the Cortex-M4 schedule, and make it easier to do the same for other ARM cpus. - Some instructions are marked as hasNoSchedulingInfo as they are pseudos or otherwise do not require scheduling info - A lot of features have been marked not supported - Some WriteRes's have been added for cvt instructions. - Some extra instruction latencies have been added, notably by relaxing the regex for dsp instruction to catch more cases, and some fp instructions. This goes a long way to get the CompleteModel working for this CPU. It does not go far enough as to get all scheduling info for all output operands correct. Differential Revision: https://reviews.llvm.org/D67957 llvm-svn: 373163
* [IfConversion] Disallow TBB == FBB for valid trianglesMikael Holmen2019-09-261-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously the case EBB | \_ | | | TBB | / FBB was treated as a valid triangle also when TBB and FBB was the same basic block. This could then lead to an invalid CFG when we removed the edge from EBB to TBB, since that meant we would also remove the edge from EBB to FBB. Since TBB == FBB is quite a degenerated case of a triangle, we now don't treat it as a valid triangle anymore, and thus we will avoid the trouble with updating the CFG. Reviewers: efriedma, dmgreen, kparzysz Reviewed By: efriedma Subscribers: bjope, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67832 llvm-svn: 372943
* [BreakFalseDeps] ignore function with minsize attributeSanjay Patel2019-09-231-1/+5
| | | | | | | | | | | | This came up in the x86-specific: https://bugs.llvm.org/show_bug.cgi?id=43239 ...but it is a general problem for the BreakFalseDeps pass. Dependencies may be broken by adding some other instruction, so that should be avoided if the overall goal is to minimize size. Differential Revision: https://reviews.llvm.org/D67363 llvm-svn: 372628
* Remove the obsolete BlockByRefStruct flag from LLVM IRAdrian Prantl2019-09-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | DIFlagBlockByRefStruct is an unused DIFlag that originally was used by clang to express (Objective-)C block captures in debug info. For the last year Clang has been emitting complex DIExpressions to describe block captures instead, which makes all the code supporting this flag redundant. This patch removes the flag and all supporting "dead" code, so we can reuse the bit for something else in the future. Since this only affects debug info generated by Clang with the block extension this mostly affects Apple platforms and I don't have any bitcode compatibility concerns for removing this. The Verifier will reject debug info that uses the bit and thus degrade gracefully when LTO'ing older bitcode with a newer compiler. rdar://problem/44304813 Differential Revision: https://reviews.llvm.org/D67453 llvm-svn: 372272
* [DAGCombine][ARM][X86] (sub Carry, X) -> (addcarry (sub 0, X), 0, Carry) foldRoman Lebedev2019-09-181-12/+14
| | | | | | | | | | | | | | | | | | | Summary: `DAGCombiner::visitADDLikeCommutative()` already has a sibling fold: `(add X, Carry) -> (addcarry X, 0, Carry)` This fold, as suggested by @efriedma, helps recover from //some// of the regressions of D62266 Reviewers: efriedma, deadalnix Subscribers: javed.absar, kristof.beyls, llvm-commits, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D62392 llvm-svn: 372259
* [ARM] VFPv2 only supports 16 D registers.Eli Friedman2019-09-174-14/+14
| | | | | | | | | | | | | | | | | | | | r361845 changed the way we handle "D16" vs. "D32" targets; there used to be a negative "d16" which removed instructions from the instruction set, and now there's a "d32" feature which adds instructions to the instruction set. This is good, but there was an oversight in the implementation: the behavior of VFPv2 was changed. In particular, the "vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only supports 16 D registers. In practice, this means if you specify -mfpu=vfpv2, the compiler will generate illegal instructions. This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and "vfp2sp" so they don't imply "d32". Differential Revision: https://reviews.llvm.org/D67375 llvm-svn: 372186
* [ARM] Fixup pipeline test. NFCDavid Green2019-09-171-0/+1
| | | | llvm-svn: 372133
* [ARM] Fix for buildbotsSam Parker2019-09-171-1/+1
| | | | | | | | Remove setPreservesCFG from ARMConstantIslandPass and add a couple of -verify-machine-dom-info instances into the existing codegen tests. llvm-svn: 372126
* [ARM] LE support in ConstantIslandsSam Parker2019-09-171-1/+1
| | | | | | | | | | | | | The low-overhead branch extension provides a loop-end 'LE' instruction that performs no decrement nor compare, it just jumps backwards. This patch modifies the constant islands pass to try to insert LE instructions in place of a Thumb2 conditional branch, instead of shrinking it. This only happens if a cmp can be converted to a cbn/z and used to exit the loop. Differential Revision: https://reviews.llvm.org/D67404 llvm-svn: 372085
* [ARM][Codegen] Autogenerate arm-cgp-casts.ll test.Roman Lebedev2019-09-161-115/+1671
| | | | | | Apparently it got broken by r372009 while i thought it was r372012. llvm-svn: 372019
* [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes ↵Guillaume Chatelet2019-09-1122-26/+26
| | | | | | | | | | | | | | | | | | | | | | mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference, This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608
* [NFC][ARM] Add and modify testsSam Parker2019-09-112-2/+134
| | | | | | Add test for ParallelDSP. llvm-svn: 371594
* [ARM] add test for BreakFalseDeps with minsize attribute; NFCSanjay Patel2019-09-101-0/+32
| | | | llvm-svn: 371526
* [ARM] auto-generate complete test checks; NFCSanjay Patel2019-09-101-9/+28
| | | | llvm-svn: 371524
* Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of ↵Dmitri Gribenko2019-09-101-0/+4
| | | | | | | | | CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507
* Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-09-101-4/+0
| | | | | | | | opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502
* [IfConversion] Correctly handle cases where analyzeBranch fails.Eli Friedman2019-09-091-3/+2
| | | | | | | | | | | | | | | | If analyzeBranch fails, on some targets, the out parameters point to some blocks in the function. But we can't use that information, so make sure to clear it out. (In some places in IfConversion, we assume that any block with a TrueBB is analyzable.) The change to the testcase makes it trigger a bug on builds without this fix: IfConvertDiamond tries to perform a followup "merge" operation, which isn't legal, and we somehow end up with a branch to a deleted MBB. I'm not sure how this doesn't crash the compiler. Differential Revision: https://reviews.llvm.org/D67306 llvm-svn: 371434
* [ARM][ParallelDSP] Fix for sext inputSam Parker2019-09-092-1/+187
| | | | | | | | | | The incoming accumulator value can be discovered through a sext, in which case there will be a mismatch between the input and the result. So sign extend the accumulator input if we're performing a 64-bit mac. Differential Revision: https://reviews.llvm.org/D67220 llvm-svn: 371370
* [DAGCombiner][X86][ARM] Teach visitMULO to fold multiplies with 0 to 0 and ↵Craig Topper2019-09-081-2/+2
| | | | | | | | | no carry. I modified the ARM test to use two inputs instead of 0 so the test hopefully still tests what was intended. llvm-svn: 371344
* [ARM] Fix for buildbotSam Parker2019-09-061-0/+3
| | | | llvm-svn: 371187
* [IfConversion] Fix diamond conversion with unanalyzable branches.Eli Friedman2019-09-051-0/+58
| | | | | | | | | | | | | | | | | | The code was incorrectly counting the number of identical instructions, and therefore tried to predicate an instruction which should not have been predicated. This could have various effects: a compiler crash, an assembler failure, a miscompile, or just generating an extra, unnecessary instruction. Instead of depending on TargetInstrInfo::removeBranch, which only works on analyzable branches, just remove all branch instructions. Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and https://bugs.llvm.org/show_bug.cgi?id=41121 . Differential Revision: https://reviews.llvm.org/D67203 llvm-svn: 371111
* [LLVM][Alignment] Make functions using log of alignment explicitGuillaume Chatelet2019-09-055-12/+12
| | | | | | | | | | | | | | | | | | | | | Summary: This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation, - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation, Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045
* [ARM][ParallelDSP] SExt mul for accumulationSam Parker2019-09-044-0/+375
| | | | | | | | | | For any unpaired muls, we accumulate them as an input to the reduction. Check the type of the mul and perform a sext if the existing accumlator input type is not the same. Differential Revision: https://reviews.llvm.org/D66993 llvm-svn: 370851
* Bug fix on function epilog optimization (ARM backend)Oliver Stannard2019-09-031-0/+13
| | | | | | | | | | | | | | | To save a 'add sp,#val' instruction by adding registers to the final pop instruction, the first register transferred by this pop instruction need to be found. If the function to be optimized has a non-void return value, the operand list contains r0 (implicit) which prevents the optimization to take place. Therefore implicit register references should be skipped in the search loop, because this registers are never popped from the stack. Patch by Rainer Herbertz (rOptimizer)! Differential revision: https://reviews.llvm.org/D66730 llvm-svn: 370728
* [TargetLowering] Fix Bugzilla ID 43183 to avoid soften comparison broken ↵Shiva Chen2019-09-011-0/+46
| | | | | | | | | | with constant inputs Summary: This fixes the bugzilla id 43183 which triggerd by the following commit: [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall llvm-svn: 370604
* Revert [MBP] Disable aggressive loop rotate in plain modeJordan Rupprecht2019-08-296-21/+20
| | | | | | | | This reverts r369664 (git commit 51f48295cbe8fa3a44db263b528dd9f7bae7bf9a) It causes many benchmark regressions, internally and in llvm's benchmark suite. llvm-svn: 370398
* [DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locationsJeremy Morse2019-08-291-3/+0
| | | | | | | | | | | | | | | | | The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union. The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details. Un x-fail a test that was suffering from this from a previous patch. Differential Revision: https://reviews.llvm.org/D66895 llvm-svn: 370334
* [DebugInfo] LiveDebugValues should always revisit backedges if it skips themJeremy Morse2019-08-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | The "join" method in LiveDebugValues does not attempt to join unseen predecessor blocks if their out-locations aren't yet initialized, instead the block should be re-visited later to see if any locations have changed validity. However, because the set of blocks were all being "process"'d once before "join" saw them, that logic in "join" was actually ignoring legitimate out-locations on the first pass through. This meant that some invalidated locations were not removed from the head of loops, allowing illegal locations to persist. Fix this by removing the run of "process" before the main join/process loop in ExtendRanges. Now the unseen predecessors that "join" skips truly are uninitialized, and we come back to the block at a later time to re-run "join", see the @baz function added. This also fixes another fault where stack/register transfers in the entry block (or any other before-any-loop-block) had their tranfers initially ignored, and were then never revisited. The MIR test added tests for this behaviour. XFail a test that exposes another bug; a fix for this is coming in D66895. Differential Revision: https://reviews.llvm.org/D66663 llvm-svn: 370328
* [ARM][ParallelDSP] Change search for mulsSam Parker2019-08-289-9/+733
| | | | | | | | | | | | | | | | | | | | rL369567 reverted a couple of recent changes made to ARMParallelDSP because of a miscompilation error: PR43073. The issue stemmed from an underlying bug that was caused by adding muls into a reduction before it was proved that they could be executed in parallel with another mul. Most of the changes here are from the previously reverted commits. The additional changes have been made area: 1) The Search function now doesn't insert any muls into the Reduction object. That now happens once the search has successfully finished. 2) For any muls added into the reduction but that weren't paired, we accumulate their values as an input into the smlad. Differential Revision: https://reviews.llvm.org/D66660 llvm-svn: 370171
* [NFC][Regalloc] Add testcases for D66576Zi Xuan Wu2019-08-261-0/+137
| | | | llvm-svn: 369877
* [ARM] Automatically generate dsp-mlal.ll . NFCAmaury Sechet2019-08-221-46/+177
| | | | llvm-svn: 369718
* [MBP] Disable aggressive loop rotate in plain modeGuozhi Wei2019-08-226-20/+21
| | | | | | | | | | Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664
* Reapply: [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32Sam Tebbs2019-08-221-2/+63
| | | | | | | | | The CodeGen/Thumb2/mve-vaddv.ll test needed to be amended to reflect the changes from the above patch. This reverts commit cd53ff6, reapplying 7c6b229. llvm-svn: 369638
* Revert r369626 "[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32"Hans Wennborg2019-08-221-63/+2
| | | | | | | | | | | | | | | | | It broke the bots, see e.g. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/36275/ > This patch fixes shifts by a 128/256 bit shift amount. It also fixes > codegen for shifts of 32 by delegating to LLVM's default optimisation > instead of emitting a long shift. > > Tests that used to generate long shifts of 32 are updated to check for the > more optimised codegen. > > Differential revision: https://reviews.llvm.org/D66519 > > llvm-svn: 369626 llvm-svn: 369636
* [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32Sam Tebbs2019-08-221-2/+63
| | | | | | | | | | | | | This patch fixes shifts by a 128/256 bit shift amount. It also fixes codegen for shifts of 32 by delegating to LLVM's default optimisation instead of emitting a long shift. Tests that used to generate long shifts of 32 are updated to check for the more optimised codegen. Differential revision: https://reviews.llvm.org/D66519 llvm-svn: 369626
* Revert r367389 (and follow-up r368404); it caused PR43073.Nico Weber2019-08-215-571/+3
| | | | llvm-svn: 369567
OpenPOWER on IntegriCloud