path: root/llvm/test/CodeGen/Thumb2
Commit message | Author | Age | Files | Lines
...
* temporarily revert r112664, it is causing a decoding conflict, and | Chris Lattner | 2010-09-01 | 1 | -13/+0
  the testcases should be merged.
  llvm-svn: 112711
* We have a chance for an optimization. Consider this code: | Bill Wendling | 2010-08-31 | 1 | -0/+13
    int x(int t) {
      if (t & 256)
        return -26;
      return 0;
    }

  We generate this:

    tst.w   r0, #256
    mvn     r0, #25
    it      eq
    moveq   r0, #0

  while gcc generates this:

    ands    r0, r0, #256
    it      ne
    mvnne   r0, #25
    bx      lr

  Scandalous really!

  During ISel time, we can look for this particular pattern. One where we have a "MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND instruction to 0. Something like this (greatly simplified):

    %r0 = ISD::AND ...
    ARMISD::CMPZ %r0, 0         @ sets [CPSR]
    %r0 = ARMISD::MOVCC 0, -26  @ reads [CPSR]

  All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR] when it's zero. The zero value will already be in the %r0 register and we only need to change it if the AND wasn't zero. Easy!
  llvm-svn: 112664
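  A minimal sketch, not taken from the patch, that models the two instruction sequences quoted in r112664 with a toy Z flag and checks that both agree with the C function. The function names are hypothetical, and only the flag behavior needed for this example is modeled.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of the C function from the commit message.
int32_t x_reference(int32_t t) { return (t & 256) ? -26 : 0; }

// Sequence generated before the patch:
//   tst.w r0, #256 ; mvn r0, #25 ; it eq ; moveq r0, #0
int32_t x_tst_sequence(int32_t r0) {
  bool z = (r0 & 256) == 0;  // tst.w sets Z when the AND result is zero
  r0 = ~25;                  // mvn r0, #25 writes -26
  if (z) r0 = 0;             // it eq ; moveq r0, #0
  return r0;
}

// Sequence gcc generates:
//   ands r0, r0, #256 ; it ne ; mvnne r0, #25
int32_t x_ands_sequence(int32_t r0) {
  r0 &= 256;                 // ands writes the result and sets Z from it
  bool z = (r0 == 0);
  if (!z) r0 = ~25;          // it ne ; mvnne r0, #25
  return r0;                 // when Z is set, r0 already holds 0
}

// Hypothetical self-check: both sequences must match the reference.
void check_sequences_agree(int32_t t) {
  assert(x_tst_sequence(t) == x_reference(t));
  assert(x_ands_sequence(t) == x_reference(t));
}
```

  The ANDS form saves an instruction because the AND result doubles as the "return 0" value, which is the observation the commit message makes about %r0.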
* Add alignment arguments to all the NEON load/store intrinsics. | Bob Wilson | 2010-08-27 | 4 | -15/+15
  Update all the tests using those intrinsics and add support for auto-upgrading bitcode files with the old versions of the intrinsics.
  llvm-svn: 112271
* ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed | Daniel Dunbar | 2010-08-25 | 1 | -0/+14
  comparison that would overflow.
  - The other under/overflow cases can't actually happen because the immediates which would trigger them are legal (so we don't enter this code), but adjusted the style to make it clear the transform is always valid.
  llvm-svn: 112053
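  The log does not show which adjustment getARMCmp performs, so the following is only an illustrative sketch of the general hazard, assuming the common rewrite of "x > c" into "x >= c + 1". The helper names are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: computes the constant for rewriting the signed
// comparison "x > c" as "x >= c + 1". Returns false when c + 1 would
// overflow, in which case the rewrite must not be applied.
bool adjustSignedGreaterThan(int32_t c, int32_t &adjusted) {
  if (c == std::numeric_limits<int32_t>::max())
    return false;            // c + 1 would wrap around to INT32_MIN
  adjusted = c + 1;
  assert(adjusted > c);      // sanity check: no wraparound happened
  return true;
}

// Evaluates "x > c", using the adjusted form whenever it is valid.
bool signedGreaterThan(int32_t x, int32_t c) {
  int32_t adjusted;
  if (adjustSignedGreaterThan(c, adjusted))
    return x >= adjusted;    // equivalent to x > c
  return x > c;              // fall back to the original comparison
}
```

  Without the guard, c = INT32_MAX would turn an always-false comparison into "x >= INT32_MIN", which is always true.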
* Change ARM PKHTB and PKHBT instructions to use a shift_imm operand to avoid | Bob Wilson | 2010-08-17 | 1 | -1/+1
  printing "lsl #0". This fixes the remaining parts of pr7792. Make corresponding changes for encoding/decoding these instructions.
  llvm-svn: 111251
* Generalize a pattern for PKHTB: an SRL of 16-31 bits will guarantee | Bob Wilson | 2010-08-16 | 1 | -0/+9
  that the high halfword is zero. The shift need not be exactly 16 bits.
  llvm-svn: 111196
* Convert test to FileCheck. | Bob Wilson | 2010-08-16 | 1 | -4/+19
  llvm-svn: 111195
* Temporarily disable tail calls on ARM to work around some linker problems. | Bob Wilson | 2010-08-13 | 2 | -0/+2
  llvm-svn: 111050
* fix silly typo | Jim Grosbach | 2010-08-11 | 1 | -2/+2
  llvm-svn: 110831
* Add a target triple, as the runtime library invocation varies a bit by | Jim Grosbach | 2010-08-11 | 1 | -3/+3
  platform. It's apparently "bl __muldf3" on linux, for example. Since that's not what we're checking here, it's more robust to just force a triple. We just want to check that the inline FP instructions are only generated on cpus that have them.
  llvm-svn: 110830
* Temporarily disable some failing tests, until they can be | Dan Gohman | 2010-08-11 | 1 | -2/+2
  properly investigated.
  llvm-svn: 110825
* cortex m4 has floating point support, but only single precision. | Jim Grosbach | 2010-08-11 | 1 | -0/+24
  llvm-svn: 110810
* Report error if codegen tries to instantiate an ARM target when the cpu does | Evan Cheng | 2010-08-11 | 1 | -1/+1
  not support it, e.g. cortex-m* processors.
  llvm-svn: 110798
* - Add subtarget feature -mattr=+db which determines whether an ARM cpu has the | Evan Cheng | 2010-08-11 | 1 | -0/+17
  memory and synchronization barrier dmb and dsb instructions.
  - Change instruction names to something more sensible (matching name of actual instructions).
  - Added tests for memory barrier codegen.
  llvm-svn: 110785
* Re-apply r110655 with fixes. Epilogue must restore sp from fp if the | Evan Cheng | 2010-08-10 | 1 | -0/+53
  function stack frame has a var-sized object. Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions.
  llvm-svn: 110707
* Many Thumb2 instructions can reference the full ARM register set (i.e., | Jim Grosbach | 2010-07-30 | 2 | -7/+26
  have 4 bits per register in the operand encoding), but have undefined behavior when the operand value is 13 or 15 (SP and PC, respectively). The trivial coalescer in linear scan sometimes will merge a copy from SP into a subsequent instruction which uses the copy, and if that instruction cannot legally reference SP, we get bad code such as:

    mls r0,r9,r0,sp

  instead of:

    mov r2, sp
    mls r0, r9, r0, r2

  This patch adds a new register class for use by Thumb2 that excludes the problematic registers (SP and PC) and is used instead of GPR for those operands which cannot legally reference PC or SP. The trivial coalescer explicitly requires that the register class of the destination for the COPY instruction contain the source register for the COPY to be considered for coalescing. This prevents errant instructions like that above.

  PR7499
  llvm-svn: 109842
* Implement vector constants which are splat of | Dale Johannesen | 2010-07-29 | 1 | -0/+38
  integers with mov + vdup. 8003375. This is currently disabled by default because LICM will not hoist a VDUP, so it pessimizes the code if the construct occurs inside a loop (8248029).
  llvm-svn: 109799
* update tests for smarter BIC usage | Jim Grosbach | 2010-07-20 | 3 | -6/+4
  llvm-svn: 108846
* Add combiner patterns to more effectively utilize the BFI (bitfield insert) | Jim Grosbach | 2010-07-17 | 1 | -0/+23
  instruction for non-constant operands. This includes the case referenced in the README.txt regarding a bitfield copy.
  llvm-svn: 108608
* Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction | Jim Grosbach | 2010-07-16 | 1 | -0/+17
  and a combine pattern to use it for setting a bit-field to a constant value. More to come for non-constant stores.
  llvm-svn: 108570
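  For readers unfamiliar with the instruction, here is a small sketch of what BFI computes: it copies the low `width` bits of a source register into the destination at bit position `lsb` and leaves the other destination bits untouched. The function name and signature are hypothetical; this models the instruction's result only and is not the combine pattern from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models "BFI Rd, Rn, #lsb, #width": replace width bits of rd, starting
// at bit lsb, with the low width bits of rn.
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb < 32 && lsb + width <= 32);
  // Build a mask of `width` ones; width == 32 is handled separately
  // because shifting a 32-bit value by 32 is undefined in C++.
  uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
  return (rd & ~(mask << lsb)) | ((rn & mask) << lsb);
}
```

  In source terms this is the familiar `(x & ~field_mask) | (value << lsb)` idiom, which is the shape of expression a bit-field store produces.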
* Improve 64-bit subtraction of immediates when parts of the immediate can fit | Jim Grosbach | 2010-07-14 | 2 | -2/+103
  in the literal field of an instruction. E.g.,

    long long foo(long long a) {
      return a - 734439407618LL;
    }

  rdar://7038284
  llvm-svn: 108339
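  A sketch of why this constant is a good candidate, assuming the usual subtract / subtract-with-borrow split of a 64-bit operation into two 32-bit halves. The log does not show the instructions the patch emits, so the SUBS/SBC comments below are an assumption, and the function names are hypothetical. The constant 734439407618 equals 171 * 2^32 + 2, so each half is a small value.

```cpp
#include <cassert>
#include <cstdint>

// Subtract a 64-bit immediate using two 32-bit operations with a borrow,
// the way a 32-bit target would (assumed: SUBS on the low words, then
// SBC on the high words).
uint64_t sub64_via_halves(uint64_t a, uint64_t imm) {
  uint32_t a_lo = static_cast<uint32_t>(a);
  uint32_t a_hi = static_cast<uint32_t>(a >> 32);
  uint32_t i_lo = static_cast<uint32_t>(imm);
  uint32_t i_hi = static_cast<uint32_t>(imm >> 32);

  uint32_t lo = a_lo - i_lo;                   // low-word subtract
  uint32_t borrow = (a_lo < i_lo) ? 1u : 0u;   // borrow out of the low word
  uint32_t hi = a_hi - i_hi - borrow;          // high-word subtract with borrow
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Hypothetical self-check: the two halves of the commit's constant.
void check_commit_constant_halves() {
  const uint64_t imm = 734439407618ULL;
  assert(static_cast<uint32_t>(imm) == 2u);          // low word
  assert(static_cast<uint32_t>(imm >> 32) == 171u);  // high word
}
```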
* Fix test to appease the buildbots. | Bob Wilson | 2010-07-14 | 1 | -1/+1
  llvm-svn: 108334
* Print "dregpair" NEON operands with a space between them, for readability and | Bob Wilson | 2010-07-09 | 1 | -1/+1
  consistency with other instructions that have lists of register operands.
  llvm-svn: 107944
* Changes to ARM tail calls, mostly cosmetic. | Dale Johannesen | 2010-07-08 | 1 | -0/+10
  Add explicit testcases for tail calls within the same module. Duplicate some code to humor those who think .w doesn't apply on ARM. Leave this disabled on Thumb1, and add some comments explaining why it's hard and won't gain much.
  llvm-svn: 107851
* PR7503: uxtb16 is not available for ARMv7-M. Patch by Brian G. Lucas. | Evan Cheng | 2010-06-29 | 1 | -25/+68
  llvm-svn: 107122
* Reapply my if-conversion cleanup from svn r106939 with fixes. | Bob Wilson | 2010-06-29 | 1 | -1/+1
  There are 2 changes relative to the previous version of the patch:

  1) For the "simple" if-conversion case, there's no need to worry about RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators are ignored in this context, so RemoveExtraEdges does the right thing. This might break someday if we ever treat indirect branches (BRIND) as predicable, but for now, I just removed this part of the patch, because in the case where we do not add an unconditional branch, we rely on keeping the fall-through edge to CvtBBI (which is empty after this transformation).

  The change relative to the previous patch is:

  @@ -1036,10 +1036,6 @@
       IterIfcvt = false;
     }

  -  // RemoveExtraEdges won't work if the block has an unanalyzable branch,
  -  // which is typically the case for IfConvertSimple, so explicitly remove
  -  // CvtBBI as a successor.
  -  BBI.BB->removeSuccessor(CvtBBI->BB);
     RemoveExtraEdges(BBI);

     // Update block info. BB can be iteratively if-converted.

  2) My patch exposed a bug in the code for merging the tail of a "diamond", which had previously never been exercised. The code was simply checking that the tail had a single predecessor, but there was a case in MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was neither edge of the diamond. I added the following change to check for that:

  @@ -1276,7 +1276,18 @@
     // tail, add a unconditional branch to it.
     if (TailBB) {
       BBInfo TailBBI = BBAnalysis[TailBB->getNumber()];
  -    if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) {
  +    bool CanMergeTail = !TailBBI.HasFallThrough;
  +    // There may still be a fall-through edge from BBI1 or BBI2 to TailBB;
  +    // check if there are any other predecessors besides those.
  +    unsigned NumPreds = TailBB->pred_size();
  +    if (NumPreds > 1)
  +      CanMergeTail = false;
  +    else if (NumPreds == 1 && CanMergeTail) {
  +      MachineBasicBlock::pred_iterator PI = TailBB->pred_begin();
  +      if (*PI != BBI1->BB && *PI != BBI2->BB)
  +        CanMergeTail = false;
  +    }
  +    if (CanMergeTail) {
         MergeBlocks(BBI, TailBBI);
         TailBBI.IsDone = true;
       } else {

  With these fixes, I was able to run all the SingleSource and MultiSource tests successfully.
  llvm-svn: 107110
* Revert my if-conversion cleanup since it caused a bunch of nightly test | Bob Wilson | 2010-06-26 | 1 | -1/+1
  regressions.

  --- Reverse-merging r106939 into '.':
  U    test/CodeGen/Thumb2/thumb2-ifcvt3.ll
  U    lib/CodeGen/IfConversion.cpp
  llvm-svn: 106951
* Remove bogus test. | Eli Friedman | 2010-06-26 | 1 | -22/+0
  llvm-svn: 106941
* Clean up some problems with extra CFG edges being introduced during | Bob Wilson | 2010-06-26 | 1 | -1/+1
  if-conversion. The RemoveExtraEdges function doesn't work for blocks that end with unanalyzable branches, so in those cases, the "extra" edges must be explicitly removed. The CopyAndPredicateBlock and MergeBlocks methods can also avoid copying successor edges due to branches that have already been removed. The latter case is especially helpful when MergeBlocks is called for handling "diamond" if-conversions, where otherwise you can end up with some weird intermediate states in the CFG.

  Unfortunately I've been unable to find cases where this cleanup actually makes a significant difference in the code. There is one test where we manage to remove an empty block at the end of a function.

  Radar 6911268.
  llvm-svn: 106939
* PR7458: Try commuting Thumb2 instruction operands to put them into 2-address | Bob Wilson | 2010-06-24 | 2 | -2/+9
  form so they can be narrowed to 16-bit instructions.
  llvm-svn: 106762
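  The idea can be sketched with a toy three-address instruction: a 2-address encoding requires the destination to be the same register as the first source, and for a commutative operation the two sources can be swapped to make that true. The struct and function below are hypothetical and ignore the other conditions real narrowing depends on (register numbers, flag behavior, immediates).

```cpp
#include <cassert>
#include <utility>

// Toy model of "dst = lhs OP rhs" with register numbers as ints.
struct ThreeAddr {
  int dst;
  int lhs;
  int rhs;
  bool commutable;  // true for operations such as add, and, orr, eor
};

// Returns true if the instruction is in 2-address form (dst == lhs),
// commuting the source operands first when that helps.
bool toTwoAddressForm(ThreeAddr &inst) {
  if (inst.dst == inst.lhs)
    return true;                       // already "dst = dst OP rhs"
  if (inst.commutable && inst.dst == inst.rhs) {
    std::swap(inst.lhs, inst.rhs);     // "r0 = r1 OP r0" -> "r0 = r0 OP r1"
    assert(inst.dst == inst.lhs);
    return true;
  }
  return false;                        // cannot be put into 2-address form
}
```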
* Eliminate the first half of the optimization which eliminates BRCOND | Dan Gohman | 2010-06-24 | 1 | -4/+1
  when the condition is constant. This optimization shouldn't be necessary, because codegen shouldn't be able to find dead control paths that the IR-level optimizer can't find. And it's undesirable, because it encourages bugpoint to leave "br i1 false" branches in its output. And it wasn't updating the CFG.

  I updated all the tests I could, but some tests are too reduced and I wasn't able to meaningfully preserve them.
  llvm-svn: 106748
* Tail merging pass shall not break up IT blocks. rdar://8115404 | Evan Cheng | 2010-06-22 | 1 | -0/+127
  llvm-svn: 106517
* Fix a crash caused by dereference of MBB.end(). rdar://8110842 | Evan Cheng | 2010-06-20 | 1 | -0/+35
  llvm-svn: 106399
* Disable sibcall optimization for Thumb1 for now since | Evan Cheng | 2010-06-19 | 1 | -1/+1
  Thumb1RegisterInfo::emitEpilogue is not expecting them.
  llvm-svn: 106368
* Move ARM if-conversion before post-ra scheduling. | Evan Cheng | 2010-06-18 | 1 | -1/+1
  llvm-svn: 106355
* Allow ARM if-converter to be run after post allocation scheduling. | Evan Cheng | 2010-06-18 | 2 | -1/+2
  - This fixed a number of bugs in if-converter, tail merging, and post-allocation scheduler. If-converter now runs branch folding / tail merging first to maximize if-conversion opportunities.
  - Also changed the t2IT instruction slightly. It now defines the ITSTATE register which is read by instructions in the IT block.
  - Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't change the instruction ordering in the IT block (since IT mask has been finalized). It also ensures no other instructions can be scheduled between instructions in the IT block.

  This is not yet enabled.
  llvm-svn: 106344
* TwoAddressInstructionPass::CoalesceExtSubRegs can insert INSERT_SUBREG | Jakob Stoklund Olesen | 2010-06-18 | 1 | -0/+28
  instructions, but it doesn't really understand live ranges, so the first INSERT_SUBREG uses an implicitly defined register. Fix it in LiveVariableAnalysis by adding the <undef> flag.
  llvm-svn: 106333
* Fix an inverted condition. | Evan Cheng | 2010-06-18 | 2 | -3/+1
  llvm-svn: 106330
* Enable tail calls on ARM by default, with some | Dale Johannesen | 2010-06-18 | 2 | -0/+113
  basic tests. This has been well tested on Darwin but not elsewhere. It should work provided the linker correctly resolves

    B.W <label in other function>

  which it has not seen before, at least from llvm-based compilers. I'm leaving the arm-tail-calls switch in until I see if there are any problems because of that; it might need to be disabled for some environments.
  llvm-svn: 106299
* Remove arm_apcscc from the test files. It is the default and doing this | Rafael Espindola | 2010-06-17 | 41 | -164/+164
  matches what llvm-gcc and clang now produce.
  llvm-svn: 106221
* Allow a register to be redefined multiple times in a basic block. | Jakob Stoklund Olesen | 2010-06-16 | 1 | -0/+21
  LiveVariableAnalysis was a bit picky about a register only being redefined once, but that really isn't necessary.

  Here is an example of chained INSERT_SUBREGs that we can handle now:

  68 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1028<kill>, 14
       register: %reg1040 +[70,134:0)
  76 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1029<kill>, 13
       register: %reg1040 replace range with [70,78:1) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,134:0) 0@78-(134) 1@70-(78)
  84 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1030<kill>, 12
       register: %reg1040 replace range with [78,86:2) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,134:0) 0@86-(134) 1@70-(78) 2@78-(86)
  92 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1031<kill>, 11
       register: %reg1040 replace range with [86,94:3) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,94:3)[94,134:0) 0@94-(134) 1@70-(78) 2@78-(86) 3@86-(94)

  rdar://problem/8096390
  llvm-svn: 106152
* Make post-ra scheduling, anti-dep breaking, and register scavenger | Evan Cheng | 2010-06-16 | 2 | -1/+3
  (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler.
  llvm-svn: 106091
* Remove the arm_aapcscc marker from the tests. It is the default | Rafael Espindola | 2010-06-15 | 3 | -9/+9
  for the linux targets.
  llvm-svn: 106029
* Add CoalescerPair helper class. | Jakob Stoklund Olesen | 2010-06-15 | 1 | -0/+41
  Given a copy instruction, CoalescerPair can determine which registers to coalesce in order to eliminate the copy. It deals with all the subreg fun to determine a tuple (DstReg, SrcReg, SubIdx) such that:

  - SrcReg is a virtual register that will disappear after coalescing.
  - DstReg is a virtual or physical register whose live range will be extended.
  - SubIdx is 0 when DstReg is a physical register.
  - SrcReg can be joined with DstReg:SubIdx.

  CoalescerPair::isCoalescable() determines if another copy instruction is compatible with the same tuple. This fixes some NEON miscompilations where shuffles are getting coalesced as if they were copies.

  The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy later.
  llvm-svn: 105997
* More tail call removal. | Dale Johannesen | 2010-06-04 | 1 | -1/+1
  llvm-svn: 105485
* Remove tail call. A tail call version will follow. | Dale Johannesen | 2010-06-04 | 1 | -1/+1
  llvm-svn: 105438
* Remove tail call to preserve this test. A tail | Dale Johannesen | 2010-06-03 | 1 | -1/+1
  call version will follow.
  llvm-svn: 105422
* Make this test not use tail calls. A tail call | Dale Johannesen | 2010-06-03 | 1 | -3/+3
  version will follow.
  llvm-svn: 105419
* Thumb2 RSBS instructions were being printed without the 'S' suffix. | Bob Wilson | 2010-05-24 | 1 | -0/+9
  Fix it by changing the T2I_rbin_s_is multiclass to handle the CPSR output and 'S' suffix in the same way as T2I_bin_s_irs.
  llvm-svn: 104531
* Recognize more BUILD_VECTORs and VECTOR_SHUFFLEs that can be implemented by | Bob Wilson | 2010-05-22 | 1 | -1/+2
  copying VFP subregs. This exposed a bunch of dead code in the *spill-q.ll tests, so I tweaked those tests to keep that code from being optimized away. Radar 7872877.
  llvm-svn: 104415