path: root/llvm/test/CodeGen/AArch64
* Make global aliases have symbol size equal to their type (John Brawn, 2015-07-17; 2 files, -0/+5)
  This is mainly for the benefit of GlobalMerge, so that an alias into a MergedGlobals variable has the same size as the original non-merged variable.
  Differential Revision: http://reviews.llvm.org/D10837
  llvm-svn: 242520
* AArch64: make inexact signalling on round Darwin-specific (Tim Northover, 2015-07-16; 2 files, -75/+146)
  C11 leaves it implementation-defined whether the round-to-integer operations set the inexact flag. Darwin does expect it to be set, but this seems to be against the intent of the IEEE document and is slower to implement anyway. So it should be opt-in.
  llvm-svn: 242446
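  A minimal IR sketch of the kind of intrinsic this concerns (illustrative only, not from the commit; the instruction mapping is an assumption based on the ARM ARM, where only FRINTX raises the inexact exception):
      declare double @llvm.ceil.f64(double)

      define double @do_ceil(double %x) {
        ; ceil maps naturally onto the non-signalling FRINTP. Darwin expects the
        ; inexact flag to be raised as well, so only Darwin opts into the extra
        ; inexact-signalling lowering.
        %r = call double @llvm.ceil.f64(double %x)
        ret double %r
      }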
* AArch64: Implement conditional compare sequence matching. (Matthias Braun, 2015-07-16; 1 file, -0/+96)
  This is a new iteration of the reverted r238793 / http://reviews.llvm.org/D8232, which wrongly assumed that any and/or tree can be represented by a conditional compare sequence; in fact there are some restrictions on that. This version fixes this and adds comments that explain exactly what types of and/or trees can actually be implemented as conditional compare sequences.
  Related to http://llvm.org/PR20927, rdar://18326194
  Differential Revision: http://reviews.llvm.org/D10579
  llvm-svn: 242436
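  An illustrative sketch (not from the patch) of the kind of and-of-compares tree this matches, with a rough idea of the resulting sequence:
      define i32 @and_of_cmps(i32 %a, i32 %b, i32 %x, i32 %y) {
        %c1 = icmp sgt i32 %a, 0
        %c2 = icmp slt i32 %b, 10
        %both = and i1 %c1, %c2
        %r = select i1 %both, i32 %x, i32 %y
        ret i32 %r
      }
      ; roughly:
      ;   cmp  w0, #0
      ;   ccmp w1, #10, #<nzcv>, gt   ; the second compare only happens if the
      ;                               ; first condition (gt) held; otherwise the
      ;                               ; flags are set to #<nzcv>, chosen so the
      ;                               ; final "lt" test fails
      ;   csel w0, w2, w3, lt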
* [SDAG] Optimize unordered comparison in soft-float mode (patch by Anton Nadolskiy) (Alexey Bataev, 2015-07-15; 1 file, -23/+8)
  The current implementation handles unordered comparisons poorly in soft-float mode. Consider (a ULE b), which is "a <= b or unordered". It is lowered (in general) to (__ledf2(a, b) <= 0 || __unorddf2(a, b) != 0). We can do a better job by lowering it to (__gtdf2(a, b) <= 0). The same replacement works for the other unordered comparisons (ULT, UGT, UGE): call the same function as for the ordered case, but negate the comparison against zero.
  Differential Revision: http://reviews.llvm.org/D10804
  llvm-svn: 242280
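  A small sketch of the lowering described above (illustrative; the libcall names follow the libgcc soft-float ABI):
      define i1 @ule(double %a, double %b) {
        %c = fcmp ule double %a, %b   ; true if a <= b or either operand is NaN
        ret i1 %c
      }
      ; old soft-float lowering (roughly): (__ledf2(a, b) <= 0) || (__unorddf2(a, b) != 0)
      ; new soft-float lowering (roughly):  __gtdf2(a, b) <= 0
      ; __gtdf2 returns a value greater than zero only when neither input is NaN
      ; and a > b, so a non-positive result covers both "a <= b" and "unordered".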
* WebAssembly: fix build breakage. (JF Bastien, 2015-07-14; 1 file, -3/+3)
  Summary: processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909. WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I noticed that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed.
  Reviewers: qcolombet, sunfish
  Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits
  Differential Revision: http://reviews.llvm.org/D11199
  llvm-svn: 242242
* [ShrinkWrap][PEI] Do not insert epilogue for unreachable blocks. (Quentin Colombet, 2015-07-10; 1 file, -0/+39)
  Although inserting such code is not incorrect, it is useless and hurts the binary size.
  llvm-svn: 241946
* Fix AArch64 prologue for empty frame with dynamic allocas. (Evgeniy Stepanov, 2015-07-10; 1 file, -0/+50)
  Fixes PR23804: assertion failure in emitPrologue in the case of a function with an empty frame and a dynamic alloca that needs stack realignment. This is a typical case for AddressSanitizer.
  llvm-svn: 241943
* ComputeKnownBits: be a bit smarter about ADDs (Fiona Glaser, 2015-07-10; 1 file, -5/+5)
  If our two inputs have known top-zero bit counts M and N, we trivially know that the output cannot have any bits set in the top (min(M, N) - 1) bits, since nothing could carry past that point.
  llvm-svn: 241927
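  A hand-written example of the bound (illustrative, not taken from the changed test):
      define i32 @add_known_zeros(i32 %a, i32 %b) {
        %x = and i32 %a, 255    ; top 24 bits known zero (M = 24)
        %y = and i32 %b, 255    ; top 24 bits known zero (N = 24)
        %sum = add i32 %x, %y   ; at most 255 + 255 = 510 < 2^9, so the top
                                ; min(M, N) - 1 = 23 bits of the sum are known zero
        ret i32 %sum
      }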
* [AArch64] Select SBFIZ or UBFIZ instead of left + right shifts (Arnaud A. de Grandmaison, 2015-07-09; 1 file, -0/+33)
  And rename LSB to Immr / MSB to Imms to match the ARM ARM terminology.
  llvm-svn: 241803
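  An illustrative example of the shift pair this now matches (the exact UBFIZ operands are an assumption):
      define i32 @low_nibble_at_24(i32 %x) {
        %shl = shl i32 %x, 28   ; keep only the low 4 bits of %x, now at bits 28..31
        %r = lshr i32 %shl, 4   ; move them down to bits 24..27, zero elsewhere
        ret i32 %r
      }
      ; before: lsl w0, w0, #28 ; lsr w0, w0, #4
      ; after (roughly): ubfiz w0, w0, #24, #4   ; insert a 4-bit field at bit 24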
* Test for 241794 (nest attribute in AArch64) (Renato Golin, 2015-07-09; 1 file, -0/+23)
  Forgot to git add the test. Patch by Stephen Cross.
  llvm-svn: 241797
* Add more nvcasts (Arnold Schwaighofer, 2015-07-07; 1 file, -0/+14)
  Tim Northover has told me that they can occur when the compiler cleverly constructs constants, as demonstrated in the test case.
  rdar://21703486
  llvm-svn: 241641
* Add CHECK lines to test case (Arnold Schwaighofer, 2015-07-07; 1 file, -1/+8)
  llvm-svn: 241619
* Add a pattern for a nvcast from v2f64 -> v4f32 (Arnold Schwaighofer, 2015-07-07; 1 file, -0/+8)
  Since the NvCast is generated by the selection process, the concerns about endianness and bit reversal don't apply.
  rdar://21703486
  llvm-svn: 241611
* Fix an overly aggressive assertion in getCopyFromPartsVector. (Nadav Rotem, 2015-07-02; 1 file, -0/+18)
  The assertion in getCopyFromPartsVector assumed that the vector 'part' must match the type of the argument (arguments are potentially split into multiple parts). However, in some cases the targets return a 'part' of the right size but with a different type. We already handle this case correctly later on and generate a bitcast. This commit just makes sure that we are actually checking the property that we care about.
  llvm-svn: 241312
* [AArch64] Lower interleaved memory accesses to ldN/stN intrinsics. (Hao Liu, 2015-06-26; 1 file, -0/+197)
  This patch also adds a function to calculate the cost of interleaved memory accesses.
  E.g. lower an interleaved load:
      %wide.vec = load <8 x i32>, <8 x i32>* %ptr
      %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
      %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
  into:
      %ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr)
      %vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
      %vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1
  E.g. lower an interleaved store:
      %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
      store <12 x i32> %i.vec, <12 x i32>* %ptr
  into:
      %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 1, 2, 3>
      %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> %v1, <4, 5, 6, 7>
      %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> %v1, <8, 9, 10, 11>
      call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)
  Differential Revision: http://reviews.llvm.org/D10533
  llvm-svn: 240754
* Fix "the the" in comments.Eric Christopher2015-06-191-4/+4
| | | | llvm-svn: 240112
* Avoid redundant select node in early if-conversion pass (Yi Jiang, 2015-06-18; 1 file, -0/+41)
  llvm-svn: 240072
* Move the personality function from LandingPadInst to Function (David Majnemer, 2015-06-17; 3 files, -9/+9)
  The personality routine currently lives in the LandingPadInst. This isn't desirable because:
  - All LandingPadInsts in the same function must have the same personality routine. This means that each LandingPadInst beyond the first has an operand which produces no additional information.
  - There is ongoing work to introduce EH IR constructs other than LandingPadInst. Moving the personality routine off of any one particular instruction and onto the parent function seems a lot better than having N different places a personality function can sneak onto an exceptional function.
  Differential Revision: http://reviews.llvm.org/D10429
  llvm-svn: 239940
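  A sketch of the IR difference (the personality routine name is just an example):
      declare i32 @__gxx_personality_v0(...)
      declare void @may_throw()

      ; after this change the personality is an attribute of the function itself:
      define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw() to label %cont unwind label %lpad
      cont:
        ret void
      lpad:
        ; previously the clause lived on every landingpad instead:
        ;   %lp = landingpad { i8*, i32 } personality i8* bitcast (...) cleanup
        %lp = landingpad { i8*, i32 }
                cleanup
        resume { i8*, i32 } %lp
      }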
* [CodeGenPrepare] Generalize inserted set from truncs to any inst. (Ahmed Bougacha, 2015-06-17; 1 file, -0/+16)
  It's been used before to avoid infinite loops caused by separate CGP optimizations undoing one another. We found one more such issue caused by r238054. To avoid it, generalize the "InsertedTruncs" set to any inst, and use it to avoid touching those again.
  llvm-svn: 239938
* Revert "AArch64: Use CMP;CCMP sequences for and/or/setcc trees."Matthias Braun2015-06-171-40/+0
| | | | | | | | | The patch triggers a miscompile on SPEC 2006 403.gcc with the (ref) 200.i and scilab.i inputs. I opened PR23866 to track analysis of this. This reverts commit r238793. llvm-svn: 239880
* [AArch64] Generalize extract-high DUP extension to MOVI/MVNI. (Ahmed Bougacha, 2015-06-16; 1 file, -0/+210)
  These are really immediate DUPs, and suffer from the same problem with long instructions with a high/2 variant (e.g. smull). By extending a MOVI (or DUP, before this patch), we can avoid an ext on the other operand of the long instruction, e.g. turning:
      ext.16b  v0, v0, v0, #8
      movi.4h  v1, #0x53
      smull.4s v0, v0, v1
  into:
      movi.8h   v1, #0x53
      smull2.4s v0, v0, v1
  While there, add a now-necessary combine to fold (VT NVCAST (VT x)).
  llvm-svn: 239799
* [AArch64] Robustize neon-2velem-high test. NFC. (Ahmed Bougacha, 2015-06-16; 1 file, -111/+136)
  llvm-svn: 239798
* On behalf of Alexandros Lamprineas: (Evgeny Astigeevich, 2015-06-15; 1 file, -0/+1)
  LLVM targeting aarch64 doesn't correctly produce aligned accesses for non-aligned data at -O0/fast-isel (-mno-unaligned-access). The root cause seems to be in fast-isel not producing unaligned access correctly for -mno-unaligned-access. The patch just aborts fast-isel for loads and stores when -mno-unaligned-access is present. The regression test is updated to check this new test case (-mno-unaligned-access together with fast-isel).
  Differential Revision: http://reviews.llvm.org/D10360
  llvm-svn: 239732
* [AArch64] Delete two empty files, which should have been removed by r239713. (Hao Liu, 2015-06-15; 1 file, -0/+0)
  llvm-svn: 239715
* [AArch64] Revert r239711 again. We need to discuss how to share code between the AArch64 and ARM backends. (Hao Liu, 2015-06-15; 1 file, -197/+0)
  llvm-svn: 239713
* [AArch64] Match interleaved memory accesses into ldN/stN instructions. (Hao Liu, 2015-06-15; 1 file, -0/+197)
  Re-commit after adding "-aarch64-neon-syntax=generic" to fix the failure on OS X. This patch was first committed in r239514, then reverted in r239544 because of a syntax incompatibility failure on OS X.
  llvm-svn: 239711
* AArch64: map bare-metal arm64-macho triple to MachO MC layer. (Tim Northover, 2015-06-12; 1 file, -0/+12)
  Far better than an assertion about expecting ELF.
  llvm-svn: 239647
* This reverts commit r239529 and r239514. (Rafael Espindola, 2015-06-11; 1 file, -197/+0)
  Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
  Revert "Fixing MSVC 2013 build error."
  The test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.
  llvm-svn: 239544
* [AArch64] Match interleaved memory accesses into ldN/stN instructions. (Hao Liu, 2015-06-11; 1 file, -0/+197)
  Add a pass, AArch64InterleavedAccess, to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsics. As the Loop Vectorizer disables optimization of interleaved accesses by default, this optimization is also disabled by default; it can be enabled with "-aarch64-interleaved-access-opt=true".
  E.g. transform an interleaved load (Factor = 2):
      %wide.vec = load <8 x i32>, <8 x i32>* %ptr
      %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>   ; Extract even elements
      %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>   ; Extract odd elements
  into:
      %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
      %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
      %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1
  E.g. transform an interleaved store (Factor = 2):
      %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>   ; Interleaved vec
      store <8 x i32> %i.vec, <8 x i32>* %ptr
  into:
      %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
      %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
      call void aarch64.neon.st2(%v0, %v1, %ptr)
  llvm-svn: 239514
* [AArch64] Remove an overly conservative check when generating store pairs. (Chad Rosier, 2015-06-09; 1 file, -0/+32)
  Store instructions do not modify register values, and therefore it's safe to form a store pair even if the source register has been read in between the two store instructions. Previously, the read of w1 (see below) prevented the formation of a stp.
      str w0, [x2]
      ldr w8, [x2, #8]
      add w0, w8, w1
      str w1, [x2, #4]
      ret
  We now generate the following code.
      stp w0, w1, [x2]
      ldr w8, [x2, #8]
      add w0, w8, w1
      ret
  All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass. Performance results for SPEC2K were within noise.
  llvm-svn: 239432
* [GlobalMerge] Take into account minsize on Global users' parents. (Ahmed Bougacha, 2015-06-04; 1 file, -0/+74)
  Now that we can look at users, we can trivially do this: when we would have otherwise disabled GlobalMerge (currently -O<3), we can just run it for minsize functions, as it's usually a codesize win.
  Differential Revision: http://reviews.llvm.org/D10054
  llvm-svn: 239087
* Don't create a MIN/MAX node if the underlying compare has more than one use. (James Molloy, 2015-06-04; 1 file, -0/+11)
  If the compare in a select pattern has another use then it can't be removed, so we'd just be creating repeated code if we created a min/max node. Spotted by Matt Arsenault!
  llvm-svn: 239037
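  An illustrative example of the case being avoided:
      define i32 @not_an_smin(i32 %a, i32 %b, i1* %flag) {
        %cmp = icmp slt i32 %a, %b
        %min = select i1 %cmp, i32 %a, i32 %b   ; looks like smin(a, b) ...
        store i1 %cmp, i1* %flag                ; ... but %cmp has another use, so it
                                                ; cannot be removed; forming a MIN
                                                ; node here would just duplicate the compare
        ret i32 %min
      }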
* AArch64: Use CMP;CCMP sequences for and/or/setcc trees. (Matthias Braun, 2015-06-01; 1 file, -0/+40)
  Previously CCMP/FCCMP instructions were only used by the AArch64ConditionalCompares pass for control flow. This patch uses them for SELECT-like instructions as well, by matching patterns in ISelLowering.
  PR20927, rdar://18326194
  Differential Revision: http://reviews.llvm.org/D8232
  llvm-svn: 238793
* Removing committed assembly file. (Luke Cheeseman, 2015-06-01; 1 file, -3/+0)
  llvm-svn: 238742
* Re-commit of r238201 with fix for building with shared libraries. (Luke Cheeseman, 2015-06-01; 4 files, -2/+53)
  llvm-svn: 238739
* Revert "Re-commit changes in r237579 with fix for bug breaking windows builds."Diego Novillo2015-05-263-50/+2
| | | | | | | This reverts commit r238201 to fix linking problems in x86 Linux http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html llvm-svn: 238223
* Re-commit changes in r237579 with fix for bug breaking windows builds. (Luke Cheeseman, 2015-05-26; 3 files, -2/+50)
  llvm-svn: 238201
* [AArch64][CGP] Sink zext feeding stxr/stlxr into the same block. (Ahmed Bougacha, 2015-05-22; 1 file, -4/+2)
  The usual CodeGenPrepare trickery, on a target-specific intrinsic. Without this, the expansion of atomics will usually have the zext hoisted out of the loop, defeating the various patterns we have to catch this precise case.
  Differential Revision: http://reviews.llvm.org/D9930
  llvm-svn: 238054
* [AArch64] Robustize atomic cmpxchg test a little more. NFC. (Ahmed Bougacha, 2015-05-22; 1 file, -45/+47)
  We changed the test to test non-constant values in r238049. We can also use CHECK-NEXT to be a little stricter.
  llvm-svn: 238052
* [AArch64] Robustize atomic cmpxchg test. NFC. (Ahmed Bougacha, 2015-05-22; 1 file, -13/+27)
  Constant operands make it easy for the checks to pass for the wrong reasons.
  llvm-svn: 238049
* [AArch64] Enhance the load/store optimizer with target-specific alias analysis. (Chad Rosier, 2015-05-21; 2 files, -0/+171)
  Phabricator: http://reviews.llvm.org/D9863
  llvm-svn: 237963
* Revert r237579, as it broke windows buildbots (Oliver Stannard, 2015-05-18; 3 files, -50/+2)
  llvm-svn: 237583
* [LLVM - ARM/AArch64] Add ACLE special register intrinsics (Oliver Stannard, 2015-05-18; 3 files, -2/+50)
  This patch implements LLVM support for the ACLE special register intrinsics in section 10.1, __arm_{w,r}sr{,p,64}. It lowers the read_register/write_register intrinsics, used to implement the special register intrinsics in the corresponding clang patch (see http://reviews.llvm.org/D9697), to ARM-specific instructions (MRC, MCR, MSR, etc.) to allow reading and writing of coprocessor registers in AArch32 and AArch64. This is done by inspecting the register string passed to the intrinsic and then lowering to the appropriate instruction.
  Patch by Luke Cheeseman.
  Differential Revision: http://reviews.llvm.org/D9699
  llvm-svn: 237579
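  A minimal IR sketch of the generic intrinsic being lowered (the register name string is an assumption; accepted names depend on the target):
      declare i64 @llvm.read_register.i64(metadata)

      define i64 @read_counter() {
        %v = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %v              ; expected to lower to an MRS of the named register
      }

      !0 = !{!"cntvct_el0"}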
* Mark SMIN/SMAX/UMIN/UMAX nodes as legal and add patterns for them. (James Molloy, 2015-05-15; 1 file, -0/+96)
  The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for most NEON data types, the big exception being v2i64.
  llvm-svn: 237455
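  An illustrative vector min pattern that can now select to a single instruction:
      define <4 x i32> @vector_smin(<4 x i32> %a, <4 x i32> %b) {
        %cmp = icmp slt <4 x i32> %a, %b
        %min = select <4 x i1> %cmp, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %min
      }
      ; with ISD::SMIN legal for v4i32 this becomes roughly: smin v0.4s, v0.4s, v1.4s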
* Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases (Artyom Skrobov, 2015-05-14; 1 file, -3/+31)
  No longer breaks SPEC2000/2006.
  llvm-svn: 237361
* Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006 (Silviu Baranga, 2015-05-13; 1 file, -22/+1)
  llvm-svn: 237256
* [AArch64] Codegen VMAX/VMIN for safe math cases (Artyom Skrobov, 2015-05-13; 1 file, -1/+22)
  llvm-svn: 237247
* Changed renaming of local symbols by inserting a dot before the numeric suffix. (Sunil Srivastava, 2015-05-12; 2 files, -3/+3)
  One code change and several test changes to match; details in http://reviews.llvm.org/D9481
  llvm-svn: 237150
* llvm/test/CodeGen/AArch64/tailcall_misched_graph.ll: s/REQUIRE/REQUIRES/ (NAKAMURA Takumi, 2015-05-09; 1 file, -1/+1)
  llvm-svn: 236928
* ScheduleDAGInstrs: In functions with tail calls, PseudoSourceValues are not non-aliasing distinct objects (Arnold Schwaighofer, 2015-05-08; 1 file, -0/+42)
  The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail-calling function, two FixedStackObjects might refer to the same location. Worse, 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered.
  Change this so that a load from a FixedStackObject is not invariant in a tail-calling function, and don't return a PseudoSourceValue for an instruction in tail-calling functions when building the dependence graph, so that we handle function arguments conservatively.
  Fix for PR23459.
  rdar://20740035
  llvm-svn: 236916
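  An illustrative sketch of the problem (the argument counts and tail-call setup are assumptions; the clobbering only matters when the call is actually emitted as a tail call):
      declare i32 @callee(i32, i32, i32, i32, i32, i32, i32, i32, i32)

      define i32 @caller(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e,
                         i32 %f, i32 %g, i32 %h, i32 %i) {
        ; %i arrives in a fixed stack object in the caller's incoming-argument
        ; area (the first eight integer arguments travel in registers on AArch64).
        ; When the call below is emitted as a tail call, its ninth outgoing
        ; argument is stored into that same incoming-argument area, so a load of
        ; %i is not invariant across the call setup and has to be modelled
        ; conservatively by the scheduler.
        %r = tail call i32 @callee(i32 0, i32 1, i32 2, i32 3, i32 4,
                                   i32 5, i32 6, i32 7, i32 8)
        ret i32 %r
      }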