path: root/llvm/lib/Target/AArch64
* [AArch64] Promote f16 SELECT_CC CC operands when op is legal. (Ahmed Bougacha, 2015-11-17; 1 file, +7/-1)

  SELECT_CC has the nasty property of having operands with unrelated
  types. So if you do something like:

      f32 = select_cc f16, f16, f32, f32, cc

  you'd only look for the action for <select_cc, f32>, but never f16.
  If the types are all legal, but the op isn't (as for f16 on AArch64,
  or for f128 on x86_64/AArch64?), then you get into trouble. For f128,
  we have softenSetCCOperands to handle this case. Similarly, for f16,
  we can directly promote the CC operands.

  llvm-svn: 253344
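  A minimal sketch of the promotion described above, assuming a
  SelectionDAG lowering context (the helper name and surroundings are
  illustrative, not the literal patch):

      // Sketch: extend f16 compare operands of a SELECT_CC to f32 so
      // the operation action stays keyed on the (legal) result type.
      static void promoteF16CCOperands(SelectionDAG &DAG, const SDLoc &DL,
                                       SDValue &LHS, SDValue &RHS) {
        if (LHS.getValueType() == MVT::f16) {
          LHS = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, LHS);
          RHS = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, RHS);
        }
      }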
* [Assembler] Make fatal assembler errors non-fatal (Oliver Stannard, 2015-11-17; 1 file, +53/-35)

  Currently, if the assembler encounters an error after parsing (such
  as an out-of-range fixup), it reports this as a fatal error, and so
  stops after the first error. However, for most of these there is an
  obvious way to recover after emitting the error, such as emitting the
  fixup with a value of zero. This means that we can report on all of
  the errors in a file, not just the first one. MCContext::reportError
  records the fact that an error was encountered, so we won't actually
  emit an object file with the incorrect contents.

  Differential Revision: http://reviews.llvm.org/D14717
  llvm-svn: 253328
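  The recovery pattern this describes looks roughly like the following
  sketch (the range check and variable names are assumed for
  illustration; MCContext::reportError is the real entry point):

      // On a bad fixup, report through MCContext and substitute a safe
      // value instead of calling report_fatal_error, so assembly
      // continues and later errors are still diagnosed.
      if (!isInt<21>(Value)) {
        Ctx.reportError(Fixup.getLoc(), "fixup value out of range");
        Value = 0; // emit zeros; reportError prevents writing the object
      }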
* [ARM,AArch64] Store source location of asm constant pool entries (Oliver Stannard, 2015-11-16; 3 files, +5/-4)

  Storing the source location of the expression that created a constant
  pool entry allows us to emit better error messages if we later
  discover that the expression cannot be represented by a relocation.

  Differential Revision: http://reviews.llvm.org/D14646
  llvm-svn: 253220
* [ARM,AArch64] Store source location for values in assembly files (Oliver Stannard, 2015-11-16; 2 files, +2/-2)

  The MCValue class can store an SMLoc to allow better error messages
  to be emitted if an error is detected after parsing. The ARM and
  AArch64 assembly parsers were not setting this, so error messages did
  not have source information.

  Differential Revision: http://reviews.llvm.org/D14645
  llvm-svn: 253219
* [AArch64] ldr= pseudo-instruction silently ignored if register invalid (Oliver Stannard, 2015-11-16; 1 file, +1/-1)

  The AArch64 assembler was silently ignoring instructions like this:

      ldr foo, =bar

  AArch64AsmParser::parseOperand was returning true as the parse
  failed, but was not calling AArch64AsmParser::Error to report this to
  the user, so the instruction was ignored without printing an error
  message.

  Differential Revision: http://reviews.llvm.org/D14651
  llvm-svn: 253193
* Reduce the size of MCRelaxableFragment. (Akira Hatanaka, 2015-11-14; 1 file, +1/-1)

  MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
  MCInst to enable re-encoding the MCInst later during relaxation. A
  copy of MCSubtargetInfo (instead of a reference or pointer) was
  needed because the feature bits could be modified by the parser.

  This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
  with a constant reference to MCSubtargetInfo. The copies of
  MCSubtargetInfo are kept in MCContext, and the target parsers are now
  responsible for asking MCContext to provide a copy whenever the
  feature bits of MCSubtargetInfo have to be toggled.

  With this patch, I saw a 4% reduction in peak memory usage when I
  compiled verify-uselistorder.lto.bc using llc.

  rdar://problem/21736951
  Differential Revision: http://reviews.llvm.org/D14346
  llvm-svn: 253127
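  A sketch of the ownership model this sets up (the helper names are
  assumed for illustration):

      // When the parser must flip feature bits, it asks MCContext for
      // a fresh copy and mutates that. Fragments then hold only a
      // cheap const reference to an MCSubtargetInfo owned by MCContext.
      MCSubtargetInfo &STICopy = Ctx.getSubtargetCopy(getSTI());
      STICopy.ToggleFeature(Feature); // safe: private copy, not shared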
* [MCTargetAsmParser] Move the member variables that reference MCSubtargetInfo into MCTargetAsmParser (Akira Hatanaka, 2015-11-14; 1 file, +14/-14)

  Move the member variables that reference MCSubtargetInfo in the
  subclasses into MCTargetAsmParser and define a member function
  getSTI. This is done in preparation for making changes to shrink the
  size of MCRelaxableFragment (see http://reviews.llvm.org/D14346).

  llvm-svn: 253124
* AArch64: Default AArch64Subtarget::ReserveX18 to true on Darwin (Justin Bogner, 2015-11-13; 1 file, +3/-2)

  Darwin reserves x18, so it's never ABI-compliant to generate code
  that uses it. Set the default value based on the OS part of the
  triple rather than forcing front-ends to set the +reserve-x18 target
  feature in order to build correct code for Darwin.

  This will make r243310 redundant, so I'll revert that shortly.

  llvm-svn: 253102
* [MC] Use LShr for constant evaluation of ">>" on non-arm64 darwin. (Ahmed Bougacha, 2015-11-11; 1 file, +0/-4)

  Follow-up to r235963: this matches other assemblers and is less
  unexpected (e.g. PR23227).

  llvm-svn: 252681
* [AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() (Sanjay Patel, 2015-11-10; 1 file, +8/-0)

  AArch64 has instructions for efficient count-leading/trailing-zeros,
  so this should be considered a cheap operation (and therefore fair
  game for speculation) for any AArch64 implementation.

  The net result of allowing this speculation for the regression tests
  in this patch is that we get this code:

      ctlz:
        clz  w0, w0
        ret

      cttz:
        rbit w8, w0
        clz  w0, w8
        ret

  instead of:

      ctlz:
        cbz  w0, .LBB0_2
        clz  w0, w0
        ret
      .LBB0_2:
        orr  w0, wzr, #0x20
        ret

      cttz:
        cbz  w0, .LBB1_2
        rbit w8, w0
        clz  w0, w8
        ret
      .LBB1_2:
        orr  w0, wzr, #0x20
        ret

  See D14469 for the larger motivation.

  Differential Revision: http://reviews.llvm.org/D14505
  llvm-svn: 252625
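  The overrides themselves are one-liners; a sketch of their shape in
  the AArch64 target lowering (exact placement assumed):

      // Count-leading/trailing-zeros are single instructions on
      // AArch64, so speculating them is always profitable.
      bool isCheapToSpeculateCttz() const override { return true; }
      bool isCheapToSpeculateCtlz() const override { return true; }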
* [AArch64] Fix halfword load merging for big-endian targets (Oliver Stannard, 2015-11-10; 1 file, +9/-3)

  For big-endian targets, when we merge two halfword loads into a word
  load, the order of the halfwords in the loaded value is reversed
  compared to little-endian, so the load-store optimiser needs to swap
  the destination registers.

  This does not affect merging of two word loads, as we use ldp, which
  treats the memory as two separate 32-bit words.

  llvm-svn: 252597
* AArch64: add experimental support for address tagging. (Tim Northover, 2015-11-10; 3 files, +64/-5)

  AArch64 has the ability to use the top 8 bits of an "address" for
  extra information, with the memory subsystem automatically masking
  them off for loads and stores. When that's happening, we can
  sometimes skip masks on memory operations in the compiler.

  However, this requires the host OS and support stack to preserve
  those bits, so it can't be enabled everywhere. In principle iOS 8.0
  and above do take the required precautions, but we'll put it under a
  flag for now.

  llvm-svn: 252573
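  An illustration (not from the patch) of the masking this lets the
  compiler skip; the tag value is arbitrary:

      #include <cstdint>

      int load_through_tagged(int *p) {
        // Stash a tag in the top byte of the pointer. With address
        // tagging enabled, hardware ignores the top 8 bits on the
        // dereference, so no explicit mask instruction is needed.
        uintptr_t tagged = reinterpret_cast<uintptr_t>(p) | (0x2aULL << 56);
        return *reinterpret_cast<int *>(tagged);
      }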
* don't repeat function names in comments; NFC (Sanjay Patel, 2015-11-09; 1 file, +17/-19)

  llvm-svn: 252502
* [AArch64] Add UABDL patterns for log2 shuffle. (Charlie Turner, 2015-11-09; 1 file, +34/-2)

  Summary: This matches the sum-of-absdiff patterns emitted by the
  vectoriser using log2 shuffles. Relies on D14207 to be able to match
  the extract_subvector(..., 0).

  Reviewers: t.p.northover, jmolloy
  Subscribers: aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D14208
  llvm-svn: 252465
* [AArch64] Handle extract_subvector(..., 0) in ISel. (Charlie Turner, 2015-11-09; 2 files, +20/-18)

  Summary: Lowering this pattern early to an EXTRACT_SUBREG was making
  it impossible to match larger patterns in tblgen that use
  extract_subvector(..., 0) as part of their input pattern. It seems
  like there will exist somewhere a better way of specifying this
  pattern over all relevant register value types, but I didn't manage
  to find it.

  Reviewers: t.p.northover, jmolloy
  Subscribers: aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D14207
  llvm-svn: 252464
* [AsmParser] Backends can parameterize ASM tokenization. (Colin LeMahieu, 2015-11-09; 1 file, +2/-0)

  llvm-svn: 252439
* [WinEH] Update exception pointer registers (Joseph Tremoulet, 2015-11-07; 2 files, +17/-5)

  Summary: The CLR's personality routine passes these in rdx/edx, not
  rax/eax. Make getExceptionPointerRegister a virtual method
  parameterized by personality function to allow making this
  distinction. Similarly make getExceptionSelectorRegister a virtual
  method parameterized by personality function, for symmetry.

  Reviewers: pgavlin, majnemer, rnk
  Subscribers: jyknight, dsanders, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14344
  llvm-svn: 252383
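  The resulting hooks look roughly like this sketch (signatures in the
  TargetLowering style of the time; details assumed):

      // Targets can now pick exception registers per personality
      // function instead of returning one fixed register.
      virtual unsigned
      getExceptionPointerRegister(const Constant *PersonalityFn) const;
      virtual unsigned
      getExceptionSelectorRegister(const Constant *PersonalityFn) const;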
* [AArch64][FastISel] Don't even try to select vector icmps. (Ahmed Bougacha, 2015-11-06; 1 file, +4/-0)

  We used to try to constant-fold them to i32 immediates. Given that
  fast-isel doesn't otherwise support vNi1, when selecting the result
  users we'd fall back to SDAG anyway. However, if the users were in
  another block, we'd insert broken cross-class copies (GPR32 to
  FPR64).

  Give up, let SDAG agree with itself on a vNi1 legalization strategy.

  llvm-svn: 252364
* [AArch64] Enable the narrow ld promotion only on profitable microarchitectures (Jun Bum Lim, 2015-11-06; 1 file, +22/-8)

  The benefit of converting narrow loads into a wider load (r251438)
  can be microarchitecture-dependent, as it assumes that a single load
  with two bitfield extracts is cheaper than two narrow loads.
  Currently, this conversion is enabled only on Cortex-A57, on which
  the performance benefit was verified.

  llvm-svn: 252316
* Remove Windows line endings introduced by r252177. NFC. (Tim Northover, 2015-11-05; 1 file, +16/-16)

  llvm-svn: 252217
* replace MachineCombinerPattern namespace and enum with enum class; NFCI (Sanjay Patel, 2015-11-05; 2 files, +34/-34)

  Also, remove an enum hack where enum values were used as indexes into
  an array. We may want to make this a real class to allow
  pattern-based queries/customization (D13417).

  llvm-svn: 252196
* [DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268. (Oleg Ranevskyy, 2015-11-05; 1 file, +16/-16)

  Summary: This review is related to another review request
  (http://reviews.llvm.org/D11268), does the same, and merely fixes a
  couple of issues with it. D11268 is quite old and has merge conflicts
  against the current trunk. This request:
  - rebases D11268 onto the new trunk;
  - resolves the merge conflicts;
  - fixes the prologue_end tests, which do not pass due to the
    subprogram definitions not being marked as distinct.

  Reviewers: echristo, rengolin, kubabrecka
  Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl
  Differential Revision: http://reviews.llvm.org/D14338
  llvm-svn: 252177
* Remove templates from CostTableLookup functions; all instantiations had the same type. (Craig Topper, 2015-10-28; 1 file, +2/-2)

  This also lets us remove the versions of the functions that took a
  statically sized array, as we can rely on ArrayRef implicit
  conversion now.

  llvm-svn: 251490
* [AArch64] Merge halfword loads into a 32-bit load (Jun Bum Lim, 2015-10-27; 1 file, +216/-45)

  This recommits r250719, which caused a failure in SPEC2000.gcc
  because of the incorrect insert point for the new wider load.

  Convert two halfword loads into a single 32-bit word load with
  bitfield extract instructions. For example:

      ldrh w0, [x2]
      ldrh w1, [x2, #2]

  becomes:

      ldr  w0, [x2]
      ubfx w1, w0, #16, #16
      and  w0, w0, #0xffff

  llvm-svn: 251438
* Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add successors when optimization is disabled. (Cong Hou, 2015-10-27; 1 file, +6/-5)

  When optimization is disabled, edge weights that are stored in MBB
  won't be used, so we don't have to store them. Currently this is done
  by adding successors with default weight 0; if all successors have
  default weights, the weight list will be empty. But an empty weight
  list doesn't necessarily mean optimization is disabled (as is stated
  several times in MachineBasicBlock.cpp): it may also mean that all
  successors just have default weights.

  We should discourage using default weights when adding successors,
  because it is very easy for users to forget to update the correct
  edge weights instead of using default ones (one exception is when the
  MBB has only one successor). In order to detect such usages, it is
  better to differentiate using default weights from the case when
  optimization is disabled.

  In this patch, a new interface addSuccessorWithoutWeight(MBB*) is
  created for use when optimization is disabled. In this case, MBB will
  try to maintain an empty weight list, but it cannot guarantee this,
  as for many uses of addSuccessor() it is not checked whether
  optimization is disabled or not. It can guarantee that if
  optimization is enabled, then the weight list always has the same
  size as the successor list.

  Differential revision: http://reviews.llvm.org/D13963
  llvm-svn: 251429
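  A sketch of the intended call-site split (the condition is assumed
  for illustration):

      // Use the weighted interface when optimizing; otherwise use the
      // new one, so an empty weight list reliably means "no weights".
      if (OptLevel == CodeGenOpt::None)
        MBB->addSuccessorWithoutWeight(Succ);
      else
        MBB->addSuccessor(Succ, Weight);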
* [ARM] Expand ROTL and ROTR of vector value types (Charlie Turner, 2015-10-27; 1 file, +4/-0)

  Summary: After D13851 landed, we saw backend crashes when compiling
  the reduced test case included in this patch. The right fix seems to
  be to allow these vector types for expansion in instruction
  selection.

  Reviewers: rengolin, t.p.northover
  Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D14082
  llvm-svn: 251401
* Convert cost table lookup functions to return a pointer to the entry or nullptr instead of the index. (Craig Topper, 2015-10-27; 1 file, +8/-9)

  This avoids mentioning the table name an extra time and allows the
  lookup to be done directly in the ifs by relying on the bool
  conversion of the pointer. While there, make use of ArrayRef and
  std::find_if.

  llvm-svn: 251382
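  The resulting idiom at call sites looks like this sketch (table and
  entry types assumed):

      // The pointer's bool conversion folds the "found?" test and the
      // table indexing into one line.
      if (const auto *Entry = CostTableLookup(Tbl, ISD, MTy))
        return Entry->Cost;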
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android. (Evgeniy Stepanov, 2015-10-26; 2 files, +20/-0)

  Android libc provides a fixed TLS slot for the unsafe stack pointer,
  and this change implements direct access to that slot on AArch64 via
  __builtin_thread_pointer() + offset.

  This change also moves more code into TargetLowering and its
  target-specific subclasses to get rid of target-specific codegen in
  SafeStackPass.

  This change does not touch the ARM backend, because ARM lowers
  __builtin_thread_pointer as aeabi_read_tp, which is not available on
  Android.

  The previous iteration of this change was reverted in r250461. This
  version leaves the generic, compiler-rt based implementation in
  SafeStack.cpp instead of moving it to TargetLoweringBase, in order to
  allow testing without a TargetMachine.

  llvm-svn: 251324
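  An illustration of the access pattern described above; the slot index
  is a placeholder, not the real bionic TLS layout:

      // One load relative to the thread pointer, no libc call.
      constexpr int kUnsafeStackSlot = 5; // hypothetical slot index

      void *unsafe_stack_pointer() {
        void **tls = static_cast<void **>(__builtin_thread_pointer());
        return tls[kUnsafeStackSlot];
      }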
* Use MVT::SimpleValueType instead of MVT in template parameter. NFC (Craig Topper, 2015-10-25; 1 file, +2/-1)

  llvm-svn: 251217
* Call the version of ConvertCostTableLookup that takes a statically sized array rather than pointer and size. NFC (Craig Topper, 2015-10-24; 1 file, +2/-3)

  llvm-svn: 251196
* Revert "[AArch64]Merge halfword loads into a 32-bit load"James Molloy2015-10-231-215/+45
| | | | | | This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc, when compiled for Cortex-A53. llvm-svn: 251108
* AArch64: Disable the latency heuristic (Matthias Braun, 2015-10-22; 1 file, +5/-0)

  It turned out not to improve any of our benchmarks but occasionally
  led to increased register pressure and spilling. This is only enabled
  for the Cyclone CPU, as the results on the Cortex CPUs were mixed.

  Differential Revision: http://reviews.llvm.org/D13708
  llvm-svn: 251038
* Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. (Craig Topper, 2015-10-22; 1 file, +4/-6)

  This removes the need to pass a hardcoded size in many places. NFC

  llvm-svn: 251032
* [AArch64] Merge halfword loads into a 32-bit load (Jun Bum Lim, 2015-10-19; 1 file, +215/-45)

  Convert two halfword loads into a single 32-bit word load with
  bitfield extract instructions. For example:

      ldrh w0, [x2]
      ldrh w1, [x2, #2]

  becomes:

      ldr  w0, [x2]
      ubfx w1, w0, #16, #16
      and  w0, w0, #0xffff

  llvm-svn: 250719
* Make a bunch of static arrays const. (Craig Topper, 2015-10-18; 3 files, +15/-11)

  llvm-svn: 250642
* [AArch64] Implement vector splitting on UADDV. (Charlie Turner, 2015-10-16; 1 file, +32/-0)

  Summary: Fixes PR25056.

  Reviewers: mcrosier, junbuml, jmolloy
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13466
  llvm-svn: 250520
* Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android."Evgeniy Stepanov2015-10-152-20/+0
| | | | | | Breaks the hexagon buildbot. llvm-svn: 250461
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android. (Evgeniy Stepanov, 2015-10-15; 2 files, +20/-0)

  Android libc provides a fixed TLS slot for the unsafe stack pointer,
  and this change implements direct access to that slot on AArch64 via
  __builtin_thread_pointer() + offset.

  This change also moves more code into TargetLowering and its
  target-specific subclasses to get rid of target-specific codegen in
  SafeStackPass.

  This change does not touch the ARM backend, because ARM lowers
  __builtin_thread_pointer as aeabi_read_tp, which is not available on
  Android.

  llvm-svn: 250456
* AArch64: Remove implicit ilist iterator conversions, NFC (Duncan P. N. Exon Smith, 2015-10-13; 6 files, +12/-15)

  llvm-svn: 250216
* [AArch64] Check the size of the vector before accessing its elements. (Akira Hatanaka, 2015-10-13; 1 file, +1/-1)

  This fixes an assert in AArch64AsmParser::MatchAndEmitInstruction.

  rdar://problem/23081753
  llvm-svn: 250207
* AArch64: Make getNextNode() cleanup in r249764 more clear (Duncan P. N. Exon Smith, 2015-10-09; 1 file, +2/-2)

  After r249764, if you didn't see the full context, it looked like
  std::next(I) would get the same result as
  ++MachineBasicBlock::iterator(I). However, I is a MachineInstr* (not
  a MachineBasicBlock::iterator).

  Use the getIterator() helper I added later (r249782) to make this
  code more clear.

  llvm-svn: 249852
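  The unambiguous spelling, as a one-line sketch (with I being a
  MachineInstr*):

      // Convert the pointer to an iterator explicitly, then advance.
      MachineBasicBlock::iterator Next = std::next(I->getIterator());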
* Improve ISel across lane float min/max reduction (Jun Bum Lim, 2015-10-09; 1 file, +47/-12)

  In vectorized float min/max reduction code, the final "reduce" step
  is sub-optimal. In AArch64, this change will combine:

      svn0   = vector_shuffle t0, undef<2,3,u,u>
      fmin   = fminnum t0, svn0
      svn1   = vector_shuffle fmin, undef<1,u,u,u>
      cc     = setcc fmin, svn1, ole
      n0     = extract_vector_elt cc, #0
      n1     = extract_vector_elt fmin, #0
      n2     = extract_vector_elt fmin, #1
      result = select n0, n1, n2

  into:

      result = llvm.aarch64.neon.fminnmv t0

  This change extends r247575.

  llvm-svn: 249834
* AArch64: Stop using MachineInstr::getNextNode() (Duncan P. N. Exon Smith, 2015-10-08; 1 file, +4/-4)

  Stop using getNextNode() to get an insertion point (at least, in this
  one place). Instead, use iterator logic directly.

  The getNextNode() interface isn't actually supposed to work for
  creating iterators; it's supposed to return nullptr (not a real
  iterator) if this is the last node. It's currently broken and will
  "happen" to work, but if we ever fix the function, we'll get some
  strange failures in places like this.

  llvm-svn: 249764
* Add Triple::isAndroid(). (Evgeniy Stepanov, 2015-10-08; 1 file, +1/-0)

  This is a simple refactoring that replaces Triple.getEnvironment()
  checks for Android with Triple.isAndroid().

  llvm-svn: 249750
* [AArch64] Fold a floating-point divide by power of two into fp conversion. (Chad Rosier, 2015-10-07; 1 file, +67/-0)

  Part of http://reviews.llvm.org/D13442

  llvm-svn: 249579
* [AArch64] Fold a floating-point multiply by power of two into fp conversion. (Chad Rosier, 2015-10-07; 1 file, +70/-0)

  Part of http://reviews.llvm.org/D13442

  llvm-svn: 249576
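  A source-level illustration (not from the patches) of the two folds
  above: AArch64's fixed-point conversion instructions can absorb a
  power-of-two scale into the convert itself, so the expected output in
  the comments is under that assumption:

      // Divide by 2^4 folds into the int-to-fp convert.
      float div_fold(int x) { return static_cast<float>(x) / 16.0f; } // scvtf  s0, w0, #4

      // Multiply by 2^4 folds into the fp-to-int convert.
      int mul_fold(float x) { return static_cast<int>(x * 16.0f); }   // fcvtzs w0, s0, #4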
* [ARM][AArch64] Only lower to interleaved load/store if the target has NEON (Jeroen Ketema, 2015-10-07; 1 file, +4/-4)

  Without an additional check for NEON, the compiler crashes during
  legalization of NEON ldN/stN.

  Differential Revision: http://reviews.llvm.org/D13508
  llvm-svn: 249550
* [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm" (Alexandros Lamprineas, 2015-10-05; 5 files, +87/-13)

  llvm-mc accepts 4-bit immediate values for "msr pan, #imm", while
  only 1-bit immediate values should be valid. Changed encoding and
  decoding for msr pstate instructions.

  Differential Revision: http://reviews.llvm.org/D13011
  llvm-svn: 249313
* Fix pr24486. (Rafael Espindola, 2015-10-05; 2 files, +2/-2)

  This extends the work done in r233995 so that now getFragment (in
  addition to getSection) also works for variable symbols.

  With that, the existing logic to decide whether a-b can be computed
  works even if a or b are variables. Given that, the expression
  evaluation can avoid expanding variables as aggressively, and that in
  turn lets the relocation code see the original variable.

  In order for this to work with the asm streamer, there is now a dummy
  fragment per section. It is used to assign a section to a symbol when
  no other fragment exists.

  This patch is joint work by Maxim Ostapenko and myself.

  llvm-svn: 249303
* [ARM] Typo. NFC. (Chad Rosier, 2015-10-02; 1 file, +1/-1)

  llvm-svn: 249153