path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Set trunc store action to Expand for all X86 targets. | Bob Wilson | 2014-09-09 | 1 | -2/+2
  When compiling without SSE2, isTruncStoreLegal(F64, F32) would return Legal, whereas with SSE2 it would return Expand. Since the target doesn't seem to actually handle a truncstore for double -> float, it would just output a store of a full double in the space for a float, overwriting other bits on the stack.

  Patch by Luqman Aden!

  llvm-svn: 217410
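The overwrite described in this message can be pictured with a toy model. The sketch below is not LLVM code: `Frame`, `storeFullDouble`, and `truncStore` are hypothetical names, and it assumes a 4-byte `float` and an 8-byte `double`. It only contrasts writing the whole double into a float-sized slot with converting first and then storing four bytes.

```cpp
#include <cstring>

static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "the sketch assumes 4-byte float and 8-byte double");

// Toy stack frame: a 4-byte float slot followed by 4 bytes that belong
// to some other value.
struct Frame {
  unsigned char bytes[8];
};

// The failure mode from the message: all 8 bytes of the double are
// written, so the neighbouring 4 bytes are clobbered.
inline void storeFullDouble(Frame &f, double v) {
  std::memcpy(f.bytes, &v, sizeof(double));
}

// The expanded form: narrow to float first, then store only 4 bytes.
inline void truncStore(Frame &f, double v) {
  float narrowed = static_cast<float>(v);
  std::memcpy(f.bytes, &narrowed, sizeof(float));
}
```

With the expanded form the bytes after the float slot are left alone, which is the behaviour the commit appears to restore for targets without SSE2.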
* [AArch64] Enabled AA support for Cortex-A57. | Chad Rosier | 2014-09-08 | 1 | -1/+1
  llvm-svn: 217381
* R600/SI: Fix assertion from copying a TargetGlobalAddress | Matt Arsenault | 2014-09-08 | 1 | -1/+2
  Assert in scheduler from an inserted copy_to_regclass from a constant. This only seems to break sometimes when a constant initializer address is forced into VGPRs in a non-entry block.

  No test since the only case I've managed to hit only happens with a future patch, and that case will also not be a problem once scalar instructions are used in non-entry blocks.

  llvm-svn: 217380
* R600/SI: Replace LDS atomics with no return versions | Matt Arsenault | 2014-09-08 | 3 | -19/+35
  llvm-svn: 217379
* R600/SI: Add InstrMapping for noret atomics. | Matt Arsenault | 2014-09-08 | 3 | -50/+78
  Only handles LDS atomics for now, and will be used to replace atomics with no uses with the no return versions.

  llvm-svn: 217378
* [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph. | Chad Rosier | 2014-09-08 | 2 | -0/+140
  Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
  Phabricator Review: http://reviews.llvm.org/D5103
  llvm-svn: 217371
* [AArch64] Enabled AA support for Cortex-A53. | Chad Rosier | 2014-09-08 | 1 | -0/+2
  Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
  Phabricator Review: http://reviews.llvm.org/D5103
  llvm-svn: 217370
* Spelling correction | Sid Manning | 2014-09-08 | 1 | -2/+2
  Another trivial spelling change.

  llvm-svn: 217364
* [x86] Revert my over-eager commit in r217332. | Chandler Carruth | 2014-09-07 | 1 | -25/+9
  I hadn't actually run all the tests yet and these combines have somewhat surprisingly far reaching effects.

  llvm-svn: 217333
* [x86] Tweak the rules surrounding 0,0 and 1,1 v2f64 shuffles and add | Chandler Carruth | 2014-09-07 | 1 | -9/+25
  support for MOVDDUP which is really important for matrix multiply style operations that do lots of non-vector-aligned load and splats.

  The original motivation was to add support for MOVDDUP as the lack of it regresses matmul_f64_4x4 by 5% or so. However, all of the rules here were somewhat suspicious.

  First, we should always be using the floating point domain shuffles, regardless of how many copies we have to make as a movapd is *crazy* faster than the domain switching cost on some chips. (Mostly because movapd is crazy cheap.) Because SHUFPD can't do the copy-for-free trick of the PSHUF instructions, there is no need to avoid canonicalizing on UNPCK variants, so do that canonicalizing. This also ensures we have the chance to form MOVDDUP. =]

  Second, we assume SSE2 support when doing any vector lowering, and given that we should just use UNPCKLPD and UNPCKHPD as they can operate on registers or memory. If vectors get spilled or come from memory at all this is going to allow the load to be folded into the operation. If we want to optimize for encoding size (the only difference, and only a 2 byte difference) it should be done *much* later, likely after RA.

  llvm-svn: 217332
* R600/SI: Fix register class for some 64-bit atomics | Matt Arsenault | 2014-09-07 | 1 | -5/+5
  llvm-svn: 217323
* [x86] Fix a pretty horrible bug and inconsistency in the x86 asm | Chandler Carruth | 2014-09-06 | 6 | -53/+20
  parsing (and latent bug in the instruction definitions).

  This is effectively a revert of r136287 which tried to address a specific and narrow case of immediate operands failing to be accepted by x86 instructions with a pretty heavy hammer: it introduced a new kind of operand that behaved differently. All of that is removed with this commit, but the test cases are both preserved and enhanced.

  The core problem that r136287 and this commit are trying to handle is that gas accepts both of the following instructions:

    insertps $192, %xmm0, %xmm1
    insertps $-64, %xmm0, %xmm1

  These will encode to the same byte sequence, with the immediate occupying an 8-bit entry. The first form was fixed by r136287 but that broke the prior handling of the second form! =[ Ironically, we would still emit the second form in some cases and then be unable to re-assemble the output.

  The reason why the first instruction failed to be handled is because prior to r136287 the operands were marked 'i32i8imm' which forces them to be sign-extendable. Clearly, that won't work for 192 in a single byte. However, making them zero-extended or "unsigned" doesn't really address the core issue either because it breaks negative immediates. The correct fix is to make these operands 'i8imm' reflecting that they can be either signed or unsigned but must be 8-bit immediates. This patch backs out r136287 and then changes those places as well as some others to use 'i8imm' rather than one of the extended variants.

  Naturally, this broke something else. The custom DAG nodes had to be updated to have a much more accurate type constraint of an i8 node, and a bunch of Pat immediates needed to be specified as i8 values.

  The fallout didn't end there though. We also then ceased to be able to match the instruction-specific intrinsics to the instructions so modified. Digging, this is because they too used i32 rather than i8 in their signature. So I've also switched those intrinsics to i8 arguments in line with the instructions.

  In order to make the intrinsic adjustments of course, I also had to add auto upgrading for the intrinsics.

  I suspect that the intrinsic argument types may have led everything down this rabbit hole. Pretty happy with the result.

  llvm-svn: 217310
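The arithmetic behind the two insertps spellings can be checked directly. The helpers below are hypothetical and are not the LLVM operand classes; they only model the three range rules the message talks about: a sign-extended 8-bit check (how the message describes 'i32i8imm'), a plain 8-bit check accepting signed or unsigned spellings (how it describes 'i8imm'), and the byte that is finally encoded.

```cpp
#include <cstdint>

// True if `value` is representable as a sign-extended 8-bit immediate.
constexpr bool fitsSignExtended8(int64_t value) {
  return value >= -128 && value <= 127;
}

// True if `value` can be written as either a signed or an unsigned byte.
constexpr bool fitsAny8(int64_t value) {
  return value >= -128 && value <= 255;
}

// The single byte that ends up in the instruction encoding.
constexpr uint8_t encodeImm8(int64_t value) {
  return static_cast<uint8_t>(value & 0xFF);
}

static_assert(encodeImm8(192) == encodeImm8(-64),
              "$192 and $-64 encode to the same byte, 0xC0");
static_assert(!fitsSignExtended8(192),
              "a sign-extended 8-bit check rejects 192");
static_assert(fitsAny8(192) && fitsAny8(-64),
              "a plain 8-bit check accepts both spellings");
```

Under these assumptions a zero-extended-only rule would accept 192 but reject -64, which matches the message's point that neither extended variant covers both forms.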
* [x86] Fix an embarrassing bug in the INSERTPS formation code. The mask | Chandler Carruth | 2014-09-05 | 1 | -3/+4
  computation was totally wrong, but somehow it didn't really show up with llc.

  I've added an assert that triggers on multiple existing test cases and updated one of them to show the correct value.

  There appear to still be more bugs lurking around insertps's mask. =/ However, note that this only really impacts the new vector shuffle lowering.

  llvm-svn: 217289
* [mips] Change Feature-related types from unsigned to uint64_t in MipsAsmParser. No functional changes. | Toma Tabacu | 2014-09-05 | 1 | -2/+2
  Summary: Found a couple of cases where unsigned was still being used. These two should be the last ones in the (entire) Mips backend.
  Reviewers: dsanders
  Reviewed By: dsanders
  Differential Revision: http://reviews.llvm.org/D5028
  llvm-svn: 217257
* R600/SI: Use same complex patterns for DS atomics | Matt Arsenault | 2014-09-05 | 1 | -67/+47
  This fixes hitting the same negative base offset problem that was already fixed for regular loads and stores.

  llvm-svn: 217256
* [mips] Marked the Trap-on-Condition instructions as Mips II | Daniel Sanders | 2014-09-05 | 1 | -14/+20
  Patch by Vasileios Kalintiris.
  Reviewers: dsanders
  Reviewed By: dsanders
  Differential Revision: http://reviews.llvm.org/D5173
  llvm-svn: 217255
* [mips] Rename data members and member functions in MipsAssemblerOptions. | Toma Tabacu | 2014-09-05 | 1 | -14/+14
  Summary: Use the naming convention from the LLVM Coding Standards.
  Reviewers: dsanders
  Reviewed By: dsanders
  Differential Revision: http://reviews.llvm.org/D4972
  llvm-svn: 217254
* R600: Fix FROUND | Jan Vesely | 2014-09-05 | 2 | -4/+7
  round halfway cases away from zero

  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 217250
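"Round halfway cases away from zero" is the tie rule that `std::round` also uses. A minimal sketch of that rule built from truncation is shown below; `roundHalfAwayFromZero` is a hypothetical helper for illustration and is not the R600 lowering touched by the commit.

```cpp
#include <cmath>

// Round to the nearest integer value; exact .5 cases move away from zero.
inline float roundHalfAwayFromZero(float x) {
  float t = std::trunc(x);        // drop the fractional part, toward zero
  float frac = std::fabs(x - t);  // magnitude of what was dropped
  if (frac >= 0.5f)
    t += std::copysign(1.0f, x);  // step one unit further from zero
  return t;
}
```

For example, 2.5 goes to 3 and -2.5 goes to -3, whereas a round-half-to-even rule would give 2 and -2.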
* R600/SI: Fix bug in SIInstrInfo::legalizeOpWithMove() | Tom Stellard | 2014-09-05 | 1 | -4/+5
  We must constrain the destination register class of legalized operands to a VGPR class or else the illegal operand may be folded back into the instruction by the register coalescer.

  This fixes a bug in add.ll that will be uncovered by future commits.

  llvm-svn: 217249
* R600/SI: Use S_ADD_U32 and S_SUB_U32 for low half of 64-bit operations | Tom Stellard | 2014-09-05 | 3 | -6/+8
  https://bugs.freedesktop.org/show_bug.cgi?id=83416

  llvm-svn: 217248
* [x86] Factor out the zero vector insertion logic in the new vector | Chandler Carruth | 2014-09-05 | 1 | -45/+95
  shuffle lowering for integer vectors and share it from v4i32, v8i16, and v16i8 code paths.

  Ironically, the SSE2 v16i8 code for this is now better than the SSSE3! =] Will have to fix the SSSE3 code next to just use a single pshufb.

  llvm-svn: 217240
* ARM: cover all sub-architecture enumerators to keep compiler happy. | Tim Northover | 2014-09-05 | 1 | -0/+2
  No change in behaviour (hopefully).

  llvm-svn: 217233
* [AArch64] Add pass to enable additional comparison optimizations by CSE. | Jiangning Liu | 2014-09-05 | 4 | -0/+414
  Patched by Sergey Dmitrouk.

  This pass tries to make consecutive compares of values use the same operands to allow the CSE pass to remove duplicated instructions. For this it analyzes branches and adjusts comparisons with immediate values by converting:

    GE -> GT
    GT -> GE
    LT -> LE
    LE -> LT

  and adjusting immediate values appropriately. It basically corrects two immediate values towards each other to make them equal.

  llvm-svn: 217220
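The four conversions listed in this message rely on simple integer identities, for example x >= c being the same test as x > c - 1. The sketch below models only that arithmetic; `Cond`, `Compare`, `evaluate`, and `flipStrictness` are hypothetical names, not the pass's own data structures, and overflow at the extremes of the immediate range is assumed to be excluded by the caller.

```cpp
#include <cstdint>

enum class Cond { GE, GT, LT, LE };

struct Compare {
  Cond cond;
  int64_t imm;
};

// Evaluates "x <cond> imm".
constexpr bool evaluate(Compare c, int64_t x) {
  return c.cond == Cond::GE ? x >= c.imm
       : c.cond == Cond::GT ? x >  c.imm
       : c.cond == Cond::LT ? x <  c.imm
       :                      x <= c.imm;
}

// Rewrites a compare into the equivalent one with the opposite strictness,
// moving the immediate by one. Assumes imm +/- 1 does not overflow.
constexpr Compare flipStrictness(Compare c) {
  return c.cond == Cond::GE ? Compare{Cond::GT, c.imm - 1}
       : c.cond == Cond::GT ? Compare{Cond::GE, c.imm + 1}
       : c.cond == Cond::LT ? Compare{Cond::LE, c.imm - 1}
       :                      Compare{Cond::LT, c.imm + 1};
}

// "x > 9" becomes "x >= 10", so it can share an immediate with "x < 10".
static_assert(flipStrictness(Compare{Cond::GT, 9}).imm == 10,
              "GT 9 is rewritten to GE 10");
```

Once two neighbouring compares use the same register and the same immediate, the second compare instruction is redundant, which is presumably what lets CSE remove it.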
* X86: cpuid and xgetbv write to 32-bit registers, not 64-bit | Reid Kleckner | 2014-09-04 | 1 | -7/+3
  This fixes an issue where MS inline assembly containing xgetbv wouldn't be marked as clobbering EAX:EDX. Test for that forthcoming on the Clang side.

  llvm-svn: 217173
* AArch64: fix vector-immediate BIC/ORR on big-endian devices. | Tim Northover | 2014-09-04 | 2 | -12/+24
  Follow up to r217138, extending the logic to other NEON-immediate instructions. As before, the instruction already performs the correct operation and we're just using a different type for convenience, so we want a true nop-cast.

  Patch by Asiri Rathnayake.

  llvm-svn: 217159
* [mips] Rename MipsAsmParser functions to conform to the LLVM Coding Standards. No functional changes. | Toma Tabacu | 2014-09-04 | 4 | -64/+64
  Summary: There are still some functions which should be renamed, but they are inherited from the generic MC classes.
  Reviewers: dsanders
  Reviewed By: dsanders
  Differential Revision: http://reviews.llvm.org/D5068
  llvm-svn: 217145
* Silencing a usually-helpful-but-braindead-silly-in-this-case sign mismatch warning with MSVC. NFC. | Aaron Ballman | 2014-09-04 | 1 | -1/+1
  llvm-svn: 217143
* AArch64: fix big-endian immediate materialisation | Tim Northover | 2014-09-04 | 3 | -21/+82
  We were materialising big-endian constants using DAG nodes with types different from what was requested, followed by a bitcast. This is fine on little-endian machines where bitcasting is a nop, but we need a slightly different representation for big-endian. This adds a new set of NVCAST (natural-vector cast) operations which are always nops.

  Patch by Asiri Rathnayake.

  llvm-svn: 217138
* [x86] Teach the new v4i32 shuffle lowering some more tricks to recognize | Chandler Carruth | 2014-09-04 | 1 | -1/+61
  vzext patterns and insert-element patterns that for SSE4 have dedicated instructions.

  With this we can enable the experimental mode in a regression test that happens to cover some of the past set of issues. You can see that the new logic does significantly better here on the floating point cases.

  A follow-up to this change and the previous ones will hoist the logic into helpers so it can be shared across element type sizes as in this particular case it generalizes cleanly.

  llvm-svn: 217136
* Fixed compilation problem on Windows (initialization of non-aggregate type). | Elena Demikhovsky | 2014-09-04 | 1 | -6/+2
  After commit 217131.

  llvm-svn: 217134
* X86 Intrinsics table - changed to a static table sorted by intrinsic id. | Elena Demikhovsky | 2014-09-04 | 2 | -204/+226
  Used binary search over the tables.

  llvm-svn: 217131
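A static table sorted by id and searched with a binary search can be sketched as follows. The entry layout, the table contents, and the `lookup` function are all hypothetical placeholders, since the log does not show the real table's fields; only `std::lower_bound` is a real library call.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical entry: the sort key plus a placeholder payload.
struct IntrinsicEntry {
  uint16_t id;
  uint16_t opcode;
};

// Must stay sorted by id, otherwise the binary search is invalid.
static const IntrinsicEntry Table[] = {
    {3, 100}, {7, 101}, {12, 102}, {40, 103},
};

// Returns the matching entry, or nullptr when the id is absent.
inline const IntrinsicEntry *lookup(uint16_t id) {
  const IntrinsicEntry *end = std::end(Table);
  const IntrinsicEntry *it = std::lower_bound(
      std::begin(Table), end, id,
      [](const IntrinsicEntry &e, uint16_t key) { return e.id < key; });
  return (it != end && it->id == id) ? it : nullptr;
}
```

The lookup costs O(log n) comparisons and needs no dynamic initialization, at the price of keeping the table sorted whenever an entry is added.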
* [FastISel][AArch64] Cleanup and simplify 'fastSelectInstruction'. NFC. | Juergen Ributzka | 2014-09-04 | 1 | -75/+12
  llvm-svn: 217119
* [FastISel][AArch64] Add target-specific lowering for logical operations. | Juergen Ributzka | 2014-09-04 | 1 | -26/+163
  This change adds support for immediate and shift-left folding into logical operations.

  This fixes rdar://problem/18223183.

  llvm-svn: 217118
* [x86] Teach the new vector shuffle lowering about the zero masking | Chandler Carruth | 2014-09-04 | 1 | -22/+42
  abilities of INSERTPS which are really powerful and come up in very important contexts such as forming diagonal matrices, etc.

  With this I ended up being able to remove the somewhat weird helper I added for INSERTPS because we can collapse the entire state to a no-op mask. Added a bunch of tests for inserting into a zero-ish vector.

  llvm-svn: 217117
* R600/SI: Un-move pattern I forgot to remove in last commit | Matt Arsenault | 2014-09-03 | 1 | -5/+5
  llvm-svn: 217109
* R600/SI: Try to keep i32 mul on SALU | Matt Arsenault | 2014-09-03 | 2 | -7/+16
  Also fix a bug this exposed where, when legalizing an immediate operand, a v_mov_b32 would be created with a VSrc dest register.

  llvm-svn: 217108
* [x86] Teach the new vector shuffle lowering about the simplest of | Chandler Carruth | 2014-09-03 | 1 | -0/+29
  'insertps' patterns.

  This replaces two shuffles with a single insertps in very common cases. My next patch will extend this to leverage the zeroing capabilities of insertps which will allow it to be used in a much wider set of cases.

  llvm-svn: 217100
* [x86] Teach the asm comment printing to only print the clarification of | Chandler Carruth | 2014-09-03 | 4 | -49/+64
  an immediate operand when we don't have instruction-specific comments. This ensures that instruction-specific comments are attached to the same line as the instruction which is important for using them to write readable and maintainable tests. My next commit will add just such a test.

  llvm-svn: 217099
* Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction | Robin Morisset | 2014-09-03 | 4 | -32/+51
  Summary: Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs. Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving the X86 backend to this pass.

  This requires a way of easily detecting which instructions are atomic. I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc. in making isAtomic() a method of Instruction implemented by a switch on the opcodes.

  Test Plan: make check
  Reviewers: jfb
  Subscribers: mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D5035
  llvm-svn: 217080
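The "switch on the opcodes" pattern mentioned in this message can be illustrated with a toy instruction type. Everything below (`Opcode`, `Ordering`, `Inst`, and this `isAtomic`) is a hypothetical stand-in rather than the LLVM class; in particular, treating loads and stores as atomic only when they carry an ordering is an assumption of the sketch.

```cpp
enum class Opcode { Add, Load, Store, AtomicCmpXchg, AtomicRMW, Fence };
enum class Ordering { NotAtomic, Monotonic, Acquire, Release, SequentiallyConsistent };

struct Inst {
  Opcode opcode;
  Ordering ordering;  // only meaningful for Load and Store in this toy model
};

// One predicate, one switch: callers no longer need to cast to each
// instruction kind just to ask whether it is atomic.
inline bool isAtomic(const Inst &inst) {
  switch (inst.opcode) {
  case Opcode::AtomicCmpXchg:
  case Opcode::AtomicRMW:
  case Opcode::Fence:
    return true;  // always atomic in this model
  case Opcode::Load:
  case Opcode::Store:
    return inst.ordering != Ordering::NotAtomic;
  default:
    return false;
  }
}
```

A pass can then filter instructions with a single call and dispatch to per-kind handlers afterwards, which is the kind of cleanup the summary describes for runOnFunction.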
* Make some helpers static or move into the llvm namespace. | Benjamin Kramer | 2014-09-03 | 2 | -4/+3
  llvm-svn: 217077
* Use target-dependent emitLeading/TrailingFence instead of the target-independent insertLeading/TrailingFence (in AtomicExpandPass) | Robin Morisset | 2014-09-03 | 2 | -1/+63
  Fixes two latent bugs:
  - There was no fence inserted before expanded seq_cst load (unsound on Power)
  - There was only a fence release before seq_cst stores (again unsound, in particular on Power)

  It is not even clear if this is correct on ARM swift processors (where release fences are DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift as it is not clear whether it is incorrect. I would love to get documentation stating whether it is correct or not.

  These two bugs were not triggered because Power is not (yet) using this pass, and these behaviours happen to be (mostly?) working on ARM (although they completely butchered the semantics of the llvm IR).

  See: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html for an example of the problems that can be caused by the second of these bugs.

  I couldn't see a way of fixing these in a completely target-independent way without adding lots of unnecessary fences on ARM, hence the target-dependent parts of this patch.

  This patch implements the new target-dependent parts only for ARM (the default of not doing anything is enough for AArch64), other architectures will use this infrastructure in later patches.

  llvm-svn: 217076
* [FastISel][tblgen] Rename tblgen generated FastISel functions. NFC. | Juergen Ributzka | 2014-09-03 | 5 | -85/+85
  This is the final round of renaming. This changes tblgen to emit lower-case function names for FastEmitInst_* and FastEmit_*, and updates all its uses in the source code.

  Reviewed by Eric

  llvm-svn: 217075
* [FastISel] Rename public visible FastISel functions. NFC. | Juergen Ributzka | 2014-09-03 | 5 | -147/+147
  This commit renames the following public FastISel functions:

    LowerArguments -> lowerArguments
    SelectInstruction -> selectInstruction
    TargetSelectInstruction -> fastSelectInstruction
    FastLowerArguments -> fastLowerArguments
    FastLowerCall -> fastLowerCall
    FastLowerIntrinsicCall -> fastLowerIntrinsicCall
    FastEmitZExtFromI1 -> fastEmitZExtFromI1
    FastEmitBranch -> fastEmitBranch
    UpdateValueMap -> updateValueMap
    TargetMaterializeConstant -> fastMaterializeConstant
    TargetMaterializeAlloca -> fastMaterializeAlloca
    TargetMaterializeFloatZero -> fastMaterializeFloatZero
    LowerCallTo -> lowerCallTo

  Reviewed by Eric

  llvm-svn: 217074
* Remove resetSubtargetFeatures as it is unused. | Eric Christopher | 2014-09-03 | 6 | -64/+9
  llvm-svn: 217071
* Remove unnecessary getTarget call now that the subtarget is cached | Eric Christopher | 2014-09-03 | 2 | -5/+3
  on the machine function.

  llvm-svn: 217070
* [FastISel] Some long overdue spring cleaning of FastISel. | Juergen Ributzka | 2014-09-03 | 1 | -30/+30
  Things got a little bit messy over the years and it is time for a little bit of spring cleaning.

  This first commit is focused on the FastISel base class itself. It doxyfies all comments, C++11fies the code where it makes sense, renames internal methods to adhere to the coding standard, and clang-formats the files.

  Reviewed by Eric

  llvm-svn: 217060
* [FastISel][AArch64] Move unconditional branch handling into 'SelectBranch'. NFC. | Juergen Ributzka | 2014-09-03 | 1 | -9/+7
  llvm-svn: 217054
* R600/SI: Add a pattern for i64 and in a branch | Tom Stellard | 2014-09-03 | 1 | -0/+1
  llvm-svn: 217041
* R600/SI: Fix typos in SIInstrInfo::areLoadsFromSameBasePtr() | Tom Stellard | 2014-09-03 | 1 | -2/+2
  This fixes a crash in the OpenCV test: ImgprocWarpResizeArea/Resize.Mat/16

  There is no test case for this, because this failure depends on a specific ordering of the loads, which could easily change.

  llvm-svn: 217040
* Add override to overridden virtual methods, remove virtual keywords. | Benjamin Kramer | 2014-09-03 | 21 | -73/+50
  No functionality change. Changes made by clang-tidy + some manual cleanup.

  llvm-svn: 217028