summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* GlobalISel: mark pointer stores as legal on AArch64.Tim Northover2016-09-141-1/+1
| | | | llvm-svn: 281448
* This reapplies r281304. The issue was that I had missedSjoerd Meijer2016-09-145-49/+52
| | | | | | to copy the new isAdd field in the tablegen data structure. llvm-svn: 281447
* AVX-512: Fixed a bug in kortest.z intrinsicElena Demikhovsky2016-09-141-1/+1
| | | | | | Lowering was wrong - X86ISD::SETCC node should return i8 type. llvm-svn: 281446
* [AVX512BW] Change truncStore action (v16i16->v16i18). It can be legal only ↵Igor Breger2016-09-141-2/+3
| | | | | | | | with AVX512VL. Differential Revision: http://reviews.llvm.org/D24547 llvm-svn: 281445
* [X86] Remove the VCVTSI2SD32 with rounding intrinsic. It's not used by clang ↵Craig Topper2016-09-141-1/+0
| | | | | | and not needed since 32-bit integer to double is always exact. llvm-svn: 281442
* [Hexagon] Better handling of HVX vector loweringKrzysztof Parzyszek2016-09-132-4/+17
| | | | | | | - Expand SELECT_CC and BR_CC for vector types. - Implement TLI::isShuffleMaskLegal. llvm-svn: 281397
* AArch64: Cleanup tailcall CC check, enable swiftcc.Matthias Braun2016-09-132-14/+20
| | | | | | | | | | | | | Cleanup/change the code that checks for possible tailcall conventions to look the same as the one in the X86 target. This makes the distinction between calling conventions that can guarnatee tailcalls and the ones that may tailcall more obvious. - Add Swift to the mayTailCall list - PreserveMost seemed to be incorrectly part of the guarnteed tail call list, move it to the mayTailCall list. llvm-svn: 281376
* AMDGPU: Remove code I think is deadMatt Arsenault2016-09-131-27/+3
| | | | | | | | As far as I can tell, resolveFrameIndex is supposed to be called with a legal offset, so inserting an add shouldn't be necessary. llvm-svn: 281372
* AMDGPU: Support commuting a FrameIndex operandMatt Arsenault2016-09-131-9/+16
| | | | llvm-svn: 281369
* Revert r281336 (and r281337), it caused PR30372.Nico Weber2016-09-137-106/+223
| | | | llvm-svn: 281361
* [Myriad]: set LeonCASA processor featureDouglas Katzman2016-09-131-6/+6
| | | | llvm-svn: 281359
* [Hexagon] Clear the flow queue after visiting a single instructionKrzysztof Parzyszek2016-09-131-0/+5
| | | | llvm-svn: 281339
* Defer asm errors to post-statement failureNirav Dave2016-09-137-223/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recommitting after fixing AsmParser Initialization. Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281336
* Revert "[ARM] Promote small global constants to constant pools"James Molloy2016-09-134-144/+1
| | | | | | This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656 llvm-svn: 281327
* [ARM] Add ".code 32" to functions in the ARM instruction setPablo Barrio2016-09-131-1/+2
| | | | | | | | | | | | | | | | | | | Before, only Thumb functions were marked as ".code 16". These ".code x" directives are effective until the next directive of its kind is encountered. Therefore, in code with interleaved ARM and Thumb functions, it was possible to declare a function as ARM and end up with a Thumb function after assembly. A test has been added. An existing test has also been fixed to take this change into account. Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24337 llvm-svn: 281324
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlyJames Molloy2016-09-132-5/+138
| | | | | | | | | | | | | For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281323
* [ARM] Support ldr.w in pseudo instruction ldr rd,=immediatePeter Smith2016-09-132-0/+7
| | | | | | | | | | | | The changes made in r269352, r269353 and r269354 to support the transformation of the ldr rd,=immediate to mov introduced a regression from 3.8 (ldr.w rd, =immediate) not supported. This change puts support back in for ldr.w by means of a t2InstAlias for the .w form. The .w is ignored in ARM state and propagated to the ldr in Thumb2. llvm-svn: 281319
* [ARM] Promote small global constants to constant poolsJames Molloy2016-09-134-1/+144
| | | | | | | | | | | | | | | | | | | | | | | | If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281314
* Revert of r281304 as it is causing build bot failures in hexagonSjoerd Meijer2016-09-135-52/+49
| | | | | | | hwloop regression tests. These tests pass locally; will be investigating where these differences come from. llvm-svn: 281306
* This adds a new field isAdd to MCInstrDesc. The ARM and Hexagon instructionSjoerd Meijer2016-09-135-49/+52
| | | | | | | | | | | descriptions now tag add instructions, and the Hexagon backend is using this to identify loop induction statements. Patch by Sam Parker and Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D23601 llvm-svn: 281304
* AVX-512: Fix for PR28175 - Scalar code optimization.Elena Demikhovsky2016-09-132-7/+17
| | | | | | | | | Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32. Optimization of this patterns is a one more step towards i1 optimization on AVX-512. Differential Revision: https://reviews.llvm.org/D24456 llvm-svn: 281302
* [AArch64] Support stackmap/patchpoint in getInstSizeInBytesDiana Picus2016-09-131-4/+21
| | | | | | | | | | | | | | | | | We currently return 4 for stackmaps and patchpoints, which is very optimistic and can in rare cases cause the branch relaxation pass to fail to relax certain branches. This patch causes getInstSizeInBytes to return a pessimistic estimate of the size as the number of bytes requested in the stackmap/patchpoint. In the future, we could provide a more accurate estimate by sharing some of the logic in AArch64::LowerSTACKMAP/PATCHPOINT. Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750 Differential Revision: https://reviews.llvm.org/D24073 llvm-svn: 281301
* [X86] Remove masked shufpd/shufps intrinsics and autoupgrade to native ↵Craig Topper2016-09-131-12/+0
| | | | | | vector shuffles. They were removed from clang previously but accidentally left in the backend. llvm-svn: 281300
* X86: Conditional tail calls should not have isBarrier = 1Hans Wennborg2016-09-131-18/+31
| | | | | | | | | | That confuses e.g. machine basic block placement, which then doesn't realize that control can fall through a block that ends with a conditional tail call. Instead, isBranch=1 should be set. Also, mark EFLAGS as used by these instructions. llvm-svn: 281281
* Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's ↵Eric Christopher2016-09-137-106/+223
| | | | | | | | causing errors on the sanitizer bots. This reverts commit r281249. llvm-svn: 281280
* Revert r281215, it caused PR30358.Nico Weber2016-09-122-139/+5
| | | | llvm-svn: 281263
* [MC] Defer asm errors to post-statement failureNirav Dave2016-09-127-223/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281249
* AMDGPU: Do not clobber SCC in SIWholeQuadModeNicolai Haehnle2016-09-122-75/+210
| | | | | | | | | | Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22198 llvm-svn: 281230
* Revert "[ARM] Promote small global constants to constant pools"James Molloy2016-09-124-123/+1
| | | | | | This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625 llvm-svn: 281228
* [X86] Copy imp-uses when folding tailcall into conditional branch.Ahmed Bougacha2016-09-121-1/+1
| | | | | | | | | | | r280832 added 32-bit support for emitting conditional tail-calls, but dropped imp-used parameter registers. This went unnoticed until r281113, which added 64-bit support, as this is only exposed with parameter passing via registers. Don't drop the imp-used parameters. llvm-svn: 281223
* [AMDGPU] Assembler: Move disabled SDWA and DPP instruction into Disable asm ↵Sam Kolton2016-09-122-0/+12
| | | | | | | | | | | | | | variant Summary: This removes disabled instructions from match tables so we will not match them at all. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: wdng, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D24452 llvm-svn: 281216
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlyJames Molloy2016-09-122-5/+139
| | | | | | | | | | | | | For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281215
* [ARM] Promote small global constants to constant poolsJames Molloy2016-09-124-1/+123
| | | | | | | | | | | | | | | | | | | | | | | | If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281213
* Fix WebAssembly broken build related to interface change in r281172.Eric Liu2016-09-121-2/+1
| | | | | | | | | | Reviewers: bkramer Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D24449 llvm-svn: 281201
* CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MIDuncan P. N. Exon Smith2016-09-118-39/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that MachineBasicBlock::reverse_instr_iterator knows when it's at the end (since r281168 and r281170), implement MachineBasicBlock::reverse_iterator directly on top of an ilist::reverse_iterator by adding an IsReverse template parameter to MachineInstrBundleIterator. This replaces another hard-to-reason-about use of std::reverse_iterator on list iterators, matching the changes for ilist::reverse_iterator from r280032 (see the "out of scope" section at the end of that commit message). MachineBasicBlock::reverse_iterator now has a handle to the current node and has obvious invalidation semantics. r280032 has a more detailed explanation of how list-style reverse iterators (invalidated when the pointed-at node is deleted) are different from vector-style reverse iterators like std::reverse_iterator (invalidated on every operation). A great motivating example is this commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp. Note: If your out-of-tree backend deletes instructions while iterating on a MachineBasicBlock::reverse_iterator or converts between MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator, you'll need to update your code in similar ways to r280032. The following table might help: [Old] ==> [New] delete &*RI, RE = end() delete &*RI++ RI->erase(), RE = end() RI++->erase() reverse_iterator(I) std::prev(I).getReverse() reverse_iterator(I) ++I.getReverse() --reverse_iterator(I) I.getReverse() reverse_iterator(std::next(I)) I.getReverse() RI.base() std::prev(RI).getReverse() RI.base() ++RI.getReverse() --RI.base() RI.getReverse() std::next(RI).base() RI.getReverse() (For more details, have a look at r280032.) llvm-svn: 281172
* [AVX512] Fix pattern for vgetmantsd and all other instructions that use same ↵Igor Breger2016-09-111-8/+1
| | | | | | | | class. Fix memory operand size, remove unnecessary pattern. Differential Revision: http://reviews.llvm.org/D24443 llvm-svn: 281164
* [AVX-512] Add VPTERNLOG to load folding tables.Craig Topper2016-09-111-0/+18
| | | | llvm-svn: 281156
* [X86] Make a helper method into a static function local to the cpp file.Craig Topper2016-09-112-11/+10
| | | | llvm-svn: 281154
* [NVPTX] Use ldg for explicitly invariant loads.Justin Lebar2016-09-111-13/+22
| | | | | | | | | | | | | | | | | | Summary: With this change (plus some changes to prevent !invariant from being clobbered within llvm), clang will be able to model the __ldg CUDA builtin as an invariant load, rather than as a target-specific llvm intrinsic. This will let the optimizer play with these loads -- specifically, we should be able to vectorize them in the load-store vectorizer. Reviewers: tra Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc Differential Revision: https://reviews.llvm.org/D23477 llvm-svn: 281152
* [CodeGen] Split out the notions of MI invariance and MI dereferenceability.Justin Lebar2016-09-1111-41/+67
| | | | | | | | | | | | | | | | | | | Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151
* We also need to pass swifterror in R12 under swiftcc not only under cccArnold Schwaighofer2016-09-101-0/+3
| | | | | | rdar://28190687 llvm-svn: 281138
* [AMDGPU] Refactor MUBUF/MTBUF instructionsValery Pykhtin2016-09-106-1168/+1306
| | | | | | Differential revision: https://reviews.llvm.org/D24295 llvm-svn: 281137
* [WebAssembly] Fix typos in commentsHeejin Ahn2016-09-101-11/+14
| | | | llvm-svn: 281131
* AMDGPU: Implement is{LoadFrom|StoreTo}FrameIndexMatt Arsenault2016-09-106-21/+90
| | | | llvm-svn: 281128
* AMDGPU: Fix scheduling info for spill pseudosMatt Arsenault2016-09-101-2/+3
| | | | | | | These defaulted to Write32Bit. I don't think this actually matters since these don't exist during scheduling. llvm-svn: 281127
* [CodeGen] Rename MachineInstr::isInvariantLoad to ↵Justin Lebar2016-09-102-2/+2
| | | | | | | | | | | | | | | | | | | | isDereferenceableInvariantLoad. NFC Summary: I want to separate out the notions of invariance and dereferenceability at the MI level, so that they correspond to the equivalent concepts at the IR level. (Currently an MI load is MI-invariant iff it's IR-invariant and IR-dereferenceable.) First step is renaming this function. Reviewers: chandlerc Subscribers: MatzeB, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D23370 llvm-svn: 281125
* AMDGPU: Fix immediate folding logic when shrinking instructionsMatt Arsenault2016-09-093-16/+10
| | | | | | | | | | If the literal is being folded into src0, it doesn't matter if it's an SGPR because it's being replaced with the literal. Also fixes initially selecting 32-bit versions of some instructions which also confused commuting. llvm-svn: 281117
* X86: Fold tail calls into conditional branches also for 64-bit (PR26302)Hans Wennborg2016-09-094-12/+40
| | | | | | | | | This extends the optimization in r280832 to also work for 64-bit. The only quirk is that we can't do this for 64-bit Windows (yet). Differential Revision: https://reviews.llvm.org/D24423 llvm-svn: 281113
* AMDGPU: Run LoadStoreVectorizer pass by defaultMatt Arsenault2016-09-092-1/+4
| | | | llvm-svn: 281112
* [X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targetsSimon Pilgrim2016-09-091-5/+5
| | | | | | Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets llvm-svn: 281105
OpenPOWER on IntegriCloud