path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [X86] Cleanup FCOPYSIGN lowering. NFC intended. | Ahmed Bougacha | 2014-12-05 | 1 | -29/+15
  llvm-svn: 223542
* Recommit of r223513 and r223514. | Kuba Brecka | 2014-12-05 | 1 | -34/+48
  Reviewed at http://reviews.llvm.org/D6488
  llvm-svn: 223532
* [Hexagon] Relocating logical instructions and templates later in the td file. | Colin LeMahieu | 2014-12-05 | 1 | -116/+115
  llvm-svn: 223523
* [Hexagon] Adding sub/and/or reg, imm forms | Colin LeMahieu | 2014-12-05 | 1 | -29/+56
  llvm-svn: 223522
* Remove dead code. We are only lazy about functions with bodies. | Rafael Espindola | 2014-12-05 | 1 | -7/+1
  llvm-svn: 223521
* Reverting r223513 and r223514. | Kuba Brecka | 2014-12-05 | 1 | -48/+34
  llvm-svn: 223520
* Optimize merging of scalar loads for 32-byte vectors [X86, AVX] | Sanjay Patel | 2014-12-05 | 1 | -1/+6
  Fix the poor codegen seen in PR21710 ( http://llvm.org/bugs/show_bug.cgi?id=21710 ). Before we crack 32-byte build vectors into smaller chunks (and then subsequently glue them back together), we should look for the easy case where we can just load all elements in a single op.

  An example of the codegen change is:

  From:
      vmovss 16(%rdi), %xmm1
      vmovups (%rdi), %xmm0
      vinsertps $16, 20(%rdi), %xmm1, %xmm1
      vinsertps $32, 24(%rdi), %xmm1, %xmm1
      vinsertps $48, 28(%rdi), %xmm1, %xmm1
      vinsertf128 $1, %xmm1, %ymm0, %ymm0
      retq
  To:
      vmovups (%rdi), %ymm0
      retq

  Differential Revision: http://reviews.llvm.org/D6536
  llvm-svn: 223518
* [DFSAN][MIPS][LLVM] Defining ShadowPtrMask variable for MIPS64 | Peter Collingbourne | 2014-12-05 | 1 | -1/+12
  Patch by Kumar Sukhani!
  corresponding compiler-rt patch: http://reviews.llvm.org/D6437
  clang patch: http://reviews.llvm.org/D6147
  Differential Revision: http://reviews.llvm.org/D6459
  llvm-svn: 223516
* [Hexagon] Updating mux_ir/ri/ii/rr with encoding bits | Colin LeMahieu | 2014-12-05 | 4 | -46/+78
  llvm-svn: 223515
* AddressSanitizer - Don't instrument globals from cstring_literals sections. (llvm part) | Kuba Brecka | 2014-12-05 | 1 | -34/+48
  Reviewed at http://reviews.llvm.org/D6488
  llvm-svn: 223513
* Simplify the loop linking function bodies. NFC. | Rafael Espindola | 2014-12-05 | 1 | -37/+21
  llvm-svn: 223512
* Use 32-bit ebp for NaCl64 in a limited case: llvm.frameaddress. | Jan Wen Voung | 2014-12-05 | 4 | -4/+14
  Summary:
  Follow up to [x32] "Use ebp/esp as frame and stack pointer": http://reviews.llvm.org/D4617

  In that earlier patch, NaCl64 was made to always use rbp. That's needed for most cases because rbp should hold a full 64-bit address within the NaCl sandbox so that load/stores off of rbp don't require sandbox adjustment (zeroing the top 32-bits, then filling those by adding r15).

  However, llvm.frameaddress returns a pointer and pointers are 32-bit for NaCl64. In this case, use ebp instead, which will make the register copy type check. A similar mechanism may be needed for llvm.eh.return, but is not added in this change.

  Test Plan: test/CodeGen/X86/frameaddr.ll
  Reviewers: dschuff, nadav
  Subscribers: jfb, llvm-commits
  Differential Revision: http://reviews.llvm.org/D6514
  llvm-svn: 223510
* [PowerPC]Add VSX loads/stores to fastisel for PPC target | Bill Seurer | 2014-12-05 | 1 | -4/+36
  This patch adds VSX floating point loads and stores to fastisel. Along with the change to tablegen (D6220), VSX instructions are now fully supported in fastisel.
  http://reviews.llvm.org/D6274
  llvm-svn: 223507
* [Hexagon] Adding tfrih/l instructions. | Colin LeMahieu | 2014-12-05 | 1 | -0/+22
  llvm-svn: 223506
* [X86] Improved lowering of packed vector shifts to vpsllq/vpsrlq. | Andrea Di Biagio | 2014-12-05 | 1 | -10/+17
  SSE2/AVX non-constant packed shift instructions only use the lower 64-bit of the shift count.

  This patch teaches function 'getTargetVShiftNode' how to deal with shifts where the shift count node is of type MVT::i64.

  Before this patch, function 'getTargetVShiftNode' only knew how to deal with shift count nodes of type MVT::i32. This forced the backend to wrongly truncate the shift count to MVT::i32, and then zero-extend it back to MVT::i64.
  llvm-svn: 223505
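The truncate-then-zero-extend round trip described in r223505 can be sketched in plain C++. This is only an illustrative model, not the backend code: `shiftLeftLane` and `truncThenZext` are hypothetical names, and the sketch assumes the documented PSLLQ behaviour that a count above 63 zeroes the lane. The commit does not say whether such large counts are reachable in practice; the point is only that the round trip is not value-preserving for a 64-bit count.

```cpp
#include <cstdint>

// Hypothetical model of one 64-bit lane of a PSLLQ-style shift:
// the instruction reads a 64-bit count, and counts above 63 yield zero.
constexpr std::uint64_t shiftLeftLane(std::uint64_t lane, std::uint64_t count) {
  return count > 63 ? 0 : lane << count;
}

// Hypothetical model of the old path: truncate the count to 32 bits,
// then zero-extend it back to 64 bits. High bits of the count are lost.
constexpr std::uint64_t truncThenZext(std::uint64_t count) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(count));
}

// With the full 64-bit count the lane is shifted out entirely.
static_assert(shiftLeftLane(1, std::uint64_t{1} << 32) == 0,
              "full 64-bit count shifts everything out");
// After the round trip the count becomes 0 and the lane is unchanged.
static_assert(shiftLeftLane(1, truncThenZext(std::uint64_t{1} << 32)) == 1,
              "truncated count behaves like a shift by zero");
```

Passing the i64 count through unchanged avoids both the extra conversion nodes and this discrepancy.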
* [Hexagon] Adding add reg, imm form with encoding bits and test. | Colin LeMahieu | 2014-12-05 | 1 | -42/+80
  llvm-svn: 223504
* Remove unused arguments. NFC. | Rafael Espindola | 2014-12-05 | 1 | -9/+7
  llvm-svn: 223503
* These two calls were grabbing the same register info. Unify them. | Eric Christopher | 2014-12-05 | 1 | -3/+2
  llvm-svn: 223502
* BFI: Saturate when combining edges to a successor | Duncan P. N. Exon Smith | 2014-12-05 | 1 | -4/+17
  When a loop gets bundled up, its outgoing edges are quite large, and can just barely overflow 64-bits. If one successor has multiple incoming edges -- and that successor is getting all the incoming mass -- combining just its edges can overflow. Handle that by saturating rather than asserting.

  This fixes PR21622.
  llvm-svn: 223500
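As a rough illustration of "saturating rather than asserting" in r223500, here is a minimal sketch of a saturating 64-bit add. The name `saturatingAdd` is hypothetical and this is not the code from the BFI implementation; it only shows the general technique of clamping to the maximum value when unsigned addition wraps.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: add two unsigned 64-bit weights and clamp the
// result to UINT64_MAX instead of wrapping around on overflow.
constexpr std::uint64_t saturatingAdd(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t sum = a + b;  // unsigned add wraps modulo 2^64
  // If the add wrapped, the result is smaller than either operand.
  return sum < a ? std::numeric_limits<std::uint64_t>::max() : sum;
}

static_assert(saturatingAdd(2, 3) == 5, "ordinary sums are unchanged");
static_assert(saturatingAdd(std::numeric_limits<std::uint64_t>::max(), 1) ==
                  std::numeric_limits<std::uint64_t>::max(),
              "overflowing sums clamp to the maximum");
```

Clamping loses a little precision in the combined weight, but keeps the combined weight bounded instead of tripping an assertion.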
* [Hexagon] Adding DoubleRegs decoder. Moving C2_mux and A2_nop. Adding combine imm-imm form. | Colin LeMahieu | 2014-12-05 | 3 | -10/+60
  llvm-svn: 223494
* Fix a bug when pretty-printing DW_OP_deref. | Adrian Prantl | 2014-12-05 | 1 | -0/+3
  llvm-svn: 223493
* [CodeGenPrepare] Use variables for reused values. NFC. | Ahmed Bougacha | 2014-12-05 | 1 | -4/+6
  llvm-svn: 223491
* [Hexagon] [NFC] Rearranging patterns and mux instruction. | Colin LeMahieu | 2014-12-05 | 1 | -38/+38
  llvm-svn: 223488
* [Hexagon] [NFC] Rearranging def order. | Colin LeMahieu | 2014-12-05 | 1 | -28/+27
  llvm-svn: 223487
* Refactor duplicated code. NFC. | Rafael Espindola | 2014-12-05 | 1 | -26/+16
  llvm-svn: 223486
* [Hexagon] Adding combine reg-reg forms. | Colin LeMahieu | 2014-12-05 | 1 | -1/+14
  llvm-svn: 223485
* [Hexagon] Marking several instructions as isCodeGenOnly=0 and adding direct disassembly tests for many instructions. | Colin LeMahieu | 2014-12-05 | 1 | -2/+3
  llvm-svn: 223482
* LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps. | Benjamin Kramer | 2014-12-05 | 2 | -56/+24
  Required some APInt massaging to get proper empty/tombstone values. Apart from making the code a bit simpler this also reduces the bucket size of the ConstantInt map from 32 to 24 bytes.
  llvm-svn: 223478
* Small cleanup on how we clear constant variables. NFC. | Rafael Espindola | 2014-12-05 | 1 | -14/+9
  llvm-svn: 223474
* Use an early return. NFC. | Rafael Espindola | 2014-12-05 | 1 | -19/+19
  llvm-svn: 223470
* [msan] Avoid extra origin address realignment. | Evgeniy Stepanov | 2014-12-05 | 1 | -21/+24
  Do not realign origin address if the corresponding application address is at least 4-byte-aligned.

  Saves 2.5% code size in track-origins mode.
  llvm-svn: 223464
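The idea behind r223464 can be sketched as follows. This is a runtime model with hypothetical names (`alignOriginAddress`, `appAlignment`), whereas the real pass makes the decision at instrumentation time by choosing whether to emit the masking instruction at all. The sketch also assumes that the origin address inherits the low bits of the application address, so that a 4-byte-aligned application address already yields a 4-byte-aligned origin address.

```cpp
#include <cstdint>

// Hypothetical model: origin slots are 4 bytes wide, so an origin address
// must be rounded down to a multiple of 4 before it is used. When the
// application access is already known to be at least 4-byte-aligned, the
// masking step is redundant and can be skipped.
constexpr std::uintptr_t alignOriginAddress(std::uintptr_t originAddr,
                                            unsigned appAlignment) {
  if (appAlignment >= 4)
    return originAddr;  // already aligned under the stated assumption
  return originAddr & ~std::uintptr_t{3};  // round down to a 4-byte boundary
}

static_assert(alignOriginAddress(0x1003, 1) == 0x1000, "byte access is realigned");
static_assert(alignOriginAddress(0x1004, 4) == 0x1004, "aligned access is untouched");
```

Each skipped mask is one fewer instruction per instrumented access, which is consistent with the code-size saving the commit reports.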
* [X86] Avoid introducing extra shuffles when lowering packed vector shifts. | Andrea Di Biagio | 2014-12-05 | 1 | -15/+12
  When lowering a vector shift node, the backend checks if the shift count is a shuffle with a splat mask. If so, then it introduces an extra dag node to extract the splat value from the shuffle. The splat value is then used to generate a shift count of a target specific shift.

  However, if we know that the shift count is a splat shuffle, we can use the splat index 'I' to extract the I-th element from the first shuffle operand. The advantage is that the splat shuffle may become dead since we no longer use it.

  Example:
  ;;
  define <4 x i32> @example(<4 x i32> %a, <4 x i32> %b) {
    %c = shufflevector <4 x i32> %b, <4 x i32> undef, <4 x i32> zeroinitializer
    %shl = shl <4 x i32> %a, %c
    ret <4 x i32> %shl
  }
  ;;

  Before this patch, llc generated the following code (-mattr=+avx):
    vpshufd $0, %xmm1, %xmm1 # xmm1 = xmm1[0,0,0,0]
    vpxor %xmm2, %xmm2
    vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
    vpslld %xmm1, %xmm0, %xmm0
    retq

  With this patch, the redundant splat operation is removed from the code.
    vpxor %xmm2, %xmm2
    vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
    vpslld %xmm1, %xmm0, %xmm0
    retq
  llvm-svn: 223461
* Add missing FP build attribute tests. | Charlie Turner | 2014-12-05 | 1 | -0/+2
  The test file test/CodeGen/ARM/build-attributes.ll was missing several floating-point build attribute tests. The intention of this commit is that for each CPU / architecture currently tested, there are now tests that make sure the following attributes are sufficiently checked,

  * Tag_ABI_FP_rounding
  * Tag_ABI_FP_denormal
  * Tag_ABI_FP_exceptions
  * Tag_ABI_FP_user_exceptions
  * Tag_ABI_FP_number_model

  Also in this commit, the -unsafe-fp-math flag has been augmented with the full suite of flags Clang sends to LLVM when you pass -ffast-math to Clang. That is, `-unsafe-fp-math' has been changed to `-enable-unsafe-fp-math -disable-fp-elim -enable-no-infs-fp-math -enable-no-nans-fp-math -fp-contract=fast'

  Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc
  llvm-svn: 223454
* Revert "r223440 - Consider subregs when calling MI::registerDefIsDead for phys deps" | Hal Finkel | 2014-12-05 | 1 | -7/+1
  Reverting this because, while it fixes the problem in the reduced test case, it does not fix the problem in the full test case from the bug report.
  llvm-svn: 223442
* Consider subregs when calling MI::registerDefIsDead for phys deps | Hal Finkel | 2014-12-05 | 1 | -1/+7
  The scheduling dependency graph is built bottom-up within each scheduling region, and ScheduleDAGInstrs::addPhysRegDeps is called to add output/anti dependencies, based on physical registers, to the SUs for instructions based on those that come before them.

  In the test case, we start before post-RA scheduling with a block that looks like this:

    ...
    INLINEASM <... andc $0,$0,$2 stdcx. $0,0,$3 bne- 1b > [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef-ec:G8RC], %X6<earlyclobber,def,dead>, $1:[mem], %X3<kill>, $2:[reguse:G8RC], %X5<kill>, $3:[reguse:G8RC], %X3, $4:[mem], %X3, $5:[clobber], %CC<earlyclobber,imp-def,dead>, <<badref>>
    ...
    %X4<def,dead> = ANDIo8 %X4<kill>, 1, %CR0<imp-def,dead>, %CR0GT<imp-def>
    ...
    %R29<def> = ISEL %R3<undef>, %R4<kill>, %CR0GT<kill>

  where it is relevant that %CC is an alias to %CR0, and that %CR0GT is a subregister of %CR0. However, for post-RA scheduling, no dependency was added to prevent the INLINEASM from being scheduled in between the ANDIo8 and the ISEL (which communicate via the %CR0GT register).

  In ScheduleDAGInstrs::addPhysRegDeps, when called for the %CC operand, we'd iterate over all of its aliases (which include %CC itself and also %CR0), and look for previously-encountered defs of those registers. We'd find the ANDIo8, but decide not to add a dependency between the INLINEASM and the ANDIo8 because both the INLINEASM's def of %CC is dead, and also the ANDIo8 def of %CR0 is dead. This ignores, however, that ANDIo8 has a non-dead def of %CR0GT, a subregister of %CR0, and thus a dependency still must exist.

  To fix this problem, when calling registerDefIsDead on the SU with the def, we also check all subregisters for possible non-dead defs, and add the dependency if any are found.

  Fixes PR21742.
  llvm-svn: 223440
* IR: Stop relying on GetStringMapEntryFromValue() | Duncan P. N. Exon Smith | 2014-12-05 | 1 | -1/+3
  It relies on undefined behaviour.
  llvm-svn: 223438
* Cleanup: Calls to getDwarfRegNum() may actually fail, if there is no DWARF register number mapping, or if the register was a virtual register that was never materialized. | Adrian Prantl | 2014-12-05 | 3 | -27/+44
  Previously, we would just emit a bogus location, after this patch we don't emit a location at all by doing an early exit.

  After my bugfix in r223401 today, this doesn't actually happen on any target that I tested this with, but it's still preferable to make the possibility of a failure explicit.
  llvm-svn: 223428
* linkGlobalVariableProto never returns null. Simplify the caller. NFC. | Rafael Espindola | 2014-12-05 | 1 | -6/+3
  llvm-svn: 223424
* Rename the x86 isTargetMacho to isTargetMachO for uniformity. | Eric Christopher | 2014-12-05 | 4 | -8/+8
  llvm-svn: 223421
* Both of these subtargets have functions that check whether or not the target is mach-o. Use them. | Eric Christopher | 2014-12-05 | 2 | -3/+2
  llvm-svn: 223420
* Move merging of alignment to a central location. NFC. | Rafael Espindola | 2014-12-05 | 1 | -19/+3
  llvm-svn: 223418
* [X86] Delete dead code in fcopysign lowering. NFC. | Ahmed Bougacha | 2014-12-04 | 1 | -11/+0
  r32900 introduced custom lowering for fcopysign, with two checks to change the magnitude value's type if it's larger/smaller than the sign value's type. r32932 replaced that code for the smaller case. r43205 did the same for the larger case, but left the old code, now dead.
  llvm-svn: 223415
* Simplify implementation and testcase of r223401 based on feedback from dblaikie. | Adrian Prantl | 2014-12-04 | 1 | -4/+2
  llvm-svn: 223405
* Debug info: If the RegisterCoalescer::reMaterializeTrivialDef() is eliminating all uses of a vreg, update any DBG_VALUE describing that vreg to point to the rematerialized register instead. | Adrian Prantl | 2014-12-04 | 1 | -1/+13
  llvm-svn: 223401
* Add a FIXME as requested by Renato Golin. | Roman Divacky | 2014-12-04 | 1 | -0/+3
  llvm-svn: 223390
* Silence warning: variable 'buffer' set but not used. | Yaron Keren | 2014-12-04 | 1 | -3/+5
  llvm-svn: 223389
* [x86] Fix isOffsetSuitableForCodeModel kernel code model offset | Bruno Cardoso Lopes | 2014-12-04 | 1 | -1/+1
  Offset == 0 is a valid offset for kernel code model according to the x86_64 System V ABI. Found by inspection, no testcase.
  llvm-svn: 223383
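To make the boundary case in r223383 concrete, here is a simplified sketch of an offset check for the kernel code model. The function names are hypothetical and the body is an assumption about the shape of such a check, not a copy of isOffsetSuitableForCodeModel: it assumes the offset must fit in a signed 32-bit immediate and must not be negative. A strict `> 0` comparison in the same position would reject an offset of zero, which is the case the commit message says must be accepted.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if v is representable as a signed 32-bit value.
constexpr bool fitsInt32(std::int64_t v) {
  return v >= std::numeric_limits<std::int32_t>::min() &&
         v <= std::numeric_limits<std::int32_t>::max();
}

// Simplified, assumed form of the kernel-model rule: non-negative offsets
// that fit in 32 bits are accepted, and zero is included.
constexpr bool isKernelOffsetOk(std::int64_t offset) {
  return fitsInt32(offset) && offset >= 0;
}

static_assert(isKernelOffsetOk(0), "zero is a valid offset");
static_assert(!isKernelOffsetOk(-1), "negative offsets are rejected");
```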
* [AArch64] Combining Load and IntToFp should check for neon availability | Weiming Zhao | 2014-12-04 | 1 | -3/+4
  llvm-svn: 223382
* Fix yet another unseen regression caused by r223113 | Asiri Rathnayake | 2014-12-04 | 1 | -14/+26
  r223113 added support for ARM modified immediate assembly syntax. Which assumes all immediate operands are prefixed with a '#'. This assumption is wrong as per the ARMARM - which recommends that all '#' characters be treated optional. The current patch fixes this regression and adds a test case. A follow-up patch will expand the test coverage to other instructions.
  llvm-svn: 223381
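The "treat '#' as optional" rule from r223381 can be illustrated with a small standalone parser. This is a hedged sketch, not the ARM assembly parser: `parseImmediate` is a hypothetical name, it handles only plain decimal integers, and it uses C++17 `std::from_chars` rather than any LLVM API.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse a decimal immediate operand, accepting it
// with or without a leading '#'. Returns std::nullopt on malformed input.
std::optional<long> parseImmediate(std::string_view text) {
  if (!text.empty() && text.front() == '#')
    text.remove_prefix(1);  // the prefix is optional, so just skip it

  long value = 0;
  const char *first = text.data();
  const char *last = text.data() + text.size();
  const auto [ptr, ec] = std::from_chars(first, last, value);

  // Reject empty input, non-numeric input, and trailing garbage.
  if (ec != std::errc{} || ptr != last)
    return std::nullopt;
  return value;
}
```

With this shape, `parseImmediate("#42")` and `parseImmediate("42")` both yield 42, while a bare `"#"` is rejected.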
* Fix thumbv4t indirect calls | Jonathan Roelofs | 2014-12-04 | 2 | -11/+50
  So there are a couple of issues with indirect calls on thumbv4t. First, the most 'obvious' instruction, 'blx' isn't available until v5t. And secondly, the next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code because the saved off pc has its thumb bit cleared, so when the callee returns we end up in ARM mode.... yuck.

  The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it.

  We could cut down on code size by sharing the landing pads between call sites that are close enough, but for the moment let's do correctness first and look at performance later.

  Patch by: Iain Sandoe
  http://reviews.llvm.org/D6519
  llvm-svn: 223380