path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [ARM] Do not test for CPUs, use SubtargetFeatures (Part 1). NFCI | Diana Picus | 2016-06-23 | 6 | -25/+91
  This is a cleanup commit similar to r271555, but for ARM. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. Since the ARM backend seems to have quite a lot of calls to these methods, I intend to submit 5-6 subtarget features at a time, instead of one big lump.
  Differential Revision: http://reviews.llvm.org/D21432
  llvm-svn: 273544
* [AVX512] Remove masked unpack intrinsics and autoupgrade to vectorshuffle and selects. | Craig Topper | 2016-06-23 | 2 | -72/+10
  llvm-svn: 273543
* [X86] Add assert to ensure only 128-bit vector types are used. 256 or 512-bit would require lane handling which is missing. | Craig Topper | 2016-06-23 | 1 | -0/+2
  llvm-svn: 273542
* Use C++ comments for large block comment. | Eric Christopher | 2016-06-23 | 1 | -16/+17
  llvm-svn: 273526
* AMDGPU: readlane/writelane do not read exec | Matt Arsenault | 2016-06-23 | 2 | -2/+26
  llvm-svn: 273525
* Revert r273456, "Preserve DebugInfo when replacing values in DAGCombiner" as it caused pr28270. | Peter Collingbourne | 2016-06-23 | 2 | -2/+4
  llvm-svn: 273518
* [codeview] Add EFLAGS to the list of CodeView register numbers | Reid Kleckner | 2016-06-22 | 1 | -1/+3
  llvm-svn: 273516
* AMDGPU: Fix liveness when expanding m0 loop | Matt Arsenault | 2016-06-22 | 2 | -23/+67
  llvm-svn: 273514
* Prune some includes from headers and sink some inline functions | Reid Kleckner | 2016-06-22 | 4 | -0/+4
  MCSymbol.h shouldn't pull in MCAssembler.h, just MCFragment.h. MCLinkerOptimizationHint.h shouldn't need MCMachObjectWriter.h. The rest is fixing the fallout.
  llvm-svn: 273507
* Use shouldAssumeDSOLocal. | Rafael Espindola | 2016-06-22 | 1 | -5/+5
  With this it handles -fPIE.
  llvm-svn: 273499
* Extract a few variables to make 'if' smaller. NFC. | Rafael Espindola | 2016-06-22 | 1 | -7/+8
  llvm-svn: 273497
* AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32 | Changpeng Fang | 2016-06-22 | 1 | -0/+12
  Reviewers: tstellarAMD, arsenm
  Differential Revision: http://reviews.llvm.org/D21533
  llvm-svn: 273496
* AMDGPU: Run verifier after 2nd run of SIShrinkInstructions | Matt Arsenault | 2016-06-22 | 1 | -1/+1
  llvm-svn: 273469
* AMDGPU: Fix verifier errors in SILowerControlFlow | Matt Arsenault | 2016-06-22 | 10 | -133/+217
  The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking.
  Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return.
  llvm-svn: 273467
* [Hexagon] Add SDAG preprocessing step to expose shifted addressing modes | Krzysztof Parzyszek | 2016-06-22 | 1 | -1/+54
  Transform:
    (store ch addr (add x (add (shl y c) e)))
  to:
    (store ch addr (add x (shl (add y d) c))),
  where e = (shl d c) for some integer d.
  The purpose of this is to enable generation of loads/stores with shifted addressing mode, i.e. mem(x+y<<#c). For that, the shift value c must be 0, 1 or 2.
  llvm-svn: 273466
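The rewrite above relies on the identity x + ((y << c) + e) == x + ((y + d) << c) whenever e == d << c. The following is a minimal sketch of that arithmetic on 32-bit unsigned values, not the Hexagon backend code. The names `addrBefore`, `addrAfter`, and `splitOffset` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Address as written before the transform: x + ((y << c) + e).
inline std::uint32_t addrBefore(std::uint32_t x, std::uint32_t y, unsigned c,
                                std::uint32_t e) {
  assert(c < 32 && "shift amount must be smaller than the bit width");
  return x + ((y << c) + e);
}

// Address as written after the transform: x + ((y + d) << c).
inline std::uint32_t addrAfter(std::uint32_t x, std::uint32_t y, unsigned c,
                               std::uint32_t d) {
  assert(c < 32 && "shift amount must be smaller than the bit width");
  return x + ((y + d) << c);
}

// Hypothetical helper: finds d such that e == (d << c).
// Returns false if c is not 0, 1 or 2 (the limit stated in the commit
// message) or if e has any of its low c bits set.
inline bool splitOffset(std::uint32_t e, unsigned c, std::uint32_t &d) {
  if (c > 2)
    return false;
  if ((e & ((1u << c) - 1u)) != 0)
    return false;
  d = e >> c;
  return true;
}
```

For example, with x = 100, y = 3, c = 2 and e = 8, `splitOffset` yields d = 2 and both forms compute the same address. Unsigned arithmetic is used so that the identity also holds under wraparound.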
* [AArch64] Remove an overly aggressive assert. | Chad Rosier | 2016-06-22 | 1 | -5/+0
  llvm-svn: 273458
* Start using shouldAssumeDSOLocal on Hexagon. | Rafael Espindola | 2016-06-22 | 1 | -2/+3
  Include a token test showing that access to private is now the same as to internal.
  llvm-svn: 273457
* Preserve DebugInfo when replacing values in DAGCombiner | Nirav Dave | 2016-06-22 | 2 | -4/+2
  Recommitting after fixing over-aggressive assertion.
  [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped.
  Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values.
  This refixes PR9817 which was being incompletely checked in the testsuite.
  Reviewers: jyknight
  Subscribers: dblaikie, llvm-commits
  Differential Revision: http://reviews.llvm.org/D21037
  llvm-svn: 273456
* AMDGPU: Make FrameLowering stack alignment 16 | Matt Arsenault | 2016-06-22 | 1 | -3/+4
  We don't need it to be that high. The natural alignment for a single workitem's stack is 16.
  llvm-svn: 273448
* [SystemZ] Recognize RISBG opportunities involving a truncate | Zhan Jun Liau | 2016-06-22 | 1 | -3/+22
  Summary: Recognize RISBG opportunities where the end result is narrower than the original input - where a truncate separates the shift/and operations.
  The motivating case is some code in postgres which looks like:
    srlg %r2, %r0, 11
    nilh %r2, 255
  Reviewers: uweigand
  Author: RolandF
  Differential Revision: http://reviews.llvm.org/D21452
  llvm-svn: 273433
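The shift/truncate/and pattern in the motivating case can be read as a single bit-field extraction, which is the kind of operation a rotate-and-select instruction covers. The sketch below assumes that `nilh %r2, 255` clears the top eight bits of the low 32-bit word and that only the low 32 bits of the result are used; both are assumptions of this example, and the function names are hypothetical.

```cpp
#include <cstdint>

// Two-step form: 64-bit logical shift right by 11, truncate to 32 bits,
// then mask off the top eight bits of the truncated value.
inline std::uint32_t shiftTruncMask(std::uint64_t x) {
  std::uint32_t truncated = static_cast<std::uint32_t>(x >> 11);
  return truncated & 0x00FFFFFFu;
}

// One-step form: extract the 24-bit field that starts at bit 11 of the
// 64-bit input. The mask is applied before the narrowing conversion.
inline std::uint32_t extractField(std::uint64_t x) {
  return static_cast<std::uint32_t>((x >> 11) & 0x00FFFFFFull);
}
```

Both functions return the same value for every input, because the mask only keeps bits that survive the truncation. That equivalence is what lets the truncate sit between the shift and the and without blocking the combined form.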
* [Hexagon] Handle expansion of cmpxchg | Krzysztof Parzyszek | 2016-06-22 | 2 | -0/+12
  llvm-svn: 273432
* [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee | Krzysztof Parzyszek | 2016-06-22 | 13 | -22/+20
  The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non-vararg functions.
  Differential Revision: http://reviews.llvm.org/D20376
  llvm-svn: 273403
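The idea of deriving the fixed-argument count from the argument list, rather than taking it as a separate parameter, can be sketched with a toy type. `ToyCallInfo` below is a hypothetical stand-in written for this illustration; it is not LLVM's `CallLoweringInfo` and does not mirror its real signature.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call-lowering descriptor.
struct ToyCallInfo {
  std::vector<std::string> Args;
  std::size_t NumFixedArgs = 0;

  // The fixed-argument count always comes from the list itself, so callers
  // cannot pass a count that disagrees with the arguments they supply.
  ToyCallInfo &setCallee(std::vector<std::string> ArgList) {
    NumFixedArgs = ArgList.size();
    Args = std::move(ArgList);
    return *this;
  }
};
```

With a single source of truth, a non-vararg call with three arguments always reports three fixed arguments, instead of whatever value a caller happened to pass.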
* Delete more dead code. | Rafael Espindola | 2016-06-22 | 4 | -115/+0
  Found by gcc 6.
  llvm-svn: 273402
* AMDGPU: Fix gcc warnings | Matt Arsenault | 2016-06-22 | 4 | -197/+60
  Mostly removing dead code. Apparently gcc's warning for unused functions is better.
  llvm-svn: 273363
* [Kryo] Enable loop prefetcher. | Haicheng Wu | 2016-06-21 | 1 | -0/+4
  Differential Revision: http://reviews.llvm.org/D21535
  llvm-svn: 273329
* Delete more dead code. | Rafael Espindola | 2016-06-21 | 9 | -382/+0
  Found by gcc 6.
  llvm-svn: 273322
* AMDGPU: Add implicitarg.ptr intrinsic. | Jan Vesely | 2016-06-21 | 3 | -11/+24
  Points to the start of implicit arguments (appended after explicit arguments).
  Differential Revision: http://reviews.llvm.org/D20297
  llvm-svn: 273317
* [NVPTX] Improve lowering of byval args of device functions. | Artem Belevich | 2016-06-21 | 3 | -15/+79
  Avoid unnecessary spills of such vars to local space on SASS level and pointer space conversion. Instead, make a local copy with appropriate addrspacecasts and let LLVM optimize them away when possible.
  This allows loading the value of the argument using [symbol+offset] instead of converting the argument to a general space pointer and using it for indexing (which also implicitly converts the param space pointer to a local space one on SASS level and triggers copying of the argument into local space in the process).
  This reduces call overhead, uses fewer registers and reduces overall SASS size by 2-4%.
  Differential Review: http://reviews.llvm.org/D21421
  llvm-svn: 273313
* Add back some dead code. | Rafael Espindola | 2016-06-21 | 1 | -0/+14
  It was there just to avoid warnings. Add a LLVM_ATTRIBUTE_UNUSED attribute so that it doesn't produce warnings with gcc 6.
  llvm-svn: 273308
* Delete some dead code. | Rafael Espindola | 2016-06-21 | 5 | -64/+0
  Found by gcc 6.
  llvm-svn: 273303
* [StackProtector] Fix computation of GSCookieOffset and EHCookieOffset with SEH4 | Etienne Bergeron | 2016-06-21 | 2 | -7/+61
  Summary: Fix the computation of the offsets present in the scopetable when using the SEH (__except_handler4).
  This patch added an intrinsic to track the position of the allocation on the stack of the EHGuard. This position is needed when producing the ScopeTable.
  ```
  struct _EH4_SCOPETABLE {
      DWORD GSCookieOffset;
      DWORD GSCookieXOROffset;
      DWORD EHCookieOffset;
      DWORD EHCookieXOROffset;
      _EH4_SCOPETABLE_RECORD ScopeRecord[1];
  };

  struct _EH4_SCOPETABLE_RECORD {
      DWORD EnclosingLevel;
      long (*FilterFunc)();
      union {
          void (*HandlerAddress)();
          void (*FinallyFunc)();
      };
  };
  ```
  The code to generate the EHCookie is added in `X86WinEHState.cpp`, which adds these instructions when using SEH4.
  ```
  Lfunc_begin0:
  # BB#0:                                 # %entry
          pushl   %ebp
          movl    %esp, %ebp
          pushl   %ebx
          pushl   %edi
          pushl   %esi
          subl    $28, %esp
          movl    %ebp, %eax                <<-- Loading FramePtr
          movl    %esp, -36(%ebp)
          movl    $-2, -16(%ebp)
          movl    $L__ehtable$use_except_handler4_ssp, %ecx
          xorl    ___security_cookie, %ecx
          movl    %ecx, -20(%ebp)
          xorl    ___security_cookie, %eax  <<-- XOR FramePtr and Cookie
          movl    %eax, -40(%ebp)           <<-- Storing EHGuard
          leal    -28(%ebp), %eax
          movl    $__except_handler4, -24(%ebp)
          movl    %fs:0, %ecx
          movl    %ecx, -28(%ebp)
          movl    %eax, %fs:0
          movl    $0, -16(%ebp)
          calll   _may_throw_or_crash
  LBB1_1:                                 # %cont
          movl    -28(%ebp), %eax
          movl    %eax, %fs:0
          addl    $28, %esp
          popl    %esi
          popl    %edi
          popl    %ebx
          popl    %ebp
          retl
  ```
  And the corresponding offset is computed:
  ```
  Luse_except_handler4_ssp$parent_frame_offset = -36
          .p2align        2
  L__ehtable$use_except_handler4_ssp:
          .long   -2                      # GSCookieOffset
          .long   0                       # GSCookieXOROffset
          .long   -40                     # EHCookieOffset    <<----
          .long   0                       # EHCookieXOROffset
          .long   -2                      # ToState
          .long   _catchall_filt          # FilterFunction
          .long   LBB1_2                  # ExceptionHandler
  ```
  Clang is not yet producing functions using SEH4, but it's a work in progress. This patch is a step toward having a valid implementation of SEH4. Unfortunately, it is not yet fully working. The EH registration block is not allocated at the right offset on the stack.
  Reviewers: rnk, majnemer
  Subscribers: llvm-commits, chrisha
  Differential Revision: http://reviews.llvm.org/D21231
  llvm-svn: 273281
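The assembly above builds the EHGuard by XORing the frame pointer with the security cookie and storing the result at the offset recorded as EHCookieOffset. The sketch below shows only that XOR round trip. How `__except_handler4` actually consumes the stored value is not described in the commit message, so `checkEHGuard` is an assumption made for illustration, and both function names are hypothetical.

```cpp
#include <cstdint>

// Guard value as built in the listing: frame pointer XOR security cookie.
inline std::uint32_t makeEHGuard(std::uint32_t framePtr, std::uint32_t cookie) {
  return framePtr ^ cookie;
}

// Assumed validation step: XORing the stored guard with the cookie must give
// back the frame pointer. A corrupted guard or a wrong slot fails the check.
inline bool checkEHGuard(std::uint32_t storedGuard, std::uint32_t framePtr,
                         std::uint32_t cookie) {
  return (storedGuard ^ cookie) == framePtr;
}
```

Because XOR is its own inverse, a check like this only succeeds when it reads the exact stack slot the guard was written to, which is why the table needs the correct EHCookieOffset (-40 in the listing).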
* [AArch64] Change the preferred alignment for char and short to word alignment | Evandro Menezes | 2016-06-21 | 1 | -2/+2
  Differential Revision: http://reviews.llvm.org/D21414
  llvm-svn: 273279
* [AArch64] Restore codegen for AArch64 Cortex-A72/A73 after NFCI | Silviu Baranga | 2016-06-21 | 3 | -2/+24
  Summary: Code generation for Cortex-A72/Cortex-A73 was accidentally changed by r271555, which was a NFCI. The isCortexA57() predicate was not true for Cortex-A72/Cortex-A73 before r271555 (since it was checking the CPU string). Because Cortex-A72/Cortex-A73 inherit all features from Cortex-A57, all decisions previously guarded by isCortexA57() are now taken.
  This change restores the behaviour before r271555 by adding separate ProcA72/ProcA73, which have the required features to preserve code generation.
  Reviewers: kristof.beyls, aadg, mcrosier, rengolin
  Subscribers: mcrosier, llvm-commits, aemerson, t.p.northover, MatzeB, rengolin
  Differential Revision: http://reviews.llvm.org/D21182
  llvm-svn: 273277
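The regression described above comes from the difference between a predicate that compares the CPU name and one that tests an inherited feature. The toy model below illustrates that difference. `ToySubtarget`, the feature string `"a57-tuning"`, and the helper names are all hypothetical and are not taken from the AArch64 backend.

```cpp
#include <set>
#include <string>

// Hypothetical, simplified subtarget: a CPU name plus a set of feature names.
struct ToySubtarget {
  std::string CPU;
  std::set<std::string> Features;

  bool hasFeature(const std::string &F) const { return Features.count(F) != 0; }
};

// Old-style predicate: true only for the exact CPU string.
inline bool isCortexA57ByName(const ToySubtarget &ST) {
  return ST.CPU == "cortex-a57";
}

// Models a CPU definition that inherits every feature of another CPU.
inline ToySubtarget inheritFeatures(const ToySubtarget &Base,
                                    const std::string &NewCPU) {
  ToySubtarget ST = Base;
  ST.CPU = NewCPU;
  return ST;
}
```

If a Cortex-A72 model is built with `inheritFeatures` from a Cortex-A57 model that carries `"a57-tuning"`, then `isCortexA57ByName` is false for it while `hasFeature("a57-tuning")` is true. Replacing the name check with the feature check therefore changes decisions for the inheriting CPU, which is the effect the commit undoes by giving A72/A73 their own feature sets.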
* fix indentation | Etienne Bergeron | 2016-06-21 | 1 | -1/+1
  llvm-svn: 273274
* Define a isPositionIndependent helper for ARMAsmPrinter. NFC. | Rafael Espindola | 2016-06-21 | 2 | -2/+8
  llvm-svn: 273261
* [AVX512] Add patterns for any-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX. | Craig Topper | 2016-06-21 | 1 | -0/+6
  llvm-svn: 273253
* Replace silly uses of 'signed' with 'int' | David Majnemer | 2016-06-21 | 6 | -16/+13
  llvm-svn: 273244
* [AVX512] Remove the masked vpcmpeq/vcmpgt intrinsics and autoupgrade them to native icmps. | Craig Topper | 2016-06-21 | 1 | -24/+0
  llvm-svn: 273240
* [X86] Pre-allocate SmallVector instead of using push_back in a loop. NFC | Craig Topper | 2016-06-21 | 1 | -17/+13
  llvm-svn: 273234
* Simplify PICStyles. | Rafael Espindola | 2016-06-20 | 2 | -20/+10
  The main difference is that StubDynamicNoPIC is gone. The dynamic-no-pic mode, as the name implies, is simply not pic. It is just conservative about what it assumes to be dso local.
  llvm-svn: 273222
* [X86][SSE] Add cost model for BSWAP of vectors | Simon Pilgrim | 2016-06-20 | 1 | -3/+24
  The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets; we should reflect the typical cost of this to encourage vectorization.
  Differential Revision: http://reviews.llvm.org/D21521
  llvm-svn: 273217
* [X86][X87] Fix issue with sitofp i64 -> fp128 on 32-bit targets | Simon Pilgrim | 2016-06-20 | 1 | -2/+2
  Fix for PR27726 - sitofp i64 to fp128 was loading the merged load i64 to a x87 register preventing legalization for conversion to fp128.
  Added 32-bit tests for fp128 cast/conversions.
  llvm-svn: 273210
* Delete dead code. NFC. | Rafael Espindola | 2016-06-20 | 1 | -8/+0
  llvm-svn: 273206
* Add a isPositionIndependent helper to ARMFastISel. NFC. | Rafael Espindola | 2016-06-20 | 1 | -8/+13
  llvm-svn: 273187
* [AArch64] Adjust the loop buffer size for Exynos M1 (NFC) | Evandro Menezes | 2016-06-20 | 1 | -1/+1
  llvm-svn: 273185
* AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32 | Matt Arsenault | 2016-06-20 | 1 | -16/+13
  The implicit operand is added by the initial instruction construction, so this was adding an additional vcc use. The original one was missing the undef flag the original condition had, so the verifier would complain.
  llvm-svn: 273182
* AMDGPU: Fold more custom nodes to undef | Matt Arsenault | 2016-06-20 | 1 | -11/+40
  This will help sneak undefs past GVN into the DAG for some tests.
  Also add missing intrinsic for rsq_legacy, even though the node was already selected to the instruction. Also start passing the debug location to intrinsic errors.
  llvm-svn: 273181
* Generalize DiagnosticInfoStackSize to support other limits | Matt Arsenault | 2016-06-20 | 1 | -3/+11
  Backends may want to report errors on resources other than stack size.
  llvm-svn: 273177
* AMDGPU: Use correct method for determining instruction size | Matt Arsenault | 2016-06-20 | 1 | -2/+4
  llvm-svn: 273172
* Use shouldAssumeDSOLocal. | Rafael Espindola | 2016-06-20 | 1 | -1/+3
  With this ARM fast isel knows that PIE variables are not preemptable.
  llvm-svn: 273169