summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Refactor the PDB writing to use a builder approachZachary Turner2016-07-117-121/+276
| | | | llvm-svn: 275110
* AMDGPU: fix local stack slot allocation bugsNicolai Haehnle2016-07-111-2/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551 llvm-svn: 275108
* [X86] Make some cast costs more preciseMichael Kuperstein2016-07-111-3/+16
| | | | | | | | | Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106
* Codegen: Fix comment in BranchFolding.cppKyle Butt2016-07-111-7/+6
| | | | | | | | Blocks to be tail-merged may share more than one successor. Correct the comment to state that they share a specific successor, SuccBB, rather than a single successor, which is not true. llvm-svn: 275104
* [X86] Fix tailcall return address clobber bug.Quentin Colombet2016-07-112-11/+30
| | | | | | | | | | | | | | | | | | | | | | | | This bug (llvm.org/PR28124) was introduced by r237977, which refactored the tail call sequence to be generated in two passes instead of one. Unfortunately, the stack adjustment produced by the first pass was not recognized by X86FrameLowering::mergeSPUpdates() in all cases, causing code such as the following, which clobbers the return address, to be generated: popl %edi popl %edi pushl %eax jmp tailcallee # TAILCALL To fix the problem, the entire stack adjustment is performed in X86ExpandPseudo::ExpandMI() for tail calls. Patch by Magnus Lång <margnus1@gmail.com> Differential Revision: http://reviews.llvm.org/D21325 llvm-svn: 275103
* fix documentation comments; NFCSanjay Patel2016-07-111-42/+7
| | | | llvm-svn: 275101
* Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizerAlina Sbirlea2016-07-112-24/+42
| | | | | | | | | | | | | Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains. Add additional parameters: AddressSpace, Alignment, Fast. Reviewers: llvm-commits, jlebar Subscribers: arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21935 llvm-svn: 275100
* [X86] Disable FixupSetCC for CodeGenOpt::NoneMichael Kuperstein2016-07-111-4/+4
| | | | | | | | | | It is an optimization pass, and should not run at -O0. Especially since Fast RA will not do the required register coalescing anyway, so it's a loss even from the optimization standpoint. This also works around (but doesn't quite fix) PR28489. llvm-svn: 275099
* [IPRA] Properly compute register usage at call sites.Chad Rosier2016-07-112-4/+6
| | | | | | | | Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 llvm-svn: 275087
* [SystemZ] Recognize Load On Condition Immediate (LOCHI/LOGHI) opportunitiesZhan Jun Liau2016-07-116-2/+73
| | | | | | | | | | | | | | | | | | Summary: Add support for the z13 instructions LOCHI and LOCGHI which conditionally load immediate values. Add target instruction info hooks so that if conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Commiting on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 llvm-svn: 275086
* [SCCP] Try to follow the DRY principle, use `OpSt`.Davide Italiano2016-07-111-3/+2
| | | | | | Thanks to Eli Friedman for pointing out in his post-commit review! llvm-svn: 275084
* [SLSR] Call getPointerSizeInBits with the correct address space.Jingyue Wu2016-07-111-2/+2
| | | | llvm-svn: 275083
* [PM/IPO] Port LowerTypeTests to the new PassManager.Davide Italiano2016-07-113-17/+30
| | | | | | | | There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. llvm-svn: 275082
* Fix the assertion failure caused by http://reviews.llvm.org/D22118Dehao Chen2016-07-112-2/+3
| | | | | | | | | | | | Summary: http://reviews.llvm.org/D22118 uses metadata to store the call count, which makes it possible to have branch weight to have only one elements. Also fix the assertion failure in inliner when checking the instruction type to include "invoke" instruction. Reviewers: mkuper, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22228 llvm-svn: 275079
* [IR] Stop a -Wsign-compare warning from firingDavid Majnemer2016-07-111-1/+1
| | | | llvm-svn: 275077
* [LowerTypeTests] Don't rely on doInitialization().Davide Italiano2016-07-111-23/+16
| | | | | | | | | In preparation for porting this pass to the new PM (which has no doInitialization()). Differential Revision: http://reviews.llvm.org/D22223 llvm-svn: 275074
* Implement callsite-hotness based inline cost for Sample-based PGODehao Chen2016-07-113-1/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073
* Tune the weight propagation algorithm for sample profile.Dehao Chen2016-07-111-12/+26
| | | | | | | | | | | | Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 llvm-svn: 275072
* Add missing include from previous commitNirav Dave2016-07-111-0/+1
| | | | llvm-svn: 275069
* Fix branch relaxation in 16-bit mode.Nirav Dave2016-07-1115-47/+81
| | | | | | | | | | | | | | | Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068
* [Sink] Don't move calls to readonly functions across storesNicolai Haehnle2016-07-111-2/+6
| | | | | | | | | | | | Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 llvm-svn: 275066
* AliasAnalysis: unify getModRefInfo(I, CS) semantics with other overloadsNicolai Haehnle2016-07-111-1/+1
| | | | | | | | | | | | | | | | | | | This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when the Instruction is a call-site. This is now more in line with getModRefInfo generally: it returns Mod when I modifies a memory location that is accessed (read or written) by CS and Ref when I reads a memory location that is written by CS. From a grep of the code, the only uses of this particular getModRefInfo overload are in MemorySSA and MemCpyOptimizer, and they only care about where the result is MR_NoModRef or not. Therefore, this change should have no visible effect. Separated out from D17279 upon request. llvm-svn: 275065
* [X86][SSE] Generalise target shuffle combine of shuffles using variable masksSimon Pilgrim2016-07-111-13/+21
| | | | | | At present the only shuffle with a variable mask we recognise is PSHUFB, which influences if its worth the cost of mask creation/loading of a combined target shuffle with a variable mask. This change sets up the infrastructure to support other shuffles in the future but has no effect yet. llvm-svn: 275059
* Provide support for preserving assembly commentsNirav Dave2016-07-116-3/+72
| | | | | | | | | | | | | | | | | Preserve assembly comments from input in output assembly and flags to toggle property. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 llvm-svn: 275058
* [AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions.Artem Tamazov2016-07-111-0/+2
| | | | | | | | | | Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 llvm-svn: 275054
* [mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and ↵Zlatko Buljan2016-07-1115-58/+324
| | | | | | | | SWC2 instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D18824 llvm-svn: 275050
* AVX-512: DAG lowering for scalar MIN/MAX commutable opsElena Demikhovsky2016-07-111-3/+36
| | | | | | | | DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. llvm-svn: 275048
* [AVX512] Add support for 512-bit ANDN now that all ones build vectors ↵Craig Topper2016-07-111-1/+2
| | | | | | survive long enough to allow the matching. llvm-svn: 275046
* [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one ↵Craig Topper2016-07-113-5/+19
| | | | | | vectors. llvm-svn: 275045
* [X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are ↵Craig Topper2016-07-112-3/+14
| | | | | | | | marked for CanFoldAsLoad. I don't really know how to test this. llvm-svn: 275044
* Revert r275027 - Let FuncAttrs infer the 'returned' argument attributeHal Finkel2016-07-111-50/+0
| | | | | | Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. llvm-svn: 275042
* Pointer-comparison folding should look through returned-argument functionsHal Finkel2016-07-111-0/+5
| | | | | | | | | For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 llvm-svn: 275039
* Teach isDereferenceablePointer to look through returned-argument functionsHal Finkel2016-07-111-0/+5
| | | | | | | | | For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038
* Teach SCEV to look through returned-argument functionsHal Finkel2016-07-111-0/+7
| | | | | | | | | When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037
* Teach computeKnownBits to look through returned-argument functionsHal Finkel2016-07-111-3/+8
| | | | | | | | | If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 llvm-svn: 275036
* BasicAA should look through functions with returned argumentsHal Finkel2016-07-113-2/+32
| | | | | | | | | | | Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't loose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035
* Don't use a SmallSet for returned attribute inferenceHal Finkel2016-07-111-11/+19
| | | | | | | Suggested post-commit by David Majnemer on IRC (following-up on a pre-commit review comment). llvm-svn: 275033
* Add getReturnedArgOperand to Call/InvokeInst, CallSiteHal Finkel2016-07-102-2/+31
| | | | | | | | | | | | | | | In order to make the optimizer smarter about using the 'returned' argument attribute (generally, but motivated by my llvm.noalias intrinsic work), add a utility function to Call/InvokeInst, and CallSite, to make it easy to get the returned call argument (when one exists). P.S. There is already an unfortunate amount of code duplication between CallInst and InvokeInst, and this adds to it. We should probably clean that up separately. Differential Revision: http://reviews.llvm.org/D22204 llvm-svn: 275031
* [X86][SSE] Relax type assertions for matchVectorShuffleAsInsertPSSimon Pilgrim2016-07-101-2/+4
| | | | | | Calls to matchVectorShuffleAsInsertPS only need to ensure the inputs are 128-bit vectors. Only lowerVectorShuffleAsInsertPS needs to ensure that they are v4f32. llvm-svn: 275028
* Let FuncAttrs infer the 'returned' argument attributeHal Finkel2016-07-101-0/+42
| | | | | | | | | | A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 llvm-svn: 275027
* [DAG] make isConstantSplatVector() available to the rest of loweringSanjay Patel2016-07-102-32/+25
| | | | llvm-svn: 275025
* AMDGPU/R600: Add implicitarg.ptr intrinsicJan Vesely2016-07-103-6/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024
* [X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHWSimon Pilgrim2016-07-101-3/+48
| | | | llvm-svn: 275022
* fix documentation comments; NFCSanjay Patel2016-07-101-11/+3
| | | | llvm-svn: 275021
* [Support] Make helper function static. NFC.Benjamin Kramer2016-07-101-2/+2
| | | | llvm-svn: 275017
* [SystemZ] Utilize Test Data Class instructions.Marcin Koscielnicki2016-07-106-4/+432
| | | | | | | | | | | This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016
* reformat, fix comments/names; NFCISanjay Patel2016-07-101-27/+22
| | | | llvm-svn: 275015
* Give helper classes/functions internal linkage. NFC.Benjamin Kramer2016-07-107-12/+25
| | | | llvm-svn: 275014
* [AVX512] Add support for lowering to 512-bit SHUFPS.Craig Topper2016-07-101-4/+8
| | | | llvm-svn: 275011
* [pdb] Sanity check the stream mapDavid Majnemer2016-07-101-1/+7
| | | | | | | | Some abstractions in LLVM "know" that they are reading in-bounds, FixedStreamArray, and provide a simple result. This breaks down if the stream map is bogus. llvm-svn: 275010
OpenPOWER on IntegriCloud