Commit message | Author | Age | Files | Lines
* Correct ordering of loads/stores.Alina Sbirlea2016-07-116-22/+212
  Summary:
  Aiming to correct the ordering of loads/stores. This patch changes the insert point for loads to the position of the first load. It updates the ordering method for loads to insert before, rather than after.

  Before this patch, the sequence "load a[1], store a[1], store a[0], load a[2]" would incorrectly vectorize to "store a[0,1], load a[1,2]". The correctness check assumed that the insertion point for loads was at the position of the first load, when in practice it was at the last load. An alternative fix would have been to invert the correctness check. The current fix changes the insert position but also requires reordering of instructions before the vectorized load.

  Updated testcases to reflect the changes.

  Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm
  Subscribers: mzolotukhin
  Differential Revision: http://reviews.llvm.org/D22071
  llvm-svn: 275117
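The miscompile described above can be reproduced in a toy model. The sketch below is not the vectorizer's code: memory is a three-element array, and the function and type names are hypothetical. It only shows that moving the loads down to the last load's position changes the value read from a[1], while hoisting them to the first load's position does not.

```cpp
#include <array>
#include <cassert>

// Values observed by the two loads in the sequence (illustrative type).
struct Loads { int a1; int a2; };

// Original order: load a[1], store a[1], store a[0], load a[2].
Loads runOriginal(std::array<int, 3>& a, int s0, int s1) {
  Loads r;
  r.a1 = a[1];
  a[1] = s1;
  a[0] = s0;
  r.a2 = a[2];
  return r;
}

// Incorrect result: "store a[0,1], load a[1,2]".
// The combined load sits at the position of the last load, after the stores,
// so it reads the value that was just stored to a[1].
Loads runLoadsAtLastPosition(std::array<int, 3>& a, int s0, int s1) {
  a[0] = s0;
  a[1] = s1;
  Loads r{a[1], a[2]};
  return r;
}

// Combined load at the position of the first load, before the stores.
Loads runLoadsAtFirstPosition(std::array<int, 3>& a, int s0, int s1) {
  Loads r{a[1], a[2]};
  a[0] = s0;
  a[1] = s1;
  return r;
}
```

With memory {10, 20, 30} and stored values 1 and 2, the original sequence and the load-first version both read 20 from a[1]; the load-last version reads 2.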
* ARM: validate immediate branch targets in AsmParser.Tim Northover2016-07-119-51/+164
| | | | | | | | | | Immediate branch targets aren't commonly used, but if they are we should make sure they can actually be encoded. This means they must be divisible by 2 when targeting Thumb mode, and by 4 when targeting ARM mode. Also do a little naming cleanup while I was changing everything around anyway. llvm-svn: 275116
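The divisibility rule above can be sketched as a small predicate. This is an illustration only: the function name and signature are hypothetical and are not taken from the AsmParser, and the check deliberately ignores the separate question of whether the offset fits the encoding's range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if an immediate branch target satisfies
// the alignment rule from the commit message (2 for Thumb, 4 for ARM).
bool isAlignedBranchTarget(int64_t imm, bool isThumb) {
  const int64_t align = isThumb ? 2 : 4;
  // In C++ the remainder of a negative value is zero exactly when the value
  // is a multiple of the divisor, so backward branches are handled too.
  return imm % align == 0;
}
```

For example, a target of 6 passes in Thumb mode but would be rejected in ARM mode.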
* Prevent the creation of empty (forwarding) blocks resulting from nested ifs.Wolfgang Pieb2016-07-112-1/+49
| | | | | | | | | | | | | | | | | Summary: Nested if statements can generate empty BBs whose terminator branches unconditionally to its successor. These branches are not eliminated to help generate better line number information in some cases, but there is no reason to keep the empty blocks that result from nested ifs. Reviewers: mehdi_amini, dblaikie, echristo Subscribers: mehdi_amini, cfe-commits Differential review: http://reviews.llvm.org/D11360 llvm-svn: 275115
* Don't compute modulus of hash if it is smaller than the bucket count.Eric Fiselier2016-07-111-5/+4
| | | | | | | | This cleans up a previous optimization attempt in hash, and results in additional performance improvements over that previous attempt. Additionally this new optimization does not hinder the power of 2 bucket count optimization. llvm-svn: 275114
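A minimal sketch of the idea, assuming a helper that maps a hash to a bucket index. The name constrainHash is hypothetical and this is not the library's actual code; it only shows how skipping the modulus for small hashes can coexist with the power-of-two mask.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper. Precondition (assumed): bucketCount > 0.
std::size_t constrainHash(std::size_t h, std::size_t bucketCount) {
  // Power-of-two bucket counts: a mask replaces the division entirely.
  if ((bucketCount & (bucketCount - 1)) == 0)
    return h & (bucketCount - 1);
  // Otherwise only pay for the modulus when the hash is out of range.
  return h < bucketCount ? h : h % bucketCount;
}
```

The comparison is cheap relative to an integer division, which is why avoiding the modulus for in-range hashes can help.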
* AMDGPU: Treat texture gather instructions more like other MIMG instructionsNicolai Haehnle2016-07-116-5/+38
| | | | | | | | | | | | | | | | | | | | | Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 llvm-svn: 275113
* remove empty linesEtienne Bergeron2016-07-111-2/+0
| | | | llvm-svn: 275112
* [compiler-rt] Fix VisualStudio virtual folders layoutEtienne Bergeron2016-07-1129-47/+76
  Summary:
  This patch is a refactoring of the way cmake 'targets' are grouped. It won't affect non-UI cmake generators.

  Clang/LLVM use a structured way to group targets, which eases navigation through the Visual Studio UI. The Compiler-RT projects differ from the way Clang/LLVM group targets.

  This patch doesn't contain behavior changes.

  Reviewers: kubabrecka, rnk
  Subscribers: wang0109, llvm-commits, kubabrecka, chrisha
  Differential Revision: http://reviews.llvm.org/D21952
  llvm-svn: 275111
* Refactor the PDB writing to use a builder approachZachary Turner2016-07-1121-179/+658
| | | | llvm-svn: 275110
* [pdb] Add a pdb2yaml option to not dump file headers.Zachary Turner2016-07-115-18/+39
| | | | | | | | | This will be useful once we start adding the ability to dump type records and symbol records, since it will allow us to generate mergeable information instead of information that specifies an entire file. llvm-svn: 275109
* AMDGPU: fix local stack slot allocation bugsNicolai Haehnle2016-07-113-2/+42
| | | | | | | | | | | | | | | | | | | | | | | Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551 llvm-svn: 275108
* [asan] Add exception handler to map memory on demand on Win64.Etienne Bergeron2016-07-116-4/+67
  Memory will be committed on demand when an exception happens while accessing the shadow memory region.

  Patch by: Wei Wang
  Differential Revision: http://reviews.llvm.org/D21942
  llvm-svn: 275107
* [X86] Make some cast costs more preciseMichael Kuperstein2016-07-115-40/+53
| | | | | | | | | Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106
* Always use the allocator to construct/destruct elements of a deque/vector. ↵Marshall Clow2016-07-116-6/+125
| | | | | | Fixes PR#28412. Thanks to Jonathan Wakely for the report. llvm-svn: 275105
* Codegen: Fix comment in BranchFolding.cppKyle Butt2016-07-111-7/+6
| | | | | | | | Blocks to be tail-merged may share more than one successor. Correct the comment to state that they share a specific successor, SuccBB, rather than a single successor, which is not true. llvm-svn: 275104
* [X86] Fix tailcall return address clobber bug.Quentin Colombet2016-07-114-11/+59
  This bug (llvm.org/PR28124) was introduced by r237977, which refactored the tail call sequence to be generated in two passes instead of one.

  Unfortunately, the stack adjustment produced by the first pass was not recognized by X86FrameLowering::mergeSPUpdates() in all cases, causing code such as the following, which clobbers the return address, to be generated:

      popl %edi
      popl %edi
      pushl %eax
      jmp tailcallee # TAILCALL

  To fix the problem, the entire stack adjustment is performed in X86ExpandPseudo::ExpandMI() for tail calls.

  Patch by Magnus Lång <margnus1@gmail.com>
  Differential Revision: http://reviews.llvm.org/D21325
  llvm-svn: 275103
* fix documentation comments; NFCSanjay Patel2016-07-113-85/+46
| | | | llvm-svn: 275101
* Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizerAlina Sbirlea2016-07-116-38/+71
| | | | | | | | | | | | | Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains. Add additional parameters: AddressSpace, Alignment, Fast. Reviewers: llvm-commits, jlebar Subscribers: arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21935 llvm-svn: 275100
* [X86] Disable FixupSetCC for CodeGenOpt::NoneMichael Kuperstein2016-07-112-4/+19
| | | | | | | | | | It is an optimization pass, and should not run at -O0. Especially since Fast RA will not do the required register coalescing anyway, so it's a loss even from the optimization standpoint. This also works around (but doesn't quite fix) PR28489. llvm-svn: 275099
* [compiler-rt] Refactor the interception code on windows.Etienne Bergeron2016-07-111-2/+4
  [asan] Fix unittest Asan-x86_64-inline-Test crashing on Windows64

  REAL(memcpy) was used in several places in Asan, while REAL(memmove) was not used. This CL chooses to patch memcpy() first, solving the crash for unittest.

  The crash looks like this:

      projects\compiler-rt\lib\asan\tests\default\Asan-x86_64-inline-Test.exe
      =================================================================
      ==22680==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x000000000000 bp 0x0029d555f590 sp 0x0029d555f438 T0)
      ==22680==Hint: pc points to the zero page.
      AddressSanitizer can not provide additional info.
      SUMMARY: AddressSanitizer: access-violation (<unknown module>)
      ==22680==ABORTING

  Patch by: Wei Wang
  Differential Revision: http://reviews.llvm.org/D22232
  llvm-svn: 275098
* [NFC] Reorder fields of VersionTuple to reduce sizeErik Pilkington2016-07-111-15/+20
| | | | | | Differential revision: http://reviews.llvm.org/D19934 llvm-svn: 275095
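As a generic illustration of why reordering fields can shrink a type, consider the two structs below. The field names are made up for the example and are not the actual before/after layout of VersionTuple; the point is only that interleaving small and large members forces padding that grouping avoids.

```cpp
#include <cassert>

// Hypothetical layout: each bool is followed by padding so that the next
// unsigned member is properly aligned.
struct Interleaved {
  unsigned Major;
  bool HasMinor;
  unsigned Minor;
  bool HasSubminor;
  unsigned Subminor;
  bool HasBuild;
  unsigned Build;
};

// Same members, with the wide ones first and the bools packed together.
struct Grouped {
  unsigned Major;
  unsigned Minor;
  unsigned Subminor;
  unsigned Build;
  bool HasMinor;
  bool HasSubminor;
  bool HasBuild;
};
```

On a common ABI with a 4-byte unsigned and 1-byte bool, the first struct would typically occupy 28 bytes and the second 20; the exact numbers are platform-dependent.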
* Allow is_swappable to SFINAE on deleted/ambiguous swap functionsEric Fiselier2016-07-112-4/+30
| | | | llvm-svn: 275094
* Hide some internal symbols for memory resource.Eric Fiselier2016-07-111-0/+5
| | | | llvm-svn: 275089
* [IPRA] Properly compute register usage at call sites.Chad Rosier2016-07-114-8/+11
| | | | | | | | Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 llvm-svn: 275087
* [SystemZ] Recognize Load On Condition Immediate (LOCHI/LOCGHI) opportunitiesZhan Jun Liau2016-07-1111-2/+294
  Summary:
  Add support for the z13 instructions LOCHI and LOCGHI, which conditionally load immediate values. Add target instruction info hooks so that if-conversion will allow predication of LHI/LGHI.

  Author: RolandF
  Reviewers: uweigand
  Subscribers: zhanjunl

  Committing on behalf of Roland.

  Differential Revision: http://reviews.llvm.org/D22117
  llvm-svn: 275086
* Fix a build warning of unhandled enum in switchWeiming Zhao2016-07-112-0/+4
| | | | | | | | | | | | Summary: LLVM adds a new value FMRB_DoesNotReadMemory in the enumeration. Reviewers: andrew.w.kaylor, chrisj, zinob, grosser, jdoerfert Subscribers: Meinersbur, pollydev Differential Revision: http://reviews.llvm.org/D22109 llvm-svn: 275085
* [SCCP] Try to follow the DRY principle, use `OpSt`.Davide Italiano2016-07-111-3/+2
| | | | | | Thanks to Eli Friedman for pointing out in his post-commit review! llvm-svn: 275084
* [SLSR] Call getPointerSizeInBits with the correct address space.Jingyue Wu2016-07-112-5/+22
| | | | llvm-svn: 275083
* [PM/IPO] Port LowerTypeTests to the new PassManager.Davide Italiano2016-07-115-17/+39
| | | | | | | | There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. llvm-svn: 275082
* [lanai] Add more tests for assembly of conditional ALU opsJacques Pienaar2016-07-114-5/+363
| | | | llvm-svn: 275081
* Fix an issue where one could not define a Python command with the same name ↵Enrico Granata2016-07-114-2/+56
| | | | | | as an existing alias (or rather, one could but the results of invoking the command were far from satisfactory) llvm-svn: 275080
* Fix the assertion failure caused by http://reviews.llvm.org/D22118Dehao Chen2016-07-112-2/+3
  Summary:
  http://reviews.llvm.org/D22118 uses metadata to store the call count, which makes it possible for branch weights to have only one element.

  Also fix the assertion failure in the inliner by extending the instruction type check to include the "invoke" instruction.

  Reviewers: mkuper, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D22228
  llvm-svn: 275079
* [Sema] Don't artificially forbid BuiltinTemplateDecls in CheckTemplateArgumentDavid Majnemer2016-07-112-8/+7
| | | | | | | | After thinking about it, we don't really need to forbid BuiltinTemplateDecls explicitly. The restriction doesn't really buy us anything. llvm-svn: 275078
* [IR] Stop a -Wsign-compare warning from firingDavid Majnemer2016-07-111-1/+1
| | | | llvm-svn: 275077
* [man page] Document -gline-tables-only in the clang man page.Adrian Prantl2016-07-111-12/+22
| | | | llvm-svn: 275076
* [man page] Fix two sphinx build errors.Adrian Prantl2016-07-111-2/+2
| | | | | | These options were referenced by other paragraphs, but never specified. llvm-svn: 275075
* [LowerTypeTests] Don't rely on doInitialization().Davide Italiano2016-07-111-23/+16
| | | | | | | | | In preparation for porting this pass to the new PM (which has no doInitialization()). Differential Revision: http://reviews.llvm.org/D22223 llvm-svn: 275074
* Implement callsite-hotness based inline cost for Sample-based PGODehao Chen2016-07-115-1/+103
  Summary:
  For sample-based PGO, using BFI to calculate the callsite count is sometimes not accurate. This is because, with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branches. E.g.

      if (A1 && A2 && A3 && ..... && A10) {
        for (i=0; i < 100000000; i++) {
          callsite();
        }
      }

  Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, the count calculated from BFI will be 1/(2^10) of the actual value.

  In order to get a more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.

  Note that this mechanism can also be shared by instrumentation-based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI, as the call count is embedded in the IR.

  Reviewers: davidxl, eraman, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D22118
  llvm-svn: 275073
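The arithmetic behind the 1/(2^10) figure can be checked with a short sketch. The function below is a hypothetical model, not BFI itself: it assumes every guarding branch is given a static 50% probability and simply halves the true count once per guard.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the estimate. Precondition (assumed): guards < 64.
// Each guard assumed 50% taken divides the derived count by two.
uint64_t staticEstimate(uint64_t actualCount, unsigned guards) {
  return actualCount >> guards;  // actualCount / 2^guards, rounded down
}
```

With ten guards and the 1000 samples from the example, the integer estimate rounds down to zero, so a callsite that is hot in the profile would look cold; annotating the count on the call instruction sidesteps this.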
* Tune the weight propagation algorithm for sample profile.Dehao Chen2016-07-112-16/+30
| | | | | | | | | | | | Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 llvm-svn: 275072
* [tsan] Add support for GCD IO channels on DarwinKuba Brecka2016-07-117-9/+574
| | | | | | | | This patch adds interceptors for dispatch_io_*, dispatch_read and dispatch_write functions. This avoids false positives when using GCD IO. Adding several test cases. Differential Revision: http://reviews.llvm.org/D21889 llvm-svn: 275071
* [x86] make some of the tests 256-bit for testing diversitySanjay Patel2016-07-111-54/+106
| | | | llvm-svn: 275070
* Add missing include from previous commitNirav Dave2016-07-111-0/+1
| | | | llvm-svn: 275069
* Fix branch relaxation in 16-bit mode.Nirav Dave2016-07-1117-48/+115
| | | | | | | | | | | | | | | Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068
* [x86] specify triple to avoid bot failuresSanjay Patel2016-07-111-6/+6
| | | | llvm-svn: 275067
* [Sink] Don't move calls to readonly functions across storesNicolai Haehnle2016-07-112-2/+118
| | | | | | | | | | | | Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 llvm-svn: 275066
* AliasAnalysis: unify getModRefInfo(I, CS) semantics with other overloadsNicolai Haehnle2016-07-111-1/+1
  This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when the Instruction is a call-site. This is now more in line with getModRefInfo generally: it returns Mod when I modifies a memory location that is accessed (read or written) by CS and Ref when I reads a memory location that is written by CS.

  From a grep of the code, the only uses of this particular getModRefInfo overload are in MemorySSA and MemCpyOptimizer, and they only care about whether the result is MR_NoModRef or not. Therefore, this change should have no visible effect.

  Separated out from D17279 upon request.

  llvm-svn: 275065
* [x86] update checksSanjay Patel2016-07-111-15/+30
| | | | llvm-svn: 275064
* Changes related to tooling::applyAllReplacements interface change in D21601.Eric Liu2016-07-115-27/+56
| | | | | | | | | | | | | | | Summary: this patch contains changes related to the interface change from http://reviews.llvm.org/D21601. Only submit this patch after D21601 is submitted. Reviewers: djasper, klimek Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D21602 llvm-svn: 275063
* Make tooling::applyAllReplacements return llvm::Expected<string> instead of ↵Eric Liu2016-07-1116-102/+131
| | | | | | | | | | | | | | | | empty string to indicate potential error. Summary: return llvm::Expected<> to carry error status and error information. This is the first step towards introducing "Error" into tooling::Replacements. Reviewers: djasper, klimek Subscribers: ioeric, klimek, cfe-commits Differential Revision: http://reviews.llvm.org/D21601 llvm-svn: 275062
* [OpenCL] Improved diagnostics of OpenCL types.Anastasia Stulova2016-07-118-54/+95
| | | | | | | | | | | | | | - Changes diagnostics for Blocks to be implicitly const qualified OpenCL v2.0 s6.12.5. - Added and unified diagnostics of some OpenCL special types: blocks, images, samplers, pipes. These types are intended for use with the OpenCL builtin functions only and, therefore, most regular uses are not allowed including assignments, arithmetic operations, pointer dereferencing, etc. Review: http://reviews.llvm.org/D21989 llvm-svn: 275061
* Change the /proc/<pid>/maps to not assert on incorrect inputTamas Berghammer2016-07-111-12/+9
  If LLDB reads some incorrect input from /proc/<pid>/maps, then it should report an error instead of asserting, as we don't want to crash in case of an incorrect maps file.

  Differential revision: http://reviews.llvm.org/D22211
  llvm-svn: 275060
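The report-instead-of-assert pattern can be sketched as follows. This is not LLDB's parser: the struct, the function name, and the error strings are hypothetical, and only the address range and permission fields of a maps line are read.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical result type holding the fields this sketch extracts.
struct MapsEntry {
  unsigned long long start = 0;
  unsigned long long end = 0;
  std::string perms;
};

// Returns false and fills `error` on malformed input rather than asserting,
// so a bad maps file cannot bring the caller down.
bool parseMapsLine(const std::string& line, MapsEntry& out, std::string& error) {
  unsigned long long start = 0, end = 0;
  char perms[5] = {0};  // four permission characters plus terminator
  if (std::sscanf(line.c_str(), "%llx-%llx %4s", &start, &end, perms) != 3) {
    error = "malformed maps line: " + line;
    return false;
  }
  if (end < start) {
    error = "maps line has end address below start address: " + line;
    return false;
  }
  out.start = start;
  out.end = end;
  out.perms = perms;
  return true;
}
```

A caller would check the return value and surface `error` to the user, leaving `out` untouched on failure.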