path: root/llvm
Commit message | Author | Age | Files | Lines
...
* Change split code gen to use ThreadPool | Teresa Johnson | 2016-03-04 | 1 | -32/+40
  Part of D15390.
  llvm-svn: 262719
* [X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests | Simon Pilgrim | 2016-03-04 | 1 | -0/+72
  None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining
  llvm-svn: 262718
* Test commit access | Sam Kolton | 2016-03-04 | 1 | -1/+1
  llvm-svn: 262714
* [X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw) | Simon Pilgrim | 2016-03-04 | 1 | -3/+20
  llvm-svn: 262710
* test commit | Valery Pykhtin | 2016-03-04 | 1 | -1/+1
  llvm-svn: 262709
* Make headers self-contained again. | Benjamin Kramer | 2016-03-04 | 2 | -0/+2
  llvm-svn: 262702
* AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics | Nikolay Haustov | 2016-03-04 | 7 | -32/+339
  These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension.
  Initial change by Nicolai Hähnle
  Differential Revision: http://reviews.llvm.org/D17401
  llvm-svn: 262701
* Annotate our undefined behaviour to sneak it past the sanitizers | Justin Bogner | 2016-03-04 | 3 | -1/+14
  We have known UB in some ilists where we static cast half nodes to (larger) derived types and use the address. See llvm.org/PR26753. This needs to be fixed, but in the meantime it'd be nice if running ubsan didn't complain.
  This adds annotations in the two places where ubsan complains while running check-all of a sanitized clang build.
  llvm-svn: 262683
* Fix a memory leak. | Easwaran Raman | 2016-03-04 | 1 | -1/+4
  llvm-svn: 262682
* CodeGen: Tune the SmallVector size in LiveRange | Justin Bogner | 2016-03-04 | 1 | -2/+2
  The vast majority of LiveRanges (ie, 4/5) have exactly 1 segment and 1 value number, and a good chunk of the rest have 2 of each, so allocating space for 4 is wasteful.
  This is especially noticeable when dealing with a very large number of vregs, and I have an internal case where dropping this to 2 shaves over 5% off of peak memory when compiling a particularly large function.
  llvm-svn: 262681
* Fix a use-after-free bug introduced in r262636 | Easwaran Raman | 2016-03-04 | 3 | -7/+16
  llvm-svn: 262679
* Add hardware_concurrency interface to llvm::thread (NFC) | Teresa Johnson | 2016-03-04 | 1 | -0/+1
  Part of D15390.
  llvm-svn: 262677
* [gold] Handle modules that are not included in the link. | Evgeniy Stepanov | 2016-03-04 | 1 | -75/+92
  Gold has a newly added LDPT_GET_SYMBOLS_V3 callback that can distinguish between a module that is not included in the link, and one that is included but has its entire interface preempted by others.
  Fixes PR26674.
  llvm-svn: 262676
* Fix memory leak in tests. | Easwaran Raman | 2016-03-03 | 2 | -0/+2
  llvm-svn: 262674
* [libfuzzer] arbitrary function adapter. | Mike Aizatsky | 2016-03-03 | 5 | -0/+299
  The adapter automates converting a sequence of bytes into arbitrary arguments.
  Differential Revision: http://reviews.llvm.org/D17829
  llvm-svn: 262673
* [docs] Add a description of current problem areas to the statepoint docs | Philip Reames | 2016-03-03 | 1 | -0/+35
  Triggered by a question on llvm-dev about status
  llvm-svn: 262671
* [InstCombine] Combine A->B->A BitCast | Guozhi Wei | 2016-03-03 | 3 | -0/+197
  This patch enhances InstCombine to handle following case:
      A -> B bitcast
      PHI
      B -> A bitcast
  llvm-svn: 262670
* llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple. | NAKAMURA Takumi | 2016-03-03 | 1 | -1/+1
  We will see it for targeting win32;
  LLVM ERROR: CPU: 'generic' does not support ARM mode execution!
  llvm-svn: 262668
* [libFuzzer] when interrupted, call _Exit() instead of exit() | Kostya Serebryany | 2016-03-03 | 1 | -1/+1
  llvm-svn: 262667
* [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test. | Simon Pilgrim | 2016-03-03 | 2 | -3/+18
  PSHUFB decoder was assuming that input was 128 or 256-bit vector only.
  llvm-svn: 262661
* [RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess. | Lang Hames | 2016-03-03 | 2 | -27/+13
  The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a linker-mangled symbol name, but it calls through to dlsym to do the lookup (via DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled symbol name.
  Historically we've attempted to "demangle" by removing leading '_'s on all platforms, and fallen back to an extra search if that failed. That's broken, as it can cause symbols to resolve incorrectly on platforms that don't do mangling if you query '_foo' and the process also happens to contain a 'foo'.
  Fix this by demangling conditionally based on the host platform. That's safe here because this function is specifically for symbols in the host process, so the usual cross-process JIT looking concerns don't apply.
  M unittests/ExecutionEngine/ExecutionEngineTest.cpp
  M lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
  llvm-svn: 262657
* [ValueTracking] "constant fold" an experimental hidden option | Philip Reames | 2016-03-03 | 1 | -7/+0
  llvm-svn: 262648
* [ValueTracking] Remove dead code from an old experiment | Philip Reames | 2016-03-03 | 5 | -455/+2
  This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment.
  Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree.
  For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.
  llvm-svn: 262646
* [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) | Sanjay Patel | 2016-03-03 | 3 | -23/+40
  Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step.
  The motivation comes from the example in PR26702:
  https://llvm.org/bugs/show_bug.cgi?id=26702
  If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of:
      define <4 x i32> @is_negative(<4 x i32> %x) {
        %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
        %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
        %bc = bitcast <4 x i32> %not to <2 x i64>
        %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
        %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
        ret <4 x i32> %bc2
      }
  Simplifies to the expected:
      define <4 x i32> @is_negative(<4 x i32> %x) {
        %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
        ret <4 x i32> %lobit
      }
  Differential Revision: http://reviews.llvm.org/D17583
  llvm-svn: 262645
* Fix breakage caused by r262636. | Easwaran Raman | 2016-03-03 | 1 | -1/+1
  Use LLVM_ATTRIBUTE_UNUSED instead of __attribute__((unused))
  llvm-svn: 262643
* [ConstantRange] Rename test; NFC | Sanjoy Das | 2016-03-03 | 1 | -1/+1
  llvm-svn: 262640
* [SCEV] Prove no-overflow via constant ranges | Sanjoy Das | 2016-03-03 | 3 | -0/+95
  Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible.
  llvm-svn: 262639
* [SCEV] Be less eager about demoting zexts to sexts | Sanjoy Das | 2016-03-03 | 2 | -4/+28
  After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension.
  This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified.
  llvm-svn: 262638
* [ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges | Sanjoy Das | 2016-03-03 | 3 | -22/+89
  This will be used in a later patch to ScalarEvolution. Right now only the unit tests exercise the newly added code.
  llvm-svn: 262637
* Infrastructure for PGO enhancements in inliner | Easwaran Raman | 2016-03-03 | 11 | -52/+433
  This patch provides the following infrastructure for PGO enhancements in inliner:
      Enable the use of block level profile information in inliner
      Incremental update of block frequency information during inlining
      Update the function entry counts of callees when they get inlined into callers.
  Differential Revision: http://reviews.llvm.org/D16381
  llvm-svn: 262636
* [X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS | Simon Pilgrim | 2016-03-03 | 4 | -28/+35
  The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic.
  This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle.
  Differential Revision: http://reviews.llvm.org/D17681
  llvm-svn: 262635
* Use LineLocation instead of CallsiteLocation to index callsite profile. | Dehao Chen | 2016-03-03 | 7 | -84/+53
  Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specify the callee name. Remove Callee function name from the key, and put it in the value (FunctionSamples).
  Reviewers: davidxl, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17827
  llvm-svn: 262634
* [X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction. | Simon Pilgrim | 2016-03-03 | 1 | -14/+2
  lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway.
  llvm-svn: 262633
* [X86] Pulled out repeated code testing for constant vector shift amount. NFCI. | Simon Pilgrim | 2016-03-03 | 1 | -8/+6
  llvm-svn: 262631
* MCU target has its own ABI, however X86 interrupt handler calling convention overrides this ABI. | Amjad Aboud | 2016-03-03 | 1 | -1/+3
  Fixed the ordering to check first for X86 interrupt handler then for MCU target.
  Differential Revision: http://reviews.llvm.org/D17801
  llvm-svn: 262628
* [X86] Don't assume that shuffle non-mask operands starts at #0. | Ahmed Bougacha | 2016-03-03 | 3 | -32/+95
  That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still).
  Since:
      r246981 AVX-512: Lowering for 512-bit vector shuffles.
  VPERMV is recognized in getTargetShuffleMask.
  This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle.
  Instead of the faulty assumption, have getTargetShuffleMask return its operands as well.
  One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular).
  Differential Revision: http://reviews.llvm.org/D17041
  llvm-svn: 262627
* [LoopUtils, LV] Fix PR26734 | Matthew Simpson | 2016-03-03 | 2 | -1/+50
  The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case.
  llvm-svn: 262624
* [AArch64] fold 'isPositive' vector integer operations (PR26819) | Sanjay Patel | 2016-03-03 | 2 | -16/+37
  This is one of the cases shown in:
  https://llvm.org/bugs/show_bug.cgi?id=26819
  Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.
  The patch is based on the x86 equivalent:
  http://reviews.llvm.org/rL262036
  Differential Revision: http://reviews.llvm.org/D17834
  llvm-svn: 262623
* AVX512: Combine AND + TESTM instructions. | Igor Breger | 2016-03-03 | 4 | -8/+86
  Differential Revision: http://reviews.llvm.org/D17844
  llvm-svn: 262621
* Making rem_crash.ll target-specific | Renato Golin | 2016-03-03 | 3 | -1/+516
  This test failed in some ARM bots after a divmod change because it was running on a native llc, instead of targeted one. This makes sure the test is target-specific (as intended), and also copies to ARM and AArch64 directories.
  If it is also supposed to work on other architectures, I'll leave as an exercise to the respective maintainers.
  llvm-svn: 262620
* [AVR] Add calling convention parser tokens | Dylan McKay | 2016-03-03 | 6 | -0/+25
  Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser.
  Reviewers: arsenm
  Subscribers: dylanmckay, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16348
  llvm-svn: 262600
* [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG | Simon Pilgrim | 2016-03-03 | 4 | -43/+60
  Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.
  Differential Revision: http://reviews.llvm.org/D17691
  llvm-svn: 262599
* Revert "[ARM] Merging 64-bit divmod lib calls into one" | Renato Golin | 2016-03-03 | 3 | -14/+2
  This reverts commit r262507, which broke some ARM buildbots.
  llvm-svn: 262594
* [LLVM][AVX512] PSRLWI Change imm8 to int | Michael Zuckerman | 2016-03-03 | 4 | -21/+21
  Differential Revision: http://reviews.llvm.org/D17753
  llvm-svn: 262592
* TTI: Fix not using overload of getIntrinsicInstrCost | Matt Arsenault | 2016-03-03 | 1 | -1/+1
  This was always calling the generic version, so the target custom implementation was never called.
  llvm-svn: 262585
* [BranchFolding] Change function name related with merging MMOs. NFC | Junmo Park | 2016-03-03 | 1 | -7/+5
  Summary: Removing MMOs is not our preferred behavior any more.
  Reviewers: mcrosier, reames
  Differential Revision: http://reviews.llvm.org/D17668
  llvm-svn: 262580
* AMDGPU: Insert two S_NOP instructions for every high level source statement. | Tom Stellard | 2016-03-03 | 4 | -0/+110
  Patch by: Konstantin Zhuravlyov
  Summary: Tools, such as debugger, need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before first isa instruction of high level source statement, and one after last isa instruction of high level source statement. Further, debugger may replace S_NOP instructions with S_TRAP instructions based on user input.
  Reviewers: tstellarAMD, arsenm
  Subscribers: echristo, dblaikie, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17454
  llvm-svn: 262579
* AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs | Tom Stellard | 2016-03-03 | 1 | -3/+15
  Summary: When there were no free SGPRs, we were trying to move this value into some of the reserved registers which was causing a segmentation fault.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17590
  llvm-svn: 262577
* [X86] Enable forwarding bool arguments in tail calls (PR26305) | Hans Wennborg | 2016-03-03 | 2 | -0/+47
  The code was previously not able to track a boolean argument at a call site back to the formal argument of the caller.
  Differential Revision: http://reviews.llvm.org/D17786
  llvm-svn: 262575
* [PPCVSXFMAMutate] Temporarily disable this pass | Tim Shen | 2016-03-03 | 5 | -11/+17
  llvm-svn: 262573