path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [WinEH] Replace more lpad value uses with undef | Reid Kleckner | 2015-04-23 | 1 | -9/+20
  We were asserting on code like this:

    extern "C" unsigned long _exception_code();
    void might_crash(unsigned long);
    void foo() {
      __try {
        might_crash(0);
      } __except(1) {
        might_crash(_exception_code());
      }
    }

  Gtest and many other libraries get the exception code from the __except
  block. What's supposed to happen here is that EAX is live into the __except
  block, and it contains the exception code. Eventually we'll represent that
  as a use of the landingpad ehptr value, but for now we can replace it with
  undef.

  llvm-svn: 235649
* [MachineCopyPropagation] Handle undef flags conservatively so that we do not remove copies that are useful after breaking some hardware dependencies | Quentin Colombet | 2015-04-23 | 1 | -1/+5
  In other words, handle this kind of situation conservatively by assuming
  reg2 is redefined by the undef flag.

    reg1 = copy reg2
    = inst reg2<undef>
    reg2 = copy reg1

  Copy propagation used to remove the last copy. This is incorrect because
  the undef flag on reg2 in inst allows later passes to put whatever trashed
  value in reg2 that may help. In practice we end up with this code:

    reg1 = copy reg2
    reg2 = 0
    = inst reg2<undef>
    reg2 = copy reg1

  This fixes PR21743.

  llvm-svn: 235647
* Unbreak build | Krzysztof Parzyszek | 2015-04-23 | 1 | -1/+1
  llvm-svn: 235646
* [Hexagon] Minor cleanup in HexagonFrameLowering | Krzysztof Parzyszek | 2015-04-23 | 1 | -6/+2
  llvm-svn: 235645
* R600/SI: Fix indirect addressing with a negative constant offset | Tom Stellard | 2015-04-23 | 1 | -16/+55
  When the base register index of the vector plus the constant offset was
  less than zero, we were passing the wrong base register to the indirect
  addressing instruction.

  In this case, we need to set the base register to v0 and then add the
  computed (negative) index to m0.

  llvm-svn: 235641
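  The rule described in this commit message can be sketched in a few lines of
  C++. This is only an illustration under stated assumptions: the struct
  IndirectBase and the function computeIndirectBase are hypothetical names,
  not code from the R600/SI backend, and register indices are modelled as
  plain integers.

```cpp
#include <cassert>

// Hypothetical model: which vN register to encode as the base, and how much
// to add to the dynamic index held in m0.
struct IndirectBase {
  int baseReg;   // N of the vN register used as the base
  int m0Adjust;  // amount added to m0 (zero or negative in this sketch)
};

// vecBaseReg: register index of the vector's first element (assumed >= 0).
// constOffset: constant element offset, which may be negative.
IndirectBase computeIndirectBase(int vecBaseReg, int constOffset) {
  assert(vecBaseReg >= 0 && "register indices are non-negative");
  int sum = vecBaseReg + constOffset;
  if (sum < 0) {
    // There is no register below v0, so use v0 as the base and fold the
    // negative remainder into m0.
    return {0, sum};
  }
  return {sum, 0};
}
```

  For example, a vector starting at v2 with constant offset -5 yields base
  register v0 and an m0 adjustment of -3 in this model.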
* Thumb2: When applying branch optimizations, visit branches in reverse order. | Peter Collingbourne | 2015-04-23 | 1 | -2/+7
  The order in which branches appear in ImmBranches is approximately their
  order within the function body. By visiting later branches first, we
  reduce the distance between earlier forward branches and their targets,
  making it more likely that the cbn?z optimization, which can only apply
  to forward branches, will succeed for those earlier branches.

  Differential Revision: http://reviews.llvm.org/D9185

  llvm-svn: 235640
* ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. | Peter Collingbourne | 2015-04-23 | 1 | -2/+4
  In particular, this preserves the kill flag, which allows the Thumb2
  cbn?z optimization to be applied in cases where a branch has been
  re-created after the live variables analysis pass, e.g. by the machine
  block placement pass.

  This appears to be low risk; a number of other targets seem to already be
  doing something similar, e.g. AArch64, PowerPC.

  Differential Revision: http://reviews.llvm.org/D9184

  llvm-svn: 235639
* Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. | Peter Collingbourne | 2015-04-23 | 1 | -0/+27
  This allows the constant island pass to lower these branches to cbn?z
  instructions, resulting in a shorter instruction sequence.

  Differential Revision: http://reviews.llvm.org/D9183

  llvm-svn: 235638
* ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. | Peter Collingbourne | 2015-04-23 | 1 | -2/+2
  This makes it more likely that we can use the 16-bit push and pop
  instructions on Thumb-2, saving around 4 bytes per function.

  Differential Revision: http://reviews.llvm.org/D9165

  llvm-svn: 235637
* ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. | Peter Collingbourne | 2015-04-23 | 1 | -18/+1
  This appears to have been introduced back in r76698 as part of an
  unrelated change. I can find no official ARM documentation stating that
  Thumb-2 functions require 4-byte alignment; in fact, ARM documentation
  appears to contradict this (see, e.g., ARM Architecture Reference Manual
  Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on
  all instructions.").

  Also remove code that sets alignment for ARM functions, which is
  redundant with code in the MachineFunction constructor, and remove the
  hidden -arm-align-constant-islands flag, which has been enabled by
  default since r146739 (Dec 2011) and has probably received sufficient
  testing by now.

  Differential Revision: http://reviews.llvm.org/D9138

  llvm-svn: 235636
* [Hexagon] Fix compiler warnings in release build | Krzysztof Parzyszek | 2015-04-23 | 2 | -1/+6
  Patch by Aditya Nandakumar.

  llvm-svn: 235635
* [getUnderlyingObjects] Analyze loop PHIs further to remove false positives | Adam Nemet | 2015-04-23 | 2 | -11/+57
  Specifically, if a pointer accesses different underlying objects in each
  iteration, don't look through the phi node defining the pointer.

  The motivating case is the underlying-objects-2.ll testcase. Consider the
  loop nest:

    int **A;
    for (i)
      for (j)
        A[i][j] = A[i-1][j] * B[j]

  This loop is transformed by Load-PRE to stash away A[i] for the next
  iteration of the outer loop:

    Curr = A[0];          // Prev_0
    for (i: 1..N) {
      Prev = Curr;        // Prev = PHI (Prev_0, Curr)
      Curr = A[i];
      for (j: 0..N)
        Curr[j] = Prev[j] * B[j]
    }

  Since A[i] and A[i-1] are likely to be independent pointers,
  getUnderlyingObjects should not assume that Curr and Prev share the same
  underlying object in the inner loop.

  If it did, we would try to dependence-analyze Curr and Prev, and the
  analysis of the corresponding SCEVs would fail with a non-constant
  distance.

  To fix this, the getUnderlyingObjects API is extended with an optional
  LoopInfo parameter. This is effectively what controls whether we want the
  above behavior or the original. Currently, I only changed
  LoopAccessAnalysis to use this approach.

  The other testcase guards the opposite case, where we do want to look
  through the loop PHI. If we step through an array by incrementing a
  pointer, the underlying object is the incoming value of the phi as the
  loop is entered.

  Fixes rdar://problem/19566729

  llvm-svn: 235634
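  The two loop shapes in this message can be written as runnable C++ to show
  that they compute the same result while Curr and Prev point into different
  rows. This is a sketch under stated assumptions: the function names and
  the vector-of-vectors Matrix type are invented for illustration, and each
  row of A is assumed to hold at least B.size() elements. It models the
  source-level shape only, not the IR produced by Load-PRE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Original loop nest: A[i][j] = A[i-1][j] * B[j].
void rowProductDirect(Matrix &A, const std::vector<int> &B) {
  for (std::size_t i = 1; i < A.size(); ++i)
    for (std::size_t j = 0; j < B.size(); ++j)
      A[i][j] = A[i - 1][j] * B[j];
}

// Shape after the transformation: the row pointer loaded in one outer
// iteration is carried into the next one instead of being reloaded.
void rowProductCarried(Matrix &A, const std::vector<int> &B) {
  if (A.empty())
    return;
  int *Curr = A[0].data();  // plays the role of Prev_0
  for (std::size_t i = 1; i < A.size(); ++i) {
    assert(A[i].size() >= B.size() && "rows must cover every column used");
    int *Prev = Curr;       // Prev = PHI(Prev_0, Curr)
    Curr = A[i].data();
    for (std::size_t j = 0; j < B.size(); ++j)
      Curr[j] = Prev[j] * B[j];  // Curr and Prev address different rows
  }
}
```

  Because every row is a separate allocation here, Prev and Curr never
  alias inside the inner loop, which is the situation the commit wants
  getUnderlyingObjects to recognise.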
* [NVPTX] run SeparateConstOffsetFromGEP before SLSR | Jingyue Wu | 2015-04-23 | 1 | -4/+6
  Summary:
  We pick this order because SeparateConstOffsetFromGEP may create more
  opportunities for SLSR.

  Test Plan:
  reassociate-geps-and-slsr.ll
  no performance regression on internal benchmarks

  Reviewers: meheff

  Subscribers: llvm-commits, jholewinski

  Differential Revision: http://reviews.llvm.org/D9230

  llvm-svn: 235632
* R600/SI: Add assembler support for all CI and VI VOP1 instructions | Tom Stellard | 2015-04-23 | 6 | -11/+71
  llvm-svn: 235629
* R600/SI: v_mov_fed_b32 does not exist on VI | Tom Stellard | 2015-04-23 | 1 | -1/+1
  llvm-svn: 235628
* R600/SI: Use a better error message for unsupported instructions in the assembler | Tom Stellard | 2015-04-23 | 1 | -1/+1
  llvm-svn: 235627
* R600/SI: Improve AsmParser support for forced e64 encoding | Tom Stellard | 2015-04-23 | 1 | -5/+45
  We can now force e64 encoding even when the operands would be legal for
  e32 encoding.

  llvm-svn: 235626
* [WinEH] Handle stubs for outlined functions that have only unreached terminators. | Andrew Kaylor | 2015-04-23 | 1 | -9/+16
  llvm-svn: 235618
* Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works" | Reid Kleckner | 2015-04-23 | 4 | -2/+121
  We still have some "uses remain after removal" issues in -O0 builds.

  This reverts commit r235557.

  llvm-svn: 235617
* [PowerPC] Enable printing instructions using aliases | Hal Finkel | 2015-04-23 | 2 | -2/+8
  TableGen had been nicely generating code to print a number of
  instructions using shorter aliases (and PowerPC has plenty of short
  mnemonics), but we were not calling it. For some of the aliases we
  support in the parser, TableGen can't infer the "inverse" alias
  relationship, so there is still more to do.

  Thus, after some hours of updating test cases...

  llvm-svn: 235616
* Move DIContext.h to common DebugInfo location. | Zachary Turner | 2015-04-23 | 4 | -22/+5
  This will enable us to create a PDBContext so as to expose some amount of
  debug info functionality through a common interface.

  Differential Revision: http://reviews.llvm.org/D9205
  Reviewed by: Alexey Samsonov

  llvm-svn: 235612
* Move Value.isDereferenceablePointer to ValueTracking [NFC] | Philip Reames | 2015-04-23 | 7 | -141/+152
  Move isDereferenceablePointer function to Analysis. This function
  recursively tracks dereferenceability over a chain of values like other
  functions in ValueTracking.

  This refactoring is motivated by further changes to support the
  dereferenceable_or_null attribute (http://reviews.llvm.org/D8650).
  isDereferenceablePointer will be extended to perform context-sensitive
  analysis, and IR is not a good place to have such functionality.

  Patch by: Artur Pilipenko <apilipenko@azulsystems.com>

  Differential Revision: reviews.llvm.org/D9075

  llvm-svn: 235611
* [AArch64] Add nvcast patterns for v4f16 and v8f16 | Pirama Arumuga Nainar | 2015-04-23 | 1 | -0/+8
  Summary:
  Constant stores of f16 vectors can create NvCast nodes from various
  operand types to v4f16 or v8f16 depending on patterns in the stored
  constants. This patch adds nvcast rules with v4f16 and v8f16 values.

  AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
  patterns generate the nvcast nodes.

  Reviewers: jmolloy, srhines, ab

  Subscribers: rengolin, aemerson, llvm-commits

  Differential Revision: http://reviews.llvm.org/D9201

  llvm-svn: 235610
* [AArch64] Handle vec4, vec8, vec16 *itofp for half | Pirama Arumuga Nainar | 2015-04-23 | 1 | -0/+10
  Summary:
  Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
  v8i8, v8i16 inputs to allow promotion of v4f16 results.

  Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
  and i64 vectors. Only missing tests are for v16i8 and v16i16 as the shift
  operations are too complicated to write a proper check sequence.

  The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
  is split and the conversion gets handled while lowering v2i64. I am
  adding a test here for completeness.

  Reviewers: aemerson, rengolin, ab, jmolloy, srhines

  Subscribers: rengolin, aemerson, llvm-commits

  Differential Revision: http://reviews.llvm.org/D9166

  llvm-svn: 235609
* Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) | Hans Wennborg | 2015-04-23 | 4 | -765/+904
  Third time's the charm. The previous commit was reverted as a reverse
  for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator
  at the beginning of a vector, causing asserts when using debugging
  iterators. This commit fixes that.

  llvm-svn: 235608
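  The class of bug described here, decrementing an iterator that already
  sits at begin(), can be avoided with a decrement-first loop. The sketch
  below is a generic pattern and not the actual fix applied to
  SelectionDAGBuilder::lowerWorkItem; the function name reversedCopy is
  hypothetical.

```cpp
#include <cassert>
#include <vector>

// Visits the elements of v from last to first. The iterator is decremented
// only after checking that it is not at begin(), so it never leaves the
// valid range [begin(), end()].
std::vector<int> reversedCopy(const std::vector<int> &v) {
  std::vector<int> out;
  for (auto I = v.end(); I != v.begin();) {
    --I;  // safe: I != begin() was just checked
    out.push_back(*I);
  }
  assert(out.size() == v.size());
  return out;
}
```

  A loop of the form `for (I = end() - 1; I >= begin(); I--)` steps I to
  begin() - 1 before the final comparison, which checked (debugging)
  iterators reject. Reverse iterators (rbegin()/rend()) are another common
  way to sidestep the problem.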
* [Hexagon] Shrink-wrap stack frame (Hexagon-specific) | Krzysztof Parzyszek | 2015-04-23 | 3 | -386/+562
  llvm-svn: 235603
* [mips] [IAS] Move NOP emission after pseudo-instruction expansion. NFC. | Toma Tabacu | 2015-04-23 | 1 | -11/+9
  As suggested in the review for http://reviews.llvm.org/D8537.

  llvm-svn: 235601
* Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range. | Aaron Ballman | 2015-04-23 | 4 | -903/+765
  llvm-svn: 235597
* Be more strict about the operand for the array type in BitcodeReader | Filipe Cabecinhas | 2015-04-23 | 1 | -0/+3
  Summary: Bug found with AFL fuzz.

  Reviewers: rafael

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D9016

  llvm-svn: 235596
* Verify sizes when trying to read a BitcodeAbbrevOp | Filipe Cabecinhas | 2015-04-23 | 1 | -0/+9
  Summary:
  Make sure the abbrev operands are valid and that we can read/skip them
  afterwards.

  Bug found with AFL fuzz.

  Reviewers: rafael

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D9030

  llvm-svn: 235595
* [DAGCombiner] Remove extra bitcasts surrounding vector shuffles | Simon Pilgrim | 2015-04-23 | 1 | -0/+45
  Patch to remove extra bitcasts from shuffles. These are often a legacy of
  XformToShuffleWithZero being used to combine bitmaskings (of float
  vectors bitcast to integer vectors) into shuffles:

    bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1)

  Differential Revision: http://reviews.llvm.org/D9097

  llvm-svn: 235578
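  The identity behind this fold can be checked on plain arrays when the
  float and integer lanes have the same width and count. The following is a
  sketch under that assumption; bitcast and shuffle here are small
  hypothetical helpers written for the example, not SelectionDAG APIs, and
  the mask convention (0..3 selects from the first input, 4..7 from the
  second) is chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using F4 = std::array<float, 4>;
using I4 = std::array<std::uint32_t, 4>;

// Reinterprets the bytes of one vector type as another of the same size.
template <typename To, typename From> To bitcast(const From &v) {
  static_assert(sizeof(To) == sizeof(From), "bitcast needs equal sizes");
  To out;
  std::memcpy(&out, &v, sizeof(To));
  return out;
}

// Two-input lane shuffle: mask values 0..3 pick from a, 4..7 pick from b.
template <typename V>
V shuffle(const V &a, const V &b, const std::array<int, 4> &mask) {
  V out{};
  for (std::size_t i = 0; i < 4; ++i) {
    assert(mask[i] >= 0 && mask[i] < 8);
    out[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
  }
  return out;
}
```

  With matching lane widths, shuffling the integer views and casting back
  moves exactly the same bytes as shuffling the float vectors directly, so
  the surrounding bitcasts are redundant. The actual DAG combine may impose
  further conditions that this sketch does not model.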
* Move common loop utility function isInductionPHI into LoopUtils.cpp | Karthik Bhat | 2015-04-23 | 2 | -43/+46
  This patch moves the definition of the common utility function
  "isInductionPHI" to LoopUtils.cpp.
  This fixes a compilation error when configured with
  -DBUILD_SHARED_LIBS=ON.

  llvm-svn: 235577
* Add support to interchange loops with reductions. | Karthik Bhat | 2015-04-23 | 2 | -80/+227
  This patch enables interchanging of tightly nested loops with
  reductions.

  Differential Revision: http://reviews.llvm.org/D8314

  llvm-svn: 235571
* [WinEH] Don't skip landing pads that end with an unreachable instruction. | Andrew Kaylor | 2015-04-23 | 1 | -6/+4
  llvm-svn: 235563
* Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) | Hans Wennborg | 2015-04-22 | 4 | -765/+903
  This is a re-commit of r235101, which also fixes the problems with the
  previous patch:

  - Switches with only a default case and non-fallthrough were handled
    incorrectly
  - The previous patch tickled a bug in PowerPC Early-Return Creation which
    is fixed here.

  > This is a major rewrite of the SelectionDAG switch lowering. The previous code
  > would lower switches as a binary tree, discovering clusters of cases
  > suitable for lowering by jump tables or bit tests as it went along. To increase
  > the likelihood of finding jump tables, the binary tree pivot was selected to
  > maximize case density on both sides of the pivot.
  >
  > By not selecting the pivot in the middle, the binary trees would not always
  > be balanced, leading to performance problems in the generated code.
  >
  > This patch rewrites the lowering to search for clusters of cases
  > suitable for jump tables or bit tests first, and then builds the binary
  > tree around those clusters. This way, the binary tree will always be balanced.
  >
  > This has the added benefit of decoupling the different aspects of the lowering:
  > tree building and jump table or bit tests finding are now easier to tweak
  > separately.
  >
  > For example, this will enable us to balance the tree based on profile info
  > in the future.
  >
  > The algorithm for finding jump tables is quadratic, whereas the previous algorithm
  > was O(n log n) for common cases, and quadratic only in the worst case. This
  > doesn't seem to be a major problem in practice, e.g. compiling a file consisting
  > of a 10k-case switch was only 30% slower, and such large switches should be rare
  > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this
  > does turn out to be a problem, we could limit the search space of the algorithm.
  >
  > This commit also disables all optimizations during switch lowering in -O0.
  >
  > Differential Revision: http://reviews.llvm.org/D8649

  llvm-svn: 235560
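  The idea of finding dense clusters of cases before building the binary
  tree can be illustrated with a simplified greedy search. This is not
  LLVM's algorithm: the Cluster struct, the function names, and the default
  thresholds (at least 4 cases, at least 40% density) are assumptions made
  for this sketch, and it covers only the cluster-finding phase, not bit
  tests or tree construction. Case values are assumed to be sorted and
  unique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Cluster {
  std::size_t first;  // index of the first case in the sorted list
  std::size_t last;   // index of the last case (inclusive)
  bool isJumpTable;   // true if the range is dense enough for a table
};

// True if cases[first..last] fill at least minDensityPercent of their range.
bool isDense(const std::vector<std::int64_t> &cases, std::size_t first,
             std::size_t last, unsigned minDensityPercent) {
  assert(minDensityPercent >= 1 && minDensityPercent <= 100);
  // Unsigned subtraction gives the exact span for sorted input.
  std::uint64_t span = static_cast<std::uint64_t>(cases[last]) -
                       static_cast<std::uint64_t>(cases[first]);
  std::uint64_t count = last - first + 1;
  if (span >= count * 100)  // far too sparse; also avoids overflow below
    return false;
  return count * 100 >= (span + 1) * minDensityPercent;
}

// Greedy, quadratic search: from each start index, take the longest dense
// run; keep it as a jump table if it has enough cases, otherwise emit the
// start case on its own and move on.
std::vector<Cluster> findClusters(const std::vector<std::int64_t> &cases,
                                  std::size_t minCases = 4,
                                  unsigned minDensityPercent = 40) {
  std::vector<Cluster> out;
  std::size_t i = 0;
  while (i < cases.size()) {
    std::size_t best = i;
    for (std::size_t j = i + 1; j < cases.size(); ++j)
      if (isDense(cases, i, j, minDensityPercent))
        best = j;
    if (best - i + 1 >= minCases) {
      out.push_back({i, best, true});
      i = best + 1;
    } else {
      out.push_back({i, i, false});
      ++i;
    }
  }
  return out;
}
```

  Once the clusters are known, each one can be treated as a single leaf, so
  a tree built over the cluster list by repeatedly splitting at the middle
  stays balanced regardless of where the dense regions fall.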
* [InstCombine] Use a more targeted fix instead of r235544 | David Majnemer | 2015-04-22 | 1 | -9/+8
  Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while
  taking advantage of the fact that the sign bit is not set. We do this
  optimization to further shrink the mask, but shrinking the mask isn't
  NSW/NUW preserving in this case.

  llvm-svn: 235558
* [SEH] Remove the old __C_specific_handler code now that WinEHPrepare works | Reid Kleckner | 2015-04-22 | 4 | -121/+2
  This removes the -sehprepare flag and makes __C_specific_handler
  functions always use WinEHPrepare.

  This was tested by building all of chromium_builder_tests and running a
  few tests that use SEH, but if something breaks, we can revert this.

  llvm-svn: 235557
* [RuntimeDyld][COFF] Add external symbol resolution support to RuntimeDyldCOFF. | Lang Hames | 2015-04-22 | 1 | -14/+16
  Patch by Andy Ayers. Thanks Andy!

  llvm-svn: 235554
* [Hexagon] Some cleanup of instruction selection code | Krzysztof Parzyszek | 2015-04-22 | 5 | -800/+709
  llvm-svn: 235552
* [WinEH] Demote values and phis live across exception handlers up front | Reid Kleckner | 2015-04-22 | 1 | -68/+277
  In particular, this handles SSA values that are live *out* of a handler.
  The existing code only handles values that are live *in* to a handler.

  It also handles phi nodes in the block where normal control should resume
  after the end of a catch handler. When EH return points have phi nodes,
  we need to split the return edge. It is impossible for phi elimination to
  emit copies in the previous block if that block gets outlined. The
  indirectbr that we leave in the function is only notional, and is
  eliminated from the MachineFunction CFG early on.

  Reviewers: majnemer, andrew.w.kaylor

  Differential Revision: http://reviews.llvm.org/D9158

  llvm-svn: 235545
* [InstCombine] Clear out nsw/nuw if we modify computation in the chain | David Majnemer | 2015-04-22 | 1 | -3/+10
  An nsw/nuw operation relies on the values feeding into it to not overflow
  if 'poison' is not to be produced. This means that optimizations which
  make modifications to the bottom of a chain (like SimplifyDemandedBits)
  must strip out nsw/nuw if they cannot ensure that they will be preserved.

  This fixes PR23309.

  llvm-svn: 235544
* [Hexagon] Use A2_tfrsi for constant pool and jump table addresses | Krzysztof Parzyszek | 2015-04-22 | 5 | -78/+151
  llvm-svn: 235535
* Revert "[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst" | David Blaikie | 2015-04-22 | 6 | -51/+32
  This reverts commit r235458.

  It looks like this might be breaking something LTO-ish. Looking into it &
  will recommit with a fix/test case/etc once I've got more to go on.

  llvm-svn: 235533
* [AArch64] Use MachineRegisterInfo instead of LiveIntervals to calculate liveness. NFC. | Pete Cooper | 2015-04-22 | 1 | -4/+4
  The CondOpt pass currently uses LiveIntervals to set the dead flag on a
  def. This patch uses MachineRegisterInfo::use_empty instead as that is
  equivalent to the def being dead.

  This removes an instance of LiveIntervals in the pass manager pipeline
  and saves 3.8% of compile time on llc compiled for AArch64.

  Reviewed by Chad Rosier and Zhaoshi.

  llvm-svn: 235532
* don't repeat function names in comments; NFC | Sanjay Patel | 2015-04-22 | 1 | -38/+31
  llvm-svn: 235531
* [Hexagon] Consider constant-extended offsets to be valid | Krzysztof Parzyszek | 2015-04-22 | 2 | -10/+15
  llvm-svn: 235529
* Test commit: fix typo in comment. | Luqman Aden | 2015-04-22 | 1 | -2/+2
  llvm-svn: 235526
* Fix Windows build break: use LLVM_FUNCTION_NAME instead of __func__. | Krzysztof Parzyszek | 2015-04-22 | 2 | -2/+2
  llvm-svn: 235525
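  A portability macro of this kind is typically a thin wrapper that selects
  the compiler's spelling of the current function name. The sketch below
  shows one plausible shape; EXAMPLE_FUNCTION_NAME and currentFunctionName
  are hypothetical names for this example, and the exact definition of
  LLVM_FUNCTION_NAME in the tree at this revision is not shown in the log.

```cpp
#include <cassert>
#include <string>

// Older MSVC versions lacked C99/C++11 __func__ but provided __FUNCTION__.
#if defined(_MSC_VER)
#define EXAMPLE_FUNCTION_NAME __FUNCTION__
#else
#define EXAMPLE_FUNCTION_NAME __func__
#endif

// Returns the unqualified name of this function as the compiler spells it.
std::string currentFunctionName() {
  std::string name = EXAMPLE_FUNCTION_NAME;
  assert(!name.empty());
  return name;
}
```

  Routing every use through a single macro keeps the compiler check in one
  place, so call sites do not need their own #ifdef.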
* R600: Fix always inline pass breaking noinline functions | Matt Arsenault | 2015-04-22 | 1 | -2/+3
  No test since calls are not actually supported yet.

  llvm-svn: 235524
* [Hexagon] Overhaul of stack object allocation | Krzysztof Parzyszek | 2015-04-22 | 10 | -447/+1328
  - Use static allocation for aligned stack objects.
  - Simplify dynamic stack object allocation.
  - Simplify elimination of frame-indices.

  llvm-svn: 235521