summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* less indent; NFCISanjay Patel2015-11-101-46/+47
| | | | llvm-svn: 252643
* [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()Sanjay Patel2015-11-102-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz r0, r0 bx lr cttz: rbit r0, r0 clz r0, r0 bx lr Instead of: ctlz: cmp r0, #0 moveq r0, #32 clzne r0, r0 bx lr cttz: cmp r0, #0 moveq r0, #32 rbitne r0, r0 clzne r0, r0 bx lr This will help solve a general speculation/despeculation problem noted in PR24818: https://llvm.org/bugs/show_bug.cgi?id=24818 Differential Revision: http://reviews.llvm.org/D14469 llvm-svn: 252639
* LegalizeDAG: Implement promote for scalar_to_vectorMatt Arsenault2015-11-101-0/+28
| | | | | | | | | | | This allows avoiding the default Expand behavior which introduces stack usage. Bitcast the scalar and replace the missing elements with undef. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252632
* LegalizeDAG: Implement promote for insert_vector_eltMatt Arsenault2015-11-101-1/+52
| | | | | | | This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252631
* LegalizeDAG: Implement promote for extract_vector_eltMatt Arsenault2015-11-101-4/+58
| | | | | | | | | | This is for AMDGPU to implement v2i64 extract as extract of half of a v4i32. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252630
* [ValueTracking] Recognize that and(x, add (x, -1)) clears the low bitPhilip Reames2015-11-101-0/+16
| | | | | | | | | | This is a cleaned up version of a patch by John Regehr with permission. Originally found via the souper tool. If we add an odd number to x, then bitwise-and the result with x, we know that the low bit of the result must be zero. Either it was zero in x originally, or the add cleared it in the temporary value. As a result, one of the two values anded together must have the bit cleared. Differential Revision: http://reviews.llvm.org/D14315 llvm-svn: 252629
* [ThinLTO] Update comment per change in WeakAny handling (NFC)Teresa Johnson2015-11-101-1/+3
| | | | llvm-svn: 252627
* [ThinLTO] WeakAny fixes/cleanupTeresa Johnson2015-11-101-17/+18
| | | | | | | | | | | | | | | | | | | | | | | Ensure WeakAny variables are imported as ExternalWeak declarations. To handle WeakAny more consistently and fix this issue: 1) Update helper doImportAsDefinition to properly flag WeakAny variables and aliases as not importing defintions. Update callers of doImportAsDefinition to remove now redundant checks for WeakAny aliases, or ignore aliases, as appropriate. 2) Add any !doImportAsDefinition GVs to DoNotLinkFromSource set during linking of the GV prototype, where we usually add GVs to the DoNotLinkFromSource set for other reasons. Remove now unnecessary adding of WeakAny aliases to DoNotLinkFromSource set from copyGlobalAliasProto. Remove now unnecessary guard against linking non-imported function bodies from ModuleLinker::run. llvm-svn: 252626
* [AArch64] add overrides for isCheapToSpeculateCttz() and ↵Sanjay Patel2015-11-101-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | isCheapToSpeculateCtlz() AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any AArch64 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz w0, w0 ret cttz: rbit w8, w0 clz w0, w8 ret Instead of: ctlz: cbz w0, .LBB0_2 clz w0, w0 ret .LBB0_2: orr w0, wzr, #0x20 ret cttz: cbz w0, .LBB1_2 rbit w8, w0 clz w0, w8 ret .LBB1_2: orr w0, wzr, #0x20 ret See D14469 for the larger motivation. Differential Revision: http://reviews.llvm.org/D14505 llvm-svn: 252625
* Revert "Strip metadata when speculatively hoisting instructions"Renato Golin2015-11-103-16/+0
| | | | | | | This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as well as some x86, et al. llvm-svn: 252623
* [X86] Do not try to custom-lower sitofp/fptosi in soft-float modeMichael Kuperstein2015-11-101-11/+17
| | | | | | Differential Revision: http://reviews.llvm.org/D14495 llvm-svn: 252621
* Fix asan warning (NFC)Xinliang David Li2015-11-101-2/+3
| | | | llvm-svn: 252617
* add 'MustReduceDepth' as an objective/cost-metric for the MachineCombinerSanjay Patel2015-11-101-29/+53
| | | | | | | | | | | | | | | | | | | | | | This is one of the problems noted in PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016 and: http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html The spilling problem is independent and not addressed by this patch. The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. This is caused by inclusion of the "slack" factor when calculating the critical path of the original code sequence. If we don't add that, then we have a more conservative cost comparison of the old code sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64 MULADD patterns because benchmark regressions were observed without that. The two failing test cases now have identical asm that does what we want: a + b + c + d ---> (a + b) + (c + d) Differential Revision: http://reviews.llvm.org/D13417 llvm-svn: 252616
* Reapply "[ARM] Combine CMOV into BFI where possible"James Molloy2015-11-102-0/+113
| | | | | | | | | | | | | | | | | | Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh! Original commit message: If we have a CMOV, OR and AND combination such as: if (x & CN) y |= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252606
* Strip metadata when speculatively hoisting instructionsIgor Laevsky2015-11-103-0/+16
| | | | | | | | | | | | | | | | This is fix for PR24059. When we are hoisting instruction above some condition it may turn out that metadata on this instruction was control dependant on the condition. This metadata becomes invalid and we need to drop it. This patch should cover most obvious places of speculative execution (which I have found by greping isSafeToSpeculativelyExecute). I think there are more cases but at least this change covers the severe ones. Differential Revision: http://reviews.llvm.org/D14398 llvm-svn: 252604
* [PowerPC] Remove redundant code.Tilmann Scheller2015-11-101-3/+2
| | | | | | | | The local variable Hi is never being read. Issue identified by the Clang static analyzer. llvm-svn: 252600
* [AArch64] Fix halfword load merging for big-endian targetsOliver Stannard2015-11-101-3/+9
| | | | | | | | | | | | For big-endian targets, when we merge two halfword loads into a word load, the order of the halfwords in the loaded value is reversed compared to little-endian, so the load-store optimiser needs to swap the destination registers. This does not affect merging of two word loads, as we use ldp, which treats the memory as two separate 32-bit words. llvm-svn: 252597
* Inliner: Do zero-cost inlines even if above a negative threshold (PR24851)Hans Wennborg2015-11-101-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D14499 llvm-svn: 252595
* AVX512 : Implemented encoding and DAG lowering for VMOVHPS/PD and VMOVLPS/PD ↵Igor Breger2015-11-102-5/+119
| | | | | | | | instructions. Differential Revision: http://reviews.llvm.org/D14492 llvm-svn: 252592
* Remove another variable unused in -Asserts buildDavid Blaikie2015-11-101-2/+2
| | | | llvm-svn: 252582
* Remove some unused variables to clean up the -Werror buildDavid Blaikie2015-11-102-4/+4
| | | | llvm-svn: 252580
* [Hexagon] Adding instruction aliases and tests.Colin LeMahieu2015-11-102-0/+464
| | | | llvm-svn: 252579
* Support for emitting inline stack probesAndy Ayers2015-11-104-30/+323
| | | | | | | | | | | | | | | | | | For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages between the current stack limit and the desired new stack pointer location. This implements support for the inline expansion on x64. For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications that arise when introducing new machine basic blocks during prolog and epilog creation. Added a new test case, modified an existing one to exclude non-x64 coreclr (for now). Add test case Fix tests llvm-svn: 252578
* [Hexagon] Fixing compound register printing and reenabling more tests.Colin LeMahieu2015-11-102-10/+33
| | | | llvm-svn: 252574
* AArch64: add experimental support for address tagging.Tim Northover2015-11-103-5/+64
| | | | | | | | | | | | | AArch64 has the ability to use the top 8-bits of an "address" for extra information, with the memory subsystem automatically masking them off for loads and stores. When that's happening, we can sometimes skip masks on memory operations in the compiler. However, this requires the host OS and support stack to preserve those bits so it can't be enabled everywhere. In principle iOS 8.0 and above do take the required precautions and but we'll put it under a flag for now. llvm-svn: 252573
* Fix llvm-nm(1) printing of llvm-bitcode files for -format darwin to match ↵Kevin Enderby2015-11-101-0/+6
| | | | | | | | darwin’s nm(1). Also a small fix to match printing of Mach-O objects with -format posix. llvm-svn: 252567
* [WebAssembly] Support 'unreachable' expressionDerek Schuff2015-11-103-2/+15
| | | | | | | | | | | | | | | Lower LLVM's 'unreachable' terminator to ISD::TRAP, and lower ISD::TRAP to wasm's 'unreachable' expression. WebAssembly type-checks expressions, but a noreturn function with a return type that doesn't match the context will cause a check failure. So we lower LLVM 'unreachable' to ISD::TRAP and then lower that to WebAssembly's 'unreachable' expression, which typechecks in any context and causes a trap if executed. Differential Revision: http://reviews.llvm.org/D14515 llvm-svn: 252566
* Remove unnecessary call to getAllocatableRegClassMatt Arsenault2015-11-101-4/+8
| | | | | | | | | | | | | | | | | | | | | | I'm not sure what the point of this was. I'm not sure why you would ever define an instruction that produces an unallocatable register class. No tests fail with this removed, and it seems like it should be a verifier error to define such an instruction. This was problematic for AMDGPU because it would make bad decisions by arbitrarily changing the register class when unsetting isAllocatable for VS_32/VS_64, which is currently set as a workaround to this problem. AMDGPU uses the VS_32/VS_64 register classes to represent operands which can use either VGPRs or SGPRs. When isAllocatable is unset for these, this would need to pick either the SGPR or VGPR class and insert either a copy we don't want, or an illegal copy we would need to deal with later. A semi-arbitrary register class ordering decision is made in tablegen, which resulted in always picking a VGPR class because it happens to have more registers than the SGPR register class. We really just want to use whatever register class the original register had. llvm-svn: 252565
* [PGO] Make indexed value profile data more compactXinliang David Li2015-11-103-84/+248
| | | | | | | | | | | | | | | | - Make indexed value profile data more compact by peeling out the per-site value count field into its own smaller sized array. - Introduced formal data structure definitions to specify value profile data layout in indexed format. Previously the layout of the data is only assumed in the client code (scattered in three different places : size computation, EmitData, and ReadData - The new data structure serves as a central place for layout documentation. - Add interfaces to force BE output for value profile data (testing purpose) - Add byte swap unit tests Differential Revision: http://reviews.llvm.org/D14401 llvm-svn: 252563
* [Hexagon] Fixing store instructions and reenabling a few more tests.Colin LeMahieu2015-11-102-25/+20
| | | | llvm-svn: 252561
* [ARM] Handle t2ADDri in ARMAsmPrinter::EmitUnwindingInstruction.Akira Hatanaka2015-11-101-0/+1
| | | | | | | | | | | | | This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where llvm_unreachable was reached because t2ADDri wasn't handled. Test case provided by Tim Northover. rdar://problem/23270609 http://reviews.llvm.org/D14518 llvm-svn: 252557
* [Hexagon] Fixing load instruction parsing and reenabling tests.Colin LeMahieu2015-11-104-14/+16
| | | | llvm-svn: 252555
* MachineVerifier: Streamline live interval related error reportingMatthias Braun2015-11-091-90/+93
| | | | | | | | | Simply perform additional report_context() calls after a report() instead of adding more and more overloaded variations of report(). Also improve several instances where information was output in an ad-hoc way probably because no matching report() overload was available. llvm-svn: 252552
* MachineVerifier: Add missing linebreakMatthias Braun2015-11-091-0/+1
| | | | | | | MachineInstr::print() with SkipOppers==true does not produce a linebreak, so we have to do that in MachineVerifier::report(). llvm-svn: 252551
* MachineVerifier: MI::print has no TargetMachine overloadMatthias Braun2015-11-091-1/+1
| | | | | | | The code was passing a target machine pointer which degraded to a true operand to SkipOppers. llvm-svn: 252550
* MachineVerifier: print list of live intervals if availableMatthias Braun2015-11-091-1/+4
| | | | llvm-svn: 252549
* [WinEH] Remove isBarrier from instructions that do not returnReid Kleckner2015-11-091-2/+2
| | | | | | Fixes machine verification failures with David's latest EH change. llvm-svn: 252541
* add a SelectionDAG method to check if no common bits are set in two nodes; NFCISanjay Patel2015-11-093-33/+18
| | | | | | | | | | | | | | | This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539
* [TargetLibraryInfo] Add support for fls, flsl, flsll.Davide Italiano2015-11-091-0/+10
| | | | | | | | | This is a prerequisite for further optimisations of these functions, which will be commited as a separate patch. Differential Revision: http://reviews.llvm.org/D14219 llvm-svn: 252535
* [libFuzzer] make libFuzzer link if there is no sanitizer coverage ↵Kostya Serebryany2015-11-094-0/+50
| | | | | | instrumentation (it will fail at start-up time) llvm-svn: 252533
* Combine ifdefs around dl_iterate_phdr in Unix/Signals.incReid Kleckner2015-11-091-13/+8
| | | | | | | This avoids the need to have two dummy implementations of findModulesAndOffsets. llvm-svn: 252531
* [WinEH] Don't emit CATCHRET from visitCatchPadDavid Majnemer2015-11-094-22/+32
| | | | | | | Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528
* [x86] try harder to match bitwise 'or' into an LEASanjay Patel2015-11-091-11/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The motivation for this patch starts with the epic fail example in PR18007: https://llvm.org/bugs/show_bug.cgi?id=18007 ...unfortunately, this patch makes no difference for that case, but it solves some simpler cases. We'll get there some day. :) The current 'or' matching code was using computeKnownBits() via isBaseWithConstantOffset() -> MaskedValueIsZero(), but that's an unnecessarily limited use. We can do more by copying the logic in ValueTracking's haveNoCommonBitsSet(), so we can treat the 'or' as if it was an 'add'. There's a TODO comment here because we should lift the bit-checking logic into a helper function, so it's not duplicated in DAGCombiner. An example of the better LEA matching: leal (%rdi,%rdi), %eax andl $1, %esi orl %esi, %eax Becomes: andl $1, %esi leal (%rsi,%rdi,2), %eax Differential Revision: http://reviews.llvm.org/D13956 llvm-svn: 252515
* [Hexagon] Separating statement to match what clang-format would do.Colin LeMahieu2015-11-091-2/+4
| | | | llvm-svn: 252513
* [WinEH] Tweak funclet prologue/epilogue insertion to pass verifierReid Kleckner2015-11-092-6/+12
| | | | | | | | | | For some reason we'd never run MachineVerifier on WinEH code, and you explicitly have to ask for it with llc. I added it to a few test cases to get some coverage. Fixes PR25461. llvm-svn: 252512
* [WinEH] Re-committing r252249 (Clone funclets with multiple parents) with ↵Andrew Kaylor2015-11-091-34/+1005
| | | | | | | | additional fixes for determinism problems Differential Revision: http://reviews.llvm.org/D14454 llvm-svn: 252508
* [Hexagon] Fix -Wmicrosoft-enum-value warning with explicit enum typeReid Kleckner2015-11-091-1/+1
| | | | llvm-svn: 252505
* don't repeat function names in comments; NFCSanjay Patel2015-11-091-19/+17
| | | | llvm-svn: 252502
* Moving FileManager::removeDotPaths to llvm::sys::path::remove_dotsMike Aizatsky2015-11-091-0/+35
| | | | | | Differential Revision: http://reviews.llvm.org/D14393 llvm-svn: 252499
* [sanitizer] Use same shadow offset for ASAN on aarch64Adhemerval Zanella2015-11-091-15/+2
| | | | | | | | | This patch makes ASAN for aarch64 use the same shadow offset for all currently supported VMAs (39 and 42 bits). The shadow offset is the same for 39-bit (36). Similar to ppc64 port, aarch64 transformation also requires to use an add instead of 'or' for 42-bit VMA. llvm-svn: 252495
OpenPOWER on IntegriCloud