path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Revert "[Constant Hoisting] Lazily compute the idom and cache the result." | Juergen Ributzka | 2014-04-03 | 1 | -43/+4
    This code is no longer useful, because we only compute and use the IDom once. There is no benefit in caching it anymore.
    llvm-svn: 205498
* Account for scalarization costs in BasicTTI::getMemoryOpCost for extending vector loads | Hal Finkel | 2014-04-03 | 1 | -2/+24
    When a vector type legalizes to a larger vector type, and the target does not support the associated extending load (or truncating store), then legalization will scalarize the load (or store) resulting in an associated scalarization cost. BasicTTI::getMemoryOpCost needs to account for this.
    Between this, and r205487, PowerPC on the P7 with VSX enabled shows:
      MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup
      SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup
      SingleSource/UnitTests/Vectorizer/gcc-loops: 28% speedup
    (some of these are new; some of these, such as PAQ8p, just reverse regressions that VSX support would trigger)
    llvm-svn: 205495
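The scalarization charge described in this entry can be sketched with a toy cost function. This is only an illustration under stated assumptions: `ToyVectorLoad`, `toyMemoryOpCost`, and the charge of two units per element (one scalar load plus one vector insert) are hypothetical and are not taken from BasicTTI.

```cpp
#include <cassert>

// Toy description of a vector load; none of these names are LLVM APIs.
struct ToyVectorLoad {
  unsigned NumElts;         // elements in the source vector type
  unsigned LegalizedParts;  // registers needed after type legalization
  bool WidenedByLegalizer;  // true if the type legalizes to a larger vector
  bool ExtLoadSupported;    // true if the target has the extending load
};

// Base cost: one unit per legalized register. If the type is widened and the
// extending load is unavailable, assume the load is scalarized and charge one
// scalar load plus one vector insert per element (an assumed weighting).
inline unsigned toyMemoryOpCost(const ToyVectorLoad &L) {
  unsigned Cost = L.LegalizedParts;
  if (L.WidenedByLegalizer && !L.ExtLoadSupported)
    Cost += 2 * L.NumElts;
  return Cost;
}
```

With these assumed weights, a widened four-element load without an extending load costs 9 units instead of 1, which is the kind of gap a vectorizer cost model needs to see.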
* Revert "Fix a nomenclature error in llvm-nm." | Rafael Espindola | 2014-04-03 | 2 | -11/+3
    This reverts commit r205479.
    It turns out that nm does use addresses, it is just that every reasonable relocatable ELF object has sections with address 0. I have no idea if objects with non-zero section addresses exist in reality, but at least it shows that llvm-nm should use the name address.
    The added test includes an unusual .o file with non 0 section addresses. I created it by hacking ELFObjectWriter.cpp.
    Really sorry for the churn.
    llvm-svn: 205493
* [X86] As per suggestion from Craig Topper and Hal Finkel, override | Lang Hames | 2014-04-02 | 2 | -40/+39
    TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather than abusing commuteInstruction.
    Thanks very much for the suggestion guys!
    llvm-svn: 205489
* Fix multi-register costs in BasicTTI::getCastInstrCost | Hal Finkel | 2014-04-02 | 1 | -1/+2
    For a cast (extension, etc.), the current logic predicts a low cost if the associated operation (keyed on the destination type) is legal (or promoted). This is not true when the number of values required to legalize the type is changing. For example, <8 x i16> being sign extended to <8 x i32> is not generically cheap on PPC with VSX, even though sign extension to v4i32 is legal, because two output v4i32 values are required compared to the single v8i16 input value, and without custom logic in the target, this conversion will scalarize.
    llvm-svn: 205487
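A minimal sketch of the rule this entry describes: treat a cast as cheap only when the operation is legal and the register count does not change across the cast. `ToyLegalType`, `toyCastCost`, and the fallback of three units per element (extract, convert, insert) are assumptions made for illustration, not the BasicTTI implementation.

```cpp
#include <cassert>

// Hypothetical summary of how a vector type legalizes; not an LLVM type.
struct ToyLegalType {
  unsigned NumRegs;  // registers needed to hold the legalized value
  unsigned NumElts;  // elements in the original vector type
};

inline unsigned toyCastCost(const ToyLegalType &Src, const ToyLegalType &Dst,
                            bool DstOpLegal) {
  // Cheap only if the operation is legal AND the number of registers is the
  // same on both sides of the cast.
  if (DstOpLegal && Src.NumRegs == Dst.NumRegs)
    return Src.NumRegs;
  // Otherwise assume scalarization: extract, convert, and insert per element.
  return 3 * Dst.NumElts;
}
```

For the example in the message, a v8i16 source (one register) extended to v8i32 (two v4i32 registers) no longer takes the cheap path in this sketch, even though the extension itself is marked legal.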
* [CodeGen] Teach the peephole optimizer to remember (and exploit) all folding | Lang Hames | 2014-04-02 | 1 | -35/+44
    opportunities in the current basic block, rather than just the last one seen.
    <rdar://problem/16478629>
    llvm-svn: 205481
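The difference between remembering only the last folding candidate and remembering all of them can be shown with a toy model. The instruction representation and both counting functions below are hypothetical; they only illustrate the bookkeeping change, not the actual PeepholeOptimizer code.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy instruction: optionally defines a register with a foldable load and
// optionally uses one register. Register 0 means "none".
struct ToyFoldInstr {
  unsigned FoldableLoadDef;
  unsigned Use;
};

// Remember only the most recent foldable load in the block.
inline unsigned countFoldsLastOnly(const std::vector<ToyFoldInstr> &Block) {
  unsigned Last = 0, Folds = 0;
  for (const ToyFoldInstr &I : Block) {
    if (I.Use != 0 && I.Use == Last) {
      ++Folds;
      Last = 0;
    }
    if (I.FoldableLoadDef != 0)
      Last = I.FoldableLoadDef;
  }
  return Folds;
}

// Remember every foldable load seen so far in the block.
inline unsigned countFoldsAll(const std::vector<ToyFoldInstr> &Block) {
  std::set<unsigned> Candidates;
  unsigned Folds = 0;
  for (const ToyFoldInstr &I : Block) {
    if (I.Use != 0 && Candidates.erase(I.Use) == 1)
      ++Folds;
    if (I.FoldableLoadDef != 0)
      Candidates.insert(I.FoldableLoadDef);
  }
  return Folds;
}
```

When two loads are followed by their two users, the single-slot version folds only the second load, while the set-based version folds both.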
* Fix a nomenclature error in llvm-nm. | Rafael Espindola | 2014-04-02 | 2 | -3/+11
    What llvm-nm prints depends on the file format. On ELF for example, if the file is relocatable, it prints offsets. If it is not, it prints addresses.
    Since it doesn't really need to care what it is that it is printing, use the generic term value.
    Fix or implement getSymbolValue to keep llvm-nm working.
    llvm-svn: 205479
* [PowerPC] Make PPCTTI::getMemoryOpCost call BasicTTI::getMemoryOpCost | Hal Finkel | 2014-04-02 | 1 | -3/+3
    PPCTTI::getMemoryOpCost will now make use of BasicTTI::getMemoryOpCost to calculate the base cost of the memory access, and then adjust on top of that. There is no functionality change from this modification, but it will become important so that PPCTTI can take advantage of scalarization information for which BasicTTI::getMemoryOpCost will account in the near future.
    llvm-svn: 205476
* Add comments and test case for [DAG] Keep the opaque constant flag when performing unary constant folding operations (r204737). | Juergen Ributzka | 2014-04-02 | 1 | -1/+5
    llvm-svn: 205474
* [X86] Make the VFMA*231 variants commutable and relax the alignment restrictions | Lang Hames | 2014-04-02 | 2 | -106/+147
    on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load from unaligned memory.
    Testcase to follow, along with related patch.
    <rdar://problem/16478629>
    llvm-svn: 205472
* Revert "Reapply "LTO: add API to set strategy for -internalize"" | Duncan P. N. Exon Smith | 2014-04-02 | 2 | -43/+15
    This reverts commit r199244.
    Conflicts:
      include/llvm-c/lto.h
      include/llvm/LTO/LTOCodeGenerator.h
      lib/LTO/LTOCodeGenerator.cpp
    llvm-svn: 205471
* Add comments and test case for [X86TTI] Make constant base pointers for GetElementPtr opaque (r204739). | Juergen Ributzka | 2014-04-02 | 1 | -0/+3
    llvm-svn: 205468
* ARM: update subtarget information for Windows on ARM | Saleem Abdulrasool | 2014-04-02 | 7 | -16/+86
    Update the subtarget information for Windows on ARM. This enables using the MC layer to target Windows on ARM.
    llvm-svn: 205459
* Make a few more range-based loops use explicit types. | Jim Grosbach | 2014-04-02 | 2 | -2/+2
    No functional change.
    llvm-svn: 205458
* TargetLibraryInfo: Disable memcpy and memset on R600 | Tom Stellard | 2014-04-02 | 1 | -1/+10
    There are no implementations of these for R600.
    llvm-svn: 205455
* Simplify resolveFrameIndex() signature. | Jim Grosbach | 2014-04-02 | 9 | -24/+15
    Just pass a MachineInstr reference rather than an MBB iterator. Creating a MachineInstr& is the first thing every implementation did anyway.
    llvm-svn: 205453
* ARM: cortex-m0 doesn't support unaligned memory access. | Jim Grosbach | 2014-04-02 | 1 | -1/+6
    Unlike other v6+ processors, cortex-m0 never supports unaligned accesses. From the v6m ARM ARM:
    "A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned access occurs."
    rdar://16491560
    llvm-svn: 205452
* Make some range based loop types more explicit. | Jim Grosbach | 2014-04-02 | 2 | -6/+6
    No functional change, but more readable code.
    llvm-svn: 205451
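As an illustration of what spelling out the loop variable type looks like, here is a small standalone example. The functions and the `std::string` element type are made up for this sketch and do not come from the patched LLVM files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// With 'auto', the reader has to look up the container to learn the element
// type, and this particular spelling also copies every element.
inline std::size_t totalLengthAuto(const std::vector<std::string> &Names) {
  std::size_t Total = 0;
  for (auto N : Names)
    Total += N.size();
  return Total;
}

// With an explicit type, the element type and the by-reference access are
// visible at the loop header. The result is the same.
inline std::size_t totalLengthExplicit(const std::vector<std::string> &Names) {
  std::size_t Total = 0;
  for (const std::string &N : Names)
    Total += N.size();
  return Total;
}
```

Both functions compute the same value, which matches the "no functional change" note: the gain is readability at the loop header.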
* [mips] Add more Octeon cnMips instructions | Kai Nacke | 2014-04-02 | 2 | -10/+43
    Adds the instructions ext/ext32/cins/cins32. It also changes pop/dpop to accept the two operand version and adds a simple pattern to generate baddu.
    Tests for the two operand versions (including baddu/dmul/dpop/pop) and the code generation pattern for baddu are included.
    Reviewed by: Daniel.Sanders@imgtec.com
    llvm-svn: 205449
* [C++11,ARM64] Range based for and explicit 'override' in STP cleanup. | Jim Grosbach | 2014-04-02 | 1 | -15/+13
    No functional change intended.
    llvm-svn: 205446
* [C++11,ARM64] Range based for loops in constant promotion. | Jim Grosbach | 2014-04-02 | 1 | -9/+6
    No functional change intended.
    llvm-svn: 205445
* [C++11,ARM64] Range based for loops in load/store pair optimizer. | Jim Grosbach | 2014-04-02 | 1 | -4/+1
    No functional change intended.
    llvm-svn: 205444
* [C++11,ARM64] Range based for loops in target lowering. | Jim Grosbach | 2014-04-02 | 1 | -3/+2
    No functional change intended.
    llvm-svn: 205443
* [C++11,ARM64] Range based for loops in frame lowering. | Jim Grosbach | 2014-04-02 | 1 | -5/+3
    No functional change intended.
    llvm-svn: 205442
* [C++11,ARM64] Range based for loops in pseudo expansion. | Jim Grosbach | 2014-04-02 | 1 | -3/+2
    No functional change intended.
    llvm-svn: 205441
* [C++11,ARM64] Range based for loops for LOH | Jim Grosbach | 2014-04-02 | 1 | -34/+21
    No functional change intended.
    llvm-svn: 205440
* [C++11,ARM64] Range based for loops TLS cleanup. | Jim Grosbach | 2014-04-02 | 1 | -3/+2
    No functional change intended.
    llvm-svn: 205439
* [C++11,ARM64] Range based for loops in branch relaxation. | Jim Grosbach | 2014-04-02 | 1 | -6/+5
    No functional change intended.
    llvm-svn: 205438
* [C++11,ARM64] Range based for loops in address type promotion. | Jim Grosbach | 2014-04-02 | 1 | -34/+25
    No functional change intended.
    llvm-svn: 205437
* [ARM64][CollectLOH] Remove the link to the radar from the comments. | Quentin Colombet | 2014-04-02 | 1 | -3/+0
    llvm-svn: 205435
* ARM: Add support for segmented stacks | Oliver Stannard | 2014-04-02 | 6 | -0/+381
    Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.
    llvm-svn: 205430
* clarify comment | Adrian Prantl | 2014-04-02 | 1 | -1/+2
    llvm-svn: 205429
* ARM64: use GOT for weak symbols & PIC. | Tim Northover | 2014-04-02 | 1 | -6/+23
    Weak symbols cannot use the small code model's usual ADRP sequences since the instruction simply may not be able to encode a value of 0.
    This redirects them to use the GOT, which hopefully linkers are able to cope with even in the static relocation model.
    llvm-svn: 205426
* ARM64: fix lowering of fp128 fptosi/fptoui | Tim Northover | 2014-04-02 | 1 | -1/+6
    We were creating libcall nodes that returned an MVT::f128, when these particular operations actually return an int of some stripe.
    llvm-svn: 205425
* SLPVectorizer: compare entire intrinsic for SLP compatibility. | Tim Northover | 2014-04-02 | 1 | -2/+2
    Some Intrinsics are overloaded to the extent that return type equality (all that's been checked up to now) does not guarantee that the arguments are the same. In these cases SLP vectorizer should not recurse into the operands, which can be achieved by comparing them as "Function *" rather than simply the ID.
    llvm-svn: 205424
* ARM64: make sure first argument to INSERT_SUBVECTOR has right type. | Tim Northover | 2014-04-02 | 1 | -1/+1
    Again, coalescing and other optimisations swiftly made the MachineInstrs consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was produced.
    llvm-svn: 205423
* ARM64: convert fp16 narrowing ISel to pseudo-instruction | Tim Northover | 2014-04-02 | 3 | -13/+14
    The previous attempt was fine with optimisations, but was actually rather cavalier with its types. When compiled at -O0, it produced invalid COPY MachineInstrs.
    llvm-svn: 205422
* Mark FPB as a reserved register when needed. | Job Noorman | 2014-04-02 | 1 | -1/+3
    llvm-svn: 205421
* Work around gold bug http://sourceware.org/PR16794. | Rafael Espindola | 2014-04-02 | 1 | -0/+5
    llvm-svn: 205416
* Remove duplicated DMB instructions | Renato Golin | 2014-04-02 | 4 | -0/+104
    ARM specific optimization, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them.
    Patch by Reinoud Elhorst.
    llvm-svn: 205409
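The idea of dropping the second of two back-to-back barriers can be sketched on a toy instruction list. `ToyBarrierInstr` and `removeDuplicateDmbs` are hypothetical names, and the rule used here (only directly adjacent barriers with the same option are merged) is an assumption for illustration, not a description of the actual ARM pass.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction: an opcode name plus a barrier option such as "ish".
struct ToyBarrierInstr {
  std::string Opcode;
  std::string Option;
};

// Copy the instruction stream, skipping a dmb that directly follows another
// dmb with the same option.
inline std::vector<ToyBarrierInstr>
removeDuplicateDmbs(const std::vector<ToyBarrierInstr> &In) {
  std::vector<ToyBarrierInstr> Out;
  for (const ToyBarrierInstr &I : In) {
    bool IsRepeat = I.Opcode == "dmb" && !Out.empty() &&
                    Out.back().Opcode == "dmb" &&
                    Out.back().Option == I.Option;
    if (!IsRepeat)
      Out.push_back(I);
  }
  return Out;
}
```

A barrier separated from the previous one by any other instruction is kept, since the sketch makes no attempt to reason about what lies in between.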
* Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU() | Yaron Keren | 2014-04-02 | 2 | -10/+24
    and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the four Windows environments in Triple.h.
    Suggestion by Saleem Abdulrasool!
    llvm-svn: 205393
* [LoopVectorizer] Count dependencies of consecutive pointers as uniforms | Hal Finkel | 2014-04-02 | 1 | -0/+10
    For the purpose of calculating the cost of the loop at various vectorization factors, we need to count dependencies of consecutive pointers as uniforms (which means that the VF = 1 cost is used for all overall VF values).
    For example, the TSVC benchmark function s173 has:
      ...
      %3 = add nsw i64 %indvars.iv, 16000
      %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3
      ...
    and we must realize that the add will be a scalar in order to correctly deduce it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all dependencies of a consecutive pointer must be a scalar (uniform), and so we simply need to add all consecutive pointers to the worklist that currently collects uniforms.
    Fixes PR19296.
    llvm-svn: 205387
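A worklist that starts from the consecutive pointers and pulls in their dependencies can be sketched as follows. The graph encoding, `allUsersUniform`, and `collectUniforms` are hypothetical and heavily simplified; in particular, the rule that an operand joins the set only when all of its users are already in it is an assumption of this sketch, not a statement about the LoopVectorizer source.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy value graph: G[V] lists the values that V reads (its operands).
using ToyGraph = std::vector<std::vector<unsigned>>;

// True if every user of Def is already in Uniforms.
inline bool allUsersUniform(const ToyGraph &G, unsigned Def,
                            const std::set<unsigned> &Uniforms) {
  for (unsigned User = 0; User < G.size(); ++User)
    for (unsigned Op : G[User])
      if (Op == Def && Uniforms.count(User) == 0)
        return false;
  return true;
}

// Seed the worklist (here: with consecutive pointers) and walk operands,
// adding those whose users are all uniform.
inline std::set<unsigned> collectUniforms(const ToyGraph &G,
                                          const std::vector<unsigned> &Seeds) {
  std::set<unsigned> Uniforms(Seeds.begin(), Seeds.end());
  std::vector<unsigned> Worklist(Seeds.begin(), Seeds.end());
  while (!Worklist.empty()) {
    unsigned V = Worklist.back();
    Worklist.pop_back();
    for (unsigned Op : G[V])
      if (Uniforms.count(Op) == 0 && allUsersUniform(G, Op, Uniforms)) {
        Uniforms.insert(Op);
        Worklist.push_back(Op);
      }
  }
  return Uniforms;
}
```

In a graph shaped like the s173 snippet, seeding with the getelementptr pulls in the add that feeds it, while an induction variable that also feeds other arithmetic stays out of the set.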
* Adjust comments regarding non-relocated abbrev offset in debug_info.dwo | David Blaikie | 2014-04-02 | 2 | -2/+4
    I'm not sure the comment in the implementation really adds a lot of value (it's clear that we emit zero when no symbol is provided, but it doesn't explain why we would do that). Happy to iterate.
    llvm-svn: 205386
* Split debug_loc and debug_loc.dwo emission into two separate functions | David Blaikie | 2014-04-02 | 2 | -21/+32
    Based on code review feedback from Eric Christopher on r204697
    llvm-svn: 205385
* DebugInfo: Introduce DebugLocList to encapsulate a list of DebugLocEntries and an MC Label to refer to them | David Blaikie | 2014-04-02 | 5 | -12/+39
    This removes the magic-number-esque code creating/retrieving the same label for a debug_loc entry from two places and removes the last small piece of reusable logic from emitDebugLoc so that there will be less duplication when refactoring it into two functions (one for debug_loc, the other for debug_loc.dwo).
    llvm-svn: 205382
* [ARM64][CollectLOH] Add some comments to explain how the LOHs | Quentin Colombet | 2014-04-02 | 2 | -1/+60
    framework works (for the compiler part), since the design document is not available.
    llvm-svn: 205379
* Add a doxygen comment to DebugLocEntry::Merge. | Adrian Prantl | 2014-04-01 | 1 | -0/+3
    llvm-svn: 205374
* DebugLocEntry: Actually merge the loc entry when returning true. | David Blaikie | 2014-04-01 | 1 | -1/+5
    Seems we didn't have any test coverage for merging... awesome. So I added some - but hit an llvm-objdump bug while I was there. I'm choosing not to shave that yak right now.
    Code review feedback/bug catch by Adrian Prantl in r205360.
    llvm-svn: 205373
* Fix accidental fallthrough in DebugLocEntry::hasSameValueOrLocation | David Blaikie | 2014-04-01 | 1 | -5/+10
    No test case (this would invoke UB by examining uninitialized members, etc, at best - and this code is apparently untested anyway - I'm about to fix that)
    Code review feedback from Adrian Prantl on r205360.
    llvm-svn: 205367
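The class of bug named here can be illustrated with a tagged value whose comparison switches on the tag. `ToyKind`, `ToyEntry`, and `hasSameValue` are invented for this sketch and do not mirror the real DebugLocEntry layout. Each case returns, so no case can fall through into a comparison of a member that is not meaningful for that kind.

```cpp
#include <cassert>

enum class ToyKind { Location, Int, Float };

// Only the member matching Kind is meaningful for a given entry.
struct ToyEntry {
  ToyKind Kind;
  long IntValue;
  double FloatValue;
  unsigned LocReg;
};

inline bool hasSameValue(const ToyEntry &A, const ToyEntry &B) {
  if (A.Kind != B.Kind)
    return false;
  switch (A.Kind) {
  case ToyKind::Location:
    return A.LocReg == B.LocReg;          // return, do not fall through
  case ToyKind::Int:
    return A.IntValue == B.IntValue;      // return, do not fall through
  case ToyKind::Float:
    return A.FloatValue == B.FloatValue;
  }
  return false; // not reached for valid kinds
}
```

If the `Int` case merely assigned a result and then fell into the `Float` case, two equal integer entries would be compared by a float member that may never have been set.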
* Remove unused function DebugLocEntry::isEmpty | David Blaikie | 2014-04-01 | 1 | -3/+0
    llvm-svn: 205365