path: root/llvm/lib
Commit message  Author  Age  Files  Lines
...
* NVPTX: Remove unused globals.Benjamin Kramer2013-10-271-7/+3
| | | | llvm-svn: 193500
* Hexagon: Remove global state.Benjamin Kramer2013-10-271-10/+25
| | | | llvm-svn: 193499
* AVX-512: PMIN/PMAX intrinsics and patternsElena Demikhovsky2013-10-272-1/+45
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193497
* Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.Shuxin Yang2013-10-279-27/+2
| | | | | | llvm-svn: 193489
* Quick look-up for block in loop.Wan Xiaofei2013-10-263-33/+11
    This patch implements quick look-up for a block in a loop by maintaining a
    hash set of blocks. It improves the efficiency of loop analysis
    considerably; the largest improvement is 5-6% (458.sjeng). Below are the
    llc compilation times for our benchmarks before and after the patch.

    Benchmark      llc - trunk              llc - patched
    401.bzip2       0.339081   100.00%       0.329657   102.86%
    403.gcc        19.853966   100.00%      19.605466   101.27%
    429.mcf         0.049823   100.00%       0.048451   102.83%
    433.milc        0.514898   100.00%       0.510217   100.92%
    444.namd        1.109328   100.00%       1.103481   100.53%
    445.gobmk       4.988028   100.00%       4.929114   101.20%
    456.hmmer       0.843871   100.00%       0.825865   102.18%
    458.sjeng       0.754238   100.00%       0.714095   105.62%
    464.h264ref     2.9668     100.00%       2.90612    102.09%
    471.omnetpp     4.556533   100.00%       4.511886   100.99%
    bitmnp01        0.038168   100.00%       0.0357     106.91%
    idctrn01        0.037745   100.00%       0.037332   101.11%
    libquake2       3.78689    100.00%       3.76209    100.66%
    libquake_       2.251525   100.00%       2.234104   100.78%
    linpack         0.033159   100.00%       0.032788   101.13%
    matrix01        0.045319   100.00%       0.043497   104.19%
    nbench          0.333161   100.00%       0.329799   101.02%
    tblook01        0.017863   100.00%       0.017666   101.12%
    ttsprk01        0.054337   100.00%       0.053057   102.41%

    Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
    Approver : Andrew Trick <atrick@apple.com>
    Test : Pass make check-all & llvm test-suite

    llvm-svn: 193460
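    The idea behind the patch above can be sketched as follows. This is only an
    illustration, not the LLVM implementation: `Block` and `LoopSketch` are
    hypothetical stand-ins, and the assumption is that the loop keeps its
    ordered block list and mirrors it in a hash set used for membership queries.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a basic block; not LLVM's BasicBlock.
struct Block {
  int Id;
};

// Hypothetical loop record: an ordered block list plus a hash-set mirror.
class LoopSketch {
  std::vector<const Block *> Blocks;          // preserves insertion order
  std::unordered_set<const Block *> BlockSet; // answers membership queries

public:
  // Adds a block once; duplicates are ignored.
  void addBlock(const Block *B) {
    if (BlockSet.insert(B).second) // .second is true only for a new element
      Blocks.push_back(B);
  }

  // Linear scan: what a list-only representation has to do.
  bool containsSlow(const Block *B) const {
    return std::find(Blocks.begin(), Blocks.end(), B) != Blocks.end();
  }

  // Expected constant-time look-up through the hash set.
  bool contains(const Block *B) const { return BlockSet.count(B) != 0; }

  std::size_t size() const { return Blocks.size(); }
};
```

    Both queries return the same answer; the hash set trades a little extra
    memory for look-ups that do not grow with the number of blocks in the loop.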
* Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop.Andrew Trick2013-10-253-14/+48
| | | | | | | | | | | | Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would have to guarantee that the chained AddRecs are in a perfectly reduced form. llvm-svn: 193438
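    To see why scaling by the stride fails for a quadratic recurrence, here is a
    small sketch. It is not SCEV code: `QuadRec` and the three functions are
    hypothetical, and the arithmetic ignores bit widths and overflow. It models
    a chained recurrence {Start,+,Step,+,StepInc}, whose value after N
    iterations is Start + Step*C(N,1) + StepInc*C(N,2).

```cpp
#include <cstdint>

// Hypothetical model of a quadratic chained recurrence
// {Start,+,Step,+,StepInc}; not an LLVM type.
struct QuadRec {
  std::int64_t Start, Step, StepInc;
};

// Reference semantics: run the loop and let the step itself grow.
std::int64_t evalByIteration(const QuadRec &R, std::uint64_t N) {
  std::int64_t Value = R.Start, Step = R.Step;
  for (std::uint64_t I = 0; I < N; ++I) {
    Value += Step;
    Step += R.StepInc;
  }
  return Value;
}

// Closed form using binomial coefficients C(N,1) = N and C(N,2) = N(N-1)/2.
std::int64_t evalClosedForm(const QuadRec &R, std::uint64_t N) {
  const std::int64_t n = static_cast<std::int64_t>(N);
  const std::int64_t C2 = n * (n - 1) / 2;
  return R.Start + R.Step * n + R.StepInc * C2;
}

// Scaling by the stride alone: only correct when StepInc is zero
// (or for N <= 1, where the quadratic term vanishes).
std::int64_t evalScaledByStride(const QuadRec &R, std::uint64_t N) {
  return R.Start + R.Step * static_cast<std::int64_t>(N);
}
```

    For {1,+,2,+,3} at N = 4, iterating and the closed form both give 27, while
    stride scaling gives 9.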
* Fix LSR: don't normalize quadratic recurrences.Andrew Trick2013-10-251-5/+13
| | | | | | | | | | Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) ScalarEvolutionNormalization was attempting to normalize by adding and subtracting strides. Chained recurrences don't work that way. llvm-svn: 193437
* Handle calls and invokes in GlobalStatus.Rafael Espindola2013-10-251-0/+5
| | | | | | | | | This patch teaches GlobalStatus to analyze a call that uses the global value as a callee, not as an argument. With this change, internalize can handle the common use of linkonce_odr functions. This reduces the number of linkonce_odr functions in an LTO build of clang (checked with the emit-llvm gold plugin option) from 1730 to 60. llvm-svn: 193436
* LoopVectorizer: Don't attempt to vectorize extractelement instructionsHal Finkel2013-10-251-2/+3
    The loop vectorizer does not currently understand how to vectorize
    extractelement instructions. The existing check, which excluded all
    vector-valued instructions, did not catch extractelement instructions
    because it checked only the return value. As a result, vectorization would
    proceed, producing illegal instructions like this:

      %58 = extractelement <2 x i32> %15, i32 0
      %59 = extractelement i32 %58, i32 0

    where the second extractelement is illegal because its first operand is not
    a vector.

    llvm-svn: 193434
* DIEHash: Summary hashing of member functionsDavid Blaikie2013-10-251-1/+1
| | | | llvm-svn: 193432
* Change MemoryBuffer::getFile to take a Twine.Rafael Espindola2013-10-251-8/+12
| | | | llvm-svn: 193429
* DIEHash: Summary hashing of nested typesDavid Blaikie2013-10-252-1/+26
| | | | llvm-svn: 193427
* [X86][AVX512] Add patterns that match the AVX512 floating point register vbroadcast intrinsics.Quentin Colombet2013-10-251-0/+5
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193422
* [X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast intrinsics.Quentin Colombet2013-10-251-0/+5
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193421
* Call destroy from ~BasicCallGraph.Rafael Espindola2013-10-251-0/+4
| | | | | | | | | | This fixes a memory leak found by valgrind. Calling it from the base class destructor would not destroy the BasicCallGraph bits. FIXME: BasicCallGraph is the only thing that inherits from CallGraph. Can we merge the two? llvm-svn: 193412
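    The C++ rule behind this commit is that a virtual call made from a base
    class destructor dispatches to the base class version, because the derived
    part of the object has already been destroyed. The sketch below uses
    hypothetical types (`GraphBase`, `LeakyDerived`, `FixedDerived`) and a log
    vector to make the call order visible; it is not the CallGraph code.

```cpp
#include <string>
#include <vector>

// Hypothetical base class; stands in for a class with a virtual destroy().
struct GraphBase {
  explicit GraphBase(std::vector<std::string> &L) : Log(L) {}
  // Inside this destructor the dynamic type is GraphBase, so this call
  // always runs GraphBase::destroy, never an override.
  virtual ~GraphBase() { destroy(); }
  virtual void destroy() { Log.push_back("base destroy"); }

protected:
  std::vector<std::string> &Log; // records which destroy() bodies ran
};

// Relies on the base destructor: its own destroy() is never reached.
struct LeakyDerived : GraphBase {
  using GraphBase::GraphBase;
  void destroy() override { Log.push_back("derived destroy"); }
};

// Calls destroy() from its own destructor, so the derived clean-up runs
// before the base destructor takes over.
struct FixedDerived : GraphBase {
  using GraphBase::GraphBase;
  ~FixedDerived() override { destroy(); }
  void destroy() override { Log.push_back("derived destroy"); }
};
```

    Destroying a `LeakyDerived` logs only "base destroy"; destroying a
    `FixedDerived` logs "derived destroy" followed by "base destroy".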
* ARM: allow .thumb_func to be separated from symbol definitionTim Northover2013-10-252-17/+20
| | | | | | | | | | When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403
* The FIXME was indeed fixed in the linker, comment removed.Yaron Keren2013-10-251-4/+0
| | | | llvm-svn: 193402
* ARM: don't expand atomicrmw inline on Cortex-M0Tim Northover2013-10-252-9/+13
| | | | | | | | | | There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399
* LegalizeDAG: allow libcalls for max/min atomic operationsTim Northover2013-10-252-0/+60
| | | | | | | | | | | ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions. The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable. llvm-svn: 193398
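    The alternative mentioned above, expanding an atomic max in terms of
    cmpxchg, can be sketched in portable C++ with `std::atomic`. This
    illustrates the general technique only; it is not the code any LLVM backend
    emits, and `atomicFetchMax` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <atomic>

// Atomically stores max(current, Operand) into Target and returns the value
// that was there before, using a compare-exchange retry loop.
int atomicFetchMax(std::atomic<int> &Target, int Operand) {
  int Old = Target.load();
  // On failure, compare_exchange_weak reloads Old with the current value,
  // so each retry recomputes the maximum against fresh data.
  while (!Target.compare_exchange_weak(Old, std::max(Old, Operand))) {
  }
  return Old;
}
```

    The loop costs several instructions at every use, which is the code-size
    concern the commit message cites in favour of a libcall.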
* Optimize concat_vectors(X, undef) -> scalar_to_vector(X).Nadav Rotem2013-10-253-48/+33
| | | | | | | This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393
* llvm-cov dump to dbgs() instead of outs().Yuchen Wu2013-10-251-13/+14
| | | | llvm-svn: 193390
* Support for reading program counts in llvm-cov.Yuchen Wu2013-10-251-12/+17
| | | | | | | | | | | | | llvm-cov will now be able to read program counts from the GCDA file and output it in the same format as gcov. The program summary tag was identified from gcov-io.h as "\0\0\0\a3". There is currently a bug in GCOVProfiling.cpp which does not generate the run- or program-counting IR, so this change was tested manually by modifying the GCDA file and comparing the gcov and llvm-cov outputs. llvm-svn: 193389
* ARM: Tweak usage of '*vfp' compiler_rt functions.Jim Grosbach2013-10-241-1/+2
| | | | | | | | | Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381
* MCStreamer: Reimplement the virtual EmitRawText as a protected member, EmitRawTextImpl, to avoid string literal ambiguitiesDavid Blaikie2013-10-243-6/+5
| | | | | | | | Also improve the implementation of EmitRawText(Twine) so it doesn't bother using the SmallString buffer if the Twine is a simple StringRef anyway. llvm-svn: 193378
* DWARF emission: Remove unnecessary/redundant DIE reference codeDavid Blaikie2013-10-241-7/+0
| | | | | | The default case at the end of the switch handles this just fine. llvm-svn: 193374
* Fix name of variable in comment.Eric Christopher2013-10-241-1/+1
| | | | llvm-svn: 193373
* Grammar.Eric Christopher2013-10-241-1/+1
| | | | llvm-svn: 193372
* Update misleading comment.Eric Christopher2013-10-241-2/+3
| | | | llvm-svn: 193371
* DIEHash: Const correct and use references where non-null/non-rebound.David Blaikie2013-10-243-49/+49
| | | | llvm-svn: 193363
* DIEHash: Do not use shallow type hashing for unnamed typesDavid Blaikie2013-10-241-4/+6
| | | | llvm-svn: 193361
* DIEHash: Refactor ref attribute hashing into smaller functionsDavid Blaikie2013-10-243-68/+98
| | | | llvm-svn: 193360
* Remove unused debug-only member variable.David Blaikie2013-10-241-4/+0
| | | | | | | This may've been used at some point but the 'print' member function grew an Indent parameter that entirely shadows this parameter. llvm-svn: 193358
* Remove class abstraction from ARM struct byval loweringDavid Peixotto2013-10-241-553/+262
| | | | | | | | | | | This commit changes the struct byval lowering for arm to use inline checks for the subtarget instead of a class abstraction to represent the differences. The class abstraction was judged to be too much code for this task. No intended functionality change. llvm-svn: 193357
* Inliner: Handle readonly attribute per argument when adding memcpyTom Stellard2013-10-241-10/+13
| | | | | | Patch by: Vincent Lejeune llvm-svn: 193356
* ARM: Mark double-precision instructions as suchTim Northover2013-10-243-45/+66
| | | | | | | | | | | | This prevents us from silently accepting invalid instructions on (for example) Cortex-M4 with just single-precision VFP support. No tests for the extra Pat Requires because they're essentially assertions: the affected code should have been lowered to libcalls before ISel. rdar://problem/15302004 llvm-svn: 193354
* Reverting my r193344 checkin due to build breakage.John Thompson2013-10-241-11/+0
| | | | llvm-svn: 193350
* Mark vector loops as already vectorizedRenato Golin2013-10-241-0/+4
| | | | | | | | Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. llvm-svn: 193349
* Added std::string as a built-in type for mapping.John Thompson2013-10-241-0/+11
| | | | llvm-svn: 193344
* ARM: add a couple more NEON predicates.Tim Northover2013-10-241-4/+4
| | | | | | The fused multiply instructions were added in VFPv4 but are still NEON instructions; in particular, they shouldn't be available on a Cortex-M4 no matter how floaty it is. llvm-svn: 193342
* ARM: mark various aliases with their architecture requirements.Tim Northover2013-10-242-8/+12
| | | | | | | | | | If an alias inherits directly from InstAlias then it doesn't get any default "Requires" values, so llvm-mc will allow it even on architectures that don't support the underlying instruction. This tidies up the obvious VFP and NEON cases I found. llvm-svn: 193340
* ARM: Use non-VFP softcalls on embedded Darwinish targetsTim Northover2013-10-241-1/+1
| | | | | | | | | | | The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1 code to make use of VFP instructions by switching back to ARM mode; they make no sense for M-class processors, which don't even have an ARM mode. Given that justification, in practice this is a platform ABI decision so the actual check is based on that rather than CPU features. rdar://problem/15302004 llvm-svn: 193327
* Replaced non-ASCII character.Yaron Keren2013-10-241-1/+1
| | | | llvm-svn: 193324
* Revert part of r193291, restoring the deletion of loaded objects.Chandler Carruth2013-10-241-0/+9
| | | | | | | | | | | Without this, customers of the MCJIT were leaking memory like crazy. It's not really clear what the *right* memory management is here, so I'm not trying to add lots of tests or other logic, just trying to get us back to a better baseline. I'll follow up on the original commit to figure out the right path forward. llvm-svn: 193323
* ARM: fix assert on unpredictable POP instruction.Tim Northover2013-10-241-3/+2
| | | | | | | | | | | POP instructions are aliased to the ARM LDM variants but have different syntax. This caused two problems: we tried to access a non-existent operand to annotate the '!', and the error message didn't make much sense. With some vigorous hand-waving in the error message both problems can be fixed. llvm-svn: 193322
* Make sure SP is always aligned on a 2 byte boundaryJob Noorman2013-10-241-2/+2
| | | | llvm-svn: 193320
* fix PR17635: false positive with packed structuresNuno Lopes2013-10-242-4/+7
| | | | | | LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object llvm-svn: 193317
* [AArch64] Fix NZCV reg live-in bug in F128CSEL codegen.Amara Emerson2013-10-241-2/+6
| | | | | | | | | When generating the IfTrue basic block during the F128CSEL pseudo-instruction handling, the NZCV live-in for the newly created BB wasn't being added. This caused a fault during MI-sched/live range calculation when the predecessor for the fall-through BB didn't have a live-in for phys-reg as expected. llvm-svn: 193316
* AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsicsElena Demikhovsky2013-10-241-0/+36
| | | | llvm-svn: 193312
* Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks.Juergen Ributzka2013-10-241-1/+7
| | | | | | Reviewed by Andy llvm-svn: 193303
* Fixed llvm-cov to count edges instead of blocks.Yuchen Wu2013-10-241-2/+11
| | | | | | | | | | | | | | | | | This was a fundamental flaw in llvm-cov where it treated the values in the GCDA files as block counts instead of edge counts. This created incorrect line counts when branching was present. Instead, the edge counts should be summed to obtain the correct block count. The fix was tested using custom test files as well as single source files from the test-suite directory. The behaviour can be verified by reading the GCOV documentation that describes the GCDA spec ("ARC_COUNTS gives the counter values for those arcs that are instrumented") and the header description provided by GCOVProfiling.cpp ("instruments the code that runs to records (sic) the edges between blocks that run and emit a complementary "gcda" file on exit"). llvm-svn: 193299
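    The distinction between edge counts and block counts can be sketched as
    follows. This is not the llvm-cov implementation: `Edge` and `blockCounts`
    are hypothetical, block indices are assumed to be valid, and the sketch
    assumes a block's count is the sum of its incoming edge counts, falling
    back to the outgoing sum for a block with no incoming edges (such as the
    entry block).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical arc record: an edge from block Src to block Dst that was
// taken Count times.
struct Edge {
  unsigned Src, Dst;
  std::uint64_t Count;
};

// Derives per-block execution counts by summing edge counts.
// Assumes every Src and Dst is less than NumBlocks.
std::vector<std::uint64_t> blockCounts(unsigned NumBlocks,
                                       const std::vector<Edge> &Edges) {
  std::vector<std::uint64_t> In(NumBlocks, 0), Out(NumBlocks, 0);
  std::vector<bool> HasIncoming(NumBlocks, false);
  for (const Edge &E : Edges) {
    In[E.Dst] += E.Count;
    Out[E.Src] += E.Count;
    HasIncoming[E.Dst] = true;
  }
  std::vector<std::uint64_t> Counts(NumBlocks, 0);
  for (unsigned B = 0; B < NumBlocks; ++B)
    Counts[B] = HasIncoming[B] ? In[B] : Out[B];
  return Counts;
}
```

    For a diamond with edges 0->1 (3), 0->2 (7), 1->3 (3) and 2->3 (7), the
    join block 3 runs 10 times. Reading a single edge count as that block's
    count would report 3 or 7, which is the kind of wrong line count that
    appears when branching is present.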