path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Revert part of GCC warning fix to fix debug build. | Matt Arsenault | 2013-12-05 | 1 | -0/+1
| | | | | | | The typedef is used inside the DEBUG(), and apparently can't be moved inside of it. llvm-svn: 196528
* Fix minor GCC warnings. | Matt Arsenault | 2013-12-05 | 1 | -1/+0
| | | | | | Unused typedefs and unused variables. llvm-svn: 196526
* Change std::deque => std::vector. No functionality change. | Michael Gottesman | 2013-12-05 | 1 | -6/+6
| | | | | | | | There is no reason to use std::deque here over std::vector. Thus, given the performance differences between the two, it makes sense to change deque to vector. llvm-svn: 196524
* Fix non-deterministic behavior. | Rafael Espindola | 2013-12-05 | 1 | -1/+1
| | | | | | | | | | We use CSEBlocks to initialize a worklist: SmallVector<BasicBlock *, 8> CSEWorkList(CSEBlocks.begin(), CSEBlocks.end()); so it must have a deterministic order. llvm-svn: 196520
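The ordering requirement above can be illustrated with a minimal sketch. It assumes a hypothetical `InsertionOrderedSet` container (a set that remembers insertion order, similar in spirit to `llvm::SetVector`); whether the commit switched to exactly such a container is an assumption, and `makeWorklist` is an invented helper name. The point is only that a worklist copied from `begin()`/`end()` inherits the iteration order of its source, so that order must not depend on pointer values or hashing.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical container for illustration: a set that iterates in
// insertion order, so every run visits elements in the same sequence.
template <typename T>
class InsertionOrderedSet {
public:
  // Returns true if the value was newly inserted, false if already present.
  bool insert(const T &value) {
    if (!seen_.insert(value).second)
      return false;
    order_.push_back(value);
    return true;
  }

  typename std::vector<T>::const_iterator begin() const { return order_.begin(); }
  typename std::vector<T>::const_iterator end() const { return order_.end(); }

private:
  std::unordered_set<T> seen_; // membership test only; never iterated
  std::vector<T> order_;       // defines the (deterministic) iteration order
};

// Builds a worklist from begin()/end(), mirroring the pattern quoted in
// the commit message.
template <typename T>
std::vector<T> makeWorklist(const InsertionOrderedSet<T> &blocks) {
  return std::vector<T>(blocks.begin(), blocks.end());
}
```

Iterating the `std::unordered_set` member directly would give an order that can vary between implementations and, for pointer keys, between runs; copying from the vector avoids that.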
* Rename DwarfUnits to DwarfFile to help avoid some naming confusion. | Eric Christopher | 2013-12-05 | 6 | -34/+35
| | | | llvm-svn: 196519
* MI-Sched: Model "reserved" processor resources. | Andrew Trick | 2013-12-05 | 2 | -20/+81
| | | | This allows a target to use MI-Sched as an in-order scheduler that will model strict resource conflicts without defining a processor itinerary. Instead, the target can now use the new per-operand machine model and define in-order resources with BufferSize=0. For example, this would allow restricting the type of operations that can be formed into a dispatch group. (Normally NumMicroOps is sufficient to enforce dispatch groups). If the intent is to model latency in an in-order pipeline, as opposed to resource conflicts, then a resource with BufferSize=1 should be defined instead. This feature is only casually tested as there are no in-tree targets using it yet. However, Hal will be experimenting with POWER7. llvm-svn: 196517
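As a rough illustration of what a strict resource conflict means for an in-order scheduler, here is a toy model. It is not the MI-Sched implementation: the class `ReservedResources`, its methods, and the per-resource "next free cycle" bookkeeping are all assumptions made for this sketch. An instruction that needs a reserved resource cannot issue until the previous user has released it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (hypothetical names): each reserved resource records the first
// cycle at which it is free again.
class ReservedResources {
public:
  explicit ReservedResources(std::size_t numResources)
      : nextFree_(numResources, 0) {}

  // Earliest cycle >= readyCycle at which the resource is available.
  uint32_t earliestIssue(std::size_t resource, uint32_t readyCycle) const {
    assert(resource < nextFree_.size());
    return std::max(readyCycle, nextFree_[resource]);
  }

  // Issues an instruction that holds the resource for `occupancy` cycles and
  // returns the cycle at which it actually issued.
  uint32_t issue(std::size_t resource, uint32_t readyCycle, uint32_t occupancy) {
    const uint32_t cycle = earliestIssue(resource, readyCycle);
    nextFree_[resource] = cycle + occupancy;
    return cycle;
  }

private:
  std::vector<uint32_t> nextFree_;
};
```

In this sketch, an instruction that becomes ready while the resource is still held is pushed back to the cycle the resource frees up, which is the stall a strict conflict implies.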
* MI-Sched: handle latency of in-order operations with the new machine model. | Andrew Trick | 2013-12-05 | 3 | -6/+53

The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order).

We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default.

MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass.

I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:

-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup.

Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes)

Speedups:
| Benchmarks/BenchmarkGame/recursive | 52.39% |
| Benchmarks/VersaBench/beamformer | 20.80% |
| Benchmarks/Misc/pi | 19.97% |
| Benchmarks/Misc/mandel-2 | 19.95% |
| SPEC/CFP2000/188.ammp | 18.72% |
| Benchmarks/McCat/08-main/main | 18.58% |
| Benchmarks/Misc-C++/Large/sphereflake | 18.46% |
| Benchmarks/Olden/power | 17.11% |
| Benchmarks/Misc-C++/mandel-text | 16.47% |
| Benchmarks/Misc/oourafft | 15.94% |
| Benchmarks/Misc/flops-7 | 14.99% |
| Benchmarks/FreeBench/distray | 14.26% |
| SPEC/CFP2006/470.lbm | 14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode | 12.28% |
| Benchmarks/SmallPT/smallpt | 10.36% |
| Benchmarks/Misc-C++/Large/ray | 8.97% |
| Benchmarks/Misc/fp-convert | 8.75% |
| Benchmarks/Olden/perimeter | 7.10% |
| Benchmarks/Bullet/bullet | 7.03% |
| Benchmarks/Misc/mandel | 6.75% |
| Benchmarks/Olden/voronoi | 6.26% |
| Benchmarks/Misc/flops-8 | 5.77% |
| Benchmarks/Misc/matmul_f64_4x4 | 5.19% |
| Benchmarks/MiBench/security-rijndael | 5.15% |
| Benchmarks/Misc/flops-6 | 5.10% |
| Benchmarks/Olden/tsp | 4.46% |
| Benchmarks/MiBench/consumer-lame | 4.28% |
| Benchmarks/Misc/flops-5 | 4.27% |
| Benchmarks/mafft/pairlocalalign | 4.19% |
| Benchmarks/Misc/himenobmtxpa | 4.07% |
| Benchmarks/Misc/lowercase | 4.06% |
| SPEC/CFP2006/433.milc | 3.99% |
| Benchmarks/tramp3d-v4 | 3.79% |
| Benchmarks/FreeBench/pifft | 3.66% |
| Benchmarks/Ptrdist/ks | 3.21% |
| Benchmarks/Adobe-C++/loop_unroll | 3.12% |
| SPEC/CINT2000/175.vpr | 3.12% |
| Benchmarks/nbench | 2.98% |
| SPEC/CFP2000/183.equake | 2.91% |
| Benchmarks/Misc/perlin | 2.85% |
| Benchmarks/Misc/flops-1 | 2.82% |
| Benchmarks/Misc-C++-EH/spirit | 2.80% |
| Benchmarks/Misc/flops-2 | 2.77% |
| Benchmarks/NPB-serial/is | 2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk | 2.33% |
| Benchmarks/BenchmarkGame/n-body | 2.28% |
| Benchmarks/SciMark2-C/scimark2 | 2.27% |
| Benchmarks/Olden/bh | 2.03% |
| skidmarks10/skidmarks | 1.81% |
| Benchmarks/Misc/flops | 1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d | -5.67% |
| Benchmarks/Adobe-C++/functionobjects | -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8 | -5.00% |
| Benchmarks/Shootout/hash | -2.35% |
| Benchmarks/Prolangs-C++/ocean | -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall | -1.98% |
| Polybench/linear-algebra/kernels/3mm | -1.95% |
| Benchmarks/McCat/09-vor/vor | -1.68% |

llvm-svn: 196516
* Fix the A9 machine model. VTRN writes two registers. | Andrew Trick | 2013-12-05 | 1 | -1/+1
| | | | llvm-svn: 196514
* comment typo and reformat | Andrew Trick | 2013-12-05 | 1 | -6/+6
| | | | llvm-svn: 196513
* Add a default constructor to get deterministic behavior. | Rafael Espindola | 2013-12-05 | 1 | -0/+1
| | | | | | Should fix the msan and valgrind bots. llvm-svn: 196509
* SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use | Arnold Schwaighofer | 2013-12-05 | 1 | -5/+1
| | | | | | | | | | | | | | | We were creating external uses for scalar values in MustGather entries that also had a ScalarToTreeEntry (they also are present in a vectorized tuple). This meant we would keep a value 'alive' both as a scalar and as a vectorized value, causing havoc. This is not necessary because when we create a MustGather vector we explicitly create external use entries for the insertelement instructions of the MustGather vector elements. Fixes PR18129. radar://15582184 llvm-svn: 196508
* [tsan] fix PR18146: sometimes a variable written into vptr could have an ↵ | Kostya Serebryany | 2013-12-05 | 1 | -1/+3
| | | | | | integer type (after other optimizations) llvm-svn: 196507
* [NVPTX] Fix off-by-one error when creating the VT list for an SDNode | Justin Holewinski | 2013-12-05 | 1 | -1/+1
| | | | llvm-svn: 196503
* [mips] Small code generation improvement for conditional operator (select) | Matheus Almeida | 2013-12-05 | 1 | -0/+33
| | | | | | | | | | | | | in case the operands are constants and their difference is |1|. It should be possible in those cases to rematerialize the result using MIPS's slt and similar instructions. The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed; otherwise the optimization implemented in this patch would have been triggered (difference between the operands was 1) and that would have changed the semantics of the tests. llvm-svn: 196498
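The arithmetic behind this kind of rewrite can be sketched as follows. This is an illustration of the identity only, not the code in performSELECTCombine: the helper names `slt`, `selectPlusOne`, and `selectMinusOne` are invented here, and the sketch assumes the addition or subtraction does not overflow. Because a set-on-less-than result is always 0 or 1, a select between two constants that differ by one can be computed by adding or subtracting that result.

```cpp
#include <cassert>
#include <cstdint>

// Models the result of a MIPS-style slt: 1 if a < b (signed), otherwise 0.
inline int32_t slt(int32_t a, int32_t b) { return a < b ? 1 : 0; }

// Computes (a < b) ? c + 1 : c without a conditional move.
// Precondition (assumed for this sketch): c + 1 does not overflow.
inline int32_t selectPlusOne(int32_t a, int32_t b, int32_t c) {
  assert(c != INT32_MAX);
  return c + slt(a, b);
}

// Computes (a < b) ? c - 1 : c without a conditional move.
// Precondition (assumed for this sketch): c - 1 does not overflow.
inline int32_t selectMinusOne(int32_t a, int32_t b, int32_t c) {
  assert(c != INT32_MIN);
  return c - slt(a, b);
}
```

This also shows why tests whose select operands happened to differ by 1 had to be adjusted: they would otherwise exercise this path instead of the conditional-move code they were written to check.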
* [mips] Add some comments related to the optimization performed in ↵ | Matheus Almeida | 2013-12-05 | 1 | -8/+21
| | | | | | | | | | performSELECTCombine. The structure of the code was slightly modified so that the next patch is easier to read/review. No functional changes. llvm-svn: 196496
* [mips][msa] Fix issue with immediate fields of LD/ST instructions | Matheus Almeida | 2013-12-05 | 5 | -5/+95
| | | | | | | | | not being correctly encoded/decoded. In more detail, immediate fields of LD/ST instructions should be divided/multiplied by the size of the data format before encoding and after decoding, respectively. llvm-svn: 196494
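A minimal sketch of the scaling described above, assuming the immediate is a byte offset and the data format size is given in bytes. The function names `encodeScaledImm` and `decodeScaledImm` are hypothetical, and field-width checks are left out.

```cpp
#include <cassert>
#include <cstdint>

// Encoding: the instruction field stores the offset in units of the element
// size, so the byte offset is divided before it is written into the field.
inline int32_t encodeScaledImm(int32_t byteOffset, int32_t elemSizeBytes) {
  // The offset must be a multiple of the element size to be representable.
  assert(elemSizeBytes > 0 && byteOffset % elemSizeBytes == 0);
  return byteOffset / elemSizeBytes;
}

// Decoding: the field value is multiplied back to recover the byte offset.
inline int32_t decodeScaledImm(int32_t field, int32_t elemSizeBytes) {
  assert(elemSizeBytes > 0);
  return field * elemSizeBytes;
}
```

With these two helpers, decoding an encoded offset returns the original value, which is the round-trip property an encoder/decoder pair has to preserve.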
* ARM: fix yet another stack-folding bug | Tim Northover | 2013-12-05 | 1 | -6/+1
| | | | | | | | | | | We were trying to fold the stack adjustment into the wrong instruction in the situation where the entire basic-block was epilogue code. Really, it can only ever be valid to do the folding precisely where the "add sp, ..." would be placed so there's no need for a separate iterator to track that. Should fix PR18136. llvm-svn: 196493
* DwarfDebug/DwarfUnit: Push abbreviation structures down into DwarfUnits to ↵ | David Blaikie | 2013-12-05 | 2 | -49/+21
| | | | | | reduce duplication llvm-svn: 196479
* Use isIntrinsic() instead of checking for "llvm." | Matt Arsenault | 2013-12-05 | 1 | -1/+1
| | | | llvm-svn: 196473
* Remove the isImplicitlyPrivate argument of getNameWithPrefix. | Rafael Espindola | 2013-12-05 | 7 | -11/+18
| | | | | | | | | | | | getSymbolWithGlobalValueBase is used to create the name of a new symbol based on the name of an existing GV. Assert that and then remove the last call to pass true to isImplicitlyPrivate. This gives the mangler API a 1:1 mapping from GV to names, which is what we need to drop the mangler dependency on the target (and use an extended datalayout instead). llvm-svn: 196472
* Correct word hyphenations | Alp Toker | 2013-12-05 | 47 | -67/+67
| | | | | | | This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471
* Hide the stub created for MO_ExternalSymbol too. | Rafael Espindola | 2013-12-05 | 1 | -5/+11

Given

    declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
    declare void @foo()
    define void @bar() {
      call void @foo()
      call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
      ret void
    }

We used to produce

    L_foo$stub:
            .indirect_symbol _foo
            .ascii "\364\364\364\364\364"
    _memset$stub:
            .indirect_symbol _memset
            .ascii "\364\364\364\364\364"

We now produce a private stub for memset too. Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

llvm-svn: 196468
* R600/SI: Add comments for number of used registers. | Matt Arsenault | 2013-12-05 | 2 | -14/+56
| | | | llvm-svn: 196467
* Try harder to get consistent floating point results. | Rafael Espindola | 2013-12-05 | 1 | -1/+1
| | | | | | | | | This just extends the existing hack. It should be enough to get a reproducible bootstrap on 32 bits. I will open a bug to track getting a real fix for this. llvm-svn: 196462
* For AArch64, add missing register cost calculation for big value types like ↵ | Jiangning Liu | 2013-12-05 | 2 | -1/+28
| | | | | | v4i64 and v8i64. llvm-svn: 196456
* DwarfDebug: Avoid unnecessary abbreviation lookup when emitting DIEs | David Blaikie | 2013-12-05 | 2 | -18/+16
| | | | | | | DIEs already contain references directly to their DIEAbbrev, use that instead of looking it up based on index. llvm-svn: 196446
* DwarfDebug: Remove trivial function wrapper | David Blaikie | 2013-12-05 | 2 | -9/+2
| | | | llvm-svn: 196445
* 80-column. | Eric Christopher | 2013-12-05 | 1 | -1/+2
| | | | llvm-svn: 196442
* Remove special handling for DW_AT_ranges support by constructing the | Eric Christopher | 2013-12-05 | 1 | -20/+12
| | | | | | values with the correct behavior. llvm-svn: 196441
* [mc] Fix ELF st_other flag. | Logan Chien | 2013-12-05 | 2 | -5/+4
| | | | | | | | | | | | ELF_Other_Weakref and ELF_Other_ThumbFunc seem to be LLVM internal ELF symbol flags. These should not be emitted to the object file. This commit defines ELF_STO_Shift for the target-defined flags for st_other, and increases the value of ELF_Other_Shift to 16. llvm-svn: 196440
* Fix comment. | Eric Christopher | 2013-12-05 | 1 | -2/+2
| | | | llvm-svn: 196437
* Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load. | Cameron McInally | 2013-12-05 | 1 | -0/+4
| | | | | | Patch by Aleksey Bader. llvm-svn: 196435
* Fix typo. | Eric Christopher | 2013-12-04 | 1 | -1/+1
| | | | llvm-svn: 196434
* DwarfUnit: Correct comment by generalizing over all units, not just ↵ | David Blaikie | 2013-12-04 | 1 | -3/+3
| | | | | | | | compilation units. Code review feedback on r196394 by Paul Robinson. llvm-svn: 196433
* Fix a bug in darwin's 32-bit X86 handling of evaluating fixups. | Kevin Enderby | 2013-12-04 | 1 | -1/+4
| | | | | | | | | | | | | | | Where it would use a scattered relocation entry but falls back to a normal relocation entry because the FixupOffset is more than 24 bits. The bug is in X86MachObjectWriter::RecordScatteredRelocation(), where it changes the reference parameter FixedValue but then returns false to indicate it did not create a scattered relocation entry. The fix is simply to save the original value of the parameter FixedValue at the start of the method and restore it if we are returning false in that case. rdar://15526046 llvm-svn: 196432
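The save-and-restore pattern described above can be sketched in isolation. This is a simplified stand-in, not the body of X86MachObjectWriter::RecordScatteredRelocation(): the function name `recordScatteredSketch`, its signature, and the placeholder adjustment of `FixedValue` are assumptions; only the parameter name `FixedValue` and the 24-bit limit come from the commit message.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a method that mutates an in/out
// parameter and may then report failure.
bool recordScatteredSketch(uint32_t fixupOffset, uint64_t &FixedValue) {
  // Save the caller's value before touching it.
  const uint64_t originalFixedValue = FixedValue;

  // Placeholder for the adjustment made while preparing a scattered entry.
  FixedValue += 0x1000;

  // A scattered entry cannot hold an offset that needs more than 24 bits.
  if (fixupOffset > 0xFFFFFF) {
    // Returning false tells the caller to fall back to a normal relocation,
    // so the caller must see FixedValue exactly as it passed it in.
    FixedValue = originalFixedValue;
    return false;
  }
  return true;
}
```

The general rule the fix applies is that a function returning "I did nothing" should leave its reference parameters as it found them.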
* Update comment. | Eric Christopher | 2013-12-04 | 1 | -1/+2
| | | | llvm-svn: 196431
* Update comment. | Eric Christopher | 2013-12-04 | 1 | -1/+1
| | | | llvm-svn: 196430
* Remove incorrect comment and pointless cast. | Eric Christopher | 2013-12-04 | 1 | -2/+1
| | | | llvm-svn: 196427
* const on its own line is confusing. | Eric Christopher | 2013-12-04 | 1 | -2/+2
| | | | llvm-svn: 196426
* Add support for parsing ARM symbol variants on ELF targets | David Peixotto | 2013-12-04 | 8 | -44/+55

ARM symbol variants are written with parens instead of @ like this:

    .word __GLOBAL_I_a(target1)

This commit adds support for parsing these symbol variants in expressions. We introduce a new flag to MCAsmInfo that indicates the parser should use parens to parse the symbol variant. The expression parser is modified to look for symbol variants using parens instead of @ when the corresponding MCAsmInfo flag is true.

The MCAsmInfo parens flag is enabled only for ARM on ELF.

By adding this flag to MCAsmInfo, we are able to get rid of redundant ARM-specific symbol variants and use the generic variants instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new UseParensForSymbolVariant attribute in MCAsmInfo to correctly print the symbol variants for arm.

To achieve this we need to keep a handle to the MCAsmInfo in the MCSymbolRefExpr class that we can check when printing the symbol variant.

Updated Tests: Changed case of symbol variant to match the generic kind.
    test/CodeGen/ARM/tls-models.ll
    test/CodeGen/ARM/tls1.ll
    test/CodeGen/ARM/tls2.ll
    test/CodeGen/Thumb2/tls1.ll
    test/CodeGen/Thumb2/tls2.ll

PR18080

llvm-svn: 196424
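To make the parenthesized syntax concrete, here is a small standalone sketch that splits an operand such as `__GLOBAL_I_a(target1)` into a symbol name and a variant. It is not the MC expression parser: `splitParenVariant` is a hypothetical helper that works on plain strings, and it ignores everything the real parser has to handle (tokens, nested expressions, validation of the variant name).

```cpp
#include <string>
#include <utility>

// Splits "symbol(variant)" into {"symbol", "variant"}.
// Returns {text, ""} when there is no trailing parenthesized part.
std::pair<std::string, std::string> splitParenVariant(const std::string &text) {
  if (text.empty() || text.back() != ')')
    return {text, ""};

  const std::string::size_type open = text.rfind('(');
  // No opening paren, or nothing before it: treat the input as a bare symbol.
  if (open == std::string::npos || open == 0)
    return {text, ""};

  // The variant is everything between '(' and the final ')'.
  return {text.substr(0, open),
          text.substr(open + 1, text.size() - open - 2)};
}
```

For the example in the commit message, this yields the symbol `__GLOBAL_I_a` and the variant `target1`; a bare symbol comes back with an empty variant.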
* Simplify check. | Eric Christopher | 2013-12-04 | 1 | -1/+1
| | | | llvm-svn: 196422
* Reformat slightly. | Eric Christopher | 2013-12-04 | 1 | -27/+45
| | | | llvm-svn: 196421
* Make RangeSpanList take a symbol for the beginning of the range | Eric Christopher | 2013-12-04 | 2 | -9/+8
| | | | | | rather than magically making the names match. llvm-svn: 196419
* DwarfDebug: Unconditionalize trivial asm comments | David Blaikie | 2013-12-04 | 1 | -10/+5
| | | | | | | | While we still have a few (~4) non-trivial comments with string concatenation, etc that should remain conditionalized, these trivial literal comments can be simplified. llvm-svn: 196416
* DwarfDebug: Reduce code duplication for sec offset emission | David Blaikie | 2013-12-04 | 2 | -56/+32
| | | | llvm-svn: 196414
* Couple of small logical cleanups to use !empty rather than other | Eric Christopher | 2013-12-04 | 1 | -2/+2
| | | | | | checks. No functional change. llvm-svn: 196412
* llvm-cov: Replace size() with empty() in bool check. | Yuchen Wu | 2013-12-04 | 1 | -2/+2
| | | | llvm-svn: 196400
* Use move and stack allocation for RangeSpanLists. As a result make | Eric Christopher | 2013-12-04 | 3 | -21/+17
| | | | | | | a few things more const as well because we're now using const references to refer to iterators. llvm-svn: 196398
* DebugInfo: Remove unused start/end labels for the debug_abbrevs section | David Blaikie | 2013-12-04 | 2 | -8/+4
| | | | | | | | | | Since we always emit only one abbreviation section (shared by all the compilation units in this module) there's no need for a separate label at the start of each one (and we weren't using the CU ID anyway, so there really was only one label). Use the section label instead and drop the wholly unused debug_abbrev_end label. llvm-svn: 196394
* Fix assembly syntax for AVX512 vector blend instructions. | Cameron McInally | 2013-12-04 | 1 | -2/+2
| | | | llvm-svn: 196393