path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* llvm-readobj: print out the fields of the COFF delay-import tableRui Ueyama2014-10-031-0/+6
| | | | llvm-svn: 218996
* [Power] Use lwsync for non-seq_cst fencesRobin Morisset2014-10-031-1/+8
| | | | | | | | | | | | | | | | Summary: hwsync is only required for seq_cst fences; acquire and release fences can use the cheaper lwsync. Test Plan: Added some cases to atomics.ll + make check-all Reviewers: jfb, wschmidt Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5317 llvm-svn: 218995
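The selection rule described in this commit can be sketched as a small lookup. This is only an illustration in Python, not the backend's code: the function name `ppc_fence_instruction` is hypothetical, and the ordering names follow LLVM IR's `fence` orderings.

```python
def ppc_fence_instruction(ordering):
    """Pick a PowerPC barrier mnemonic for an LLVM-style fence ordering.

    Hypothetical helper for illustration; it only mirrors the rule stated
    in the commit message (hwsync for seq_cst, lwsync otherwise).
    """
    if ordering == "seq_cst":
        return "hwsync"  # full barrier, needed only for sequential consistency
    if ordering in ("acquire", "release", "acq_rel"):
        return "lwsync"  # cheaper lightweight barrier
    raise ValueError(f"not a valid fence ordering: {ordering!r}")


for name in ("acquire", "release", "acq_rel", "seq_cst"):
    print(name, "->", ppc_fence_instruction(name))
```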
* MipsAsmParser.cpp: fix VS2012 buildHans Wennborg2014-10-031-1/+1
| | | | llvm-svn: 218991
* HexagonMCCodeEmitter.h: deleted member functions are not supported in VS2012Hans Wennborg2014-10-031-2/+2
| | | | llvm-svn: 218990
* [mips] Print warning when using register names not available in N32/64Daniel Sanders2014-10-032-0/+34
| | | | | | | | | | | | | | | | | | | Summary: The register names t4-t7 are not available in the N32 and N64 ABIs. This patch prints a warning, when those names are used in N32/64, along with a fix-it with the correct register names. Patch by Vasileios Kalintiris Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5272 llvm-svn: 218989
* Fix build break on HexagonSid Manning2014-10-031-1/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D5600 llvm-svn: 218987
* Adding skeleton for unit testing Hexagon Code EmissionSid Manning2014-10-036-8/+173
| | | | | | | | | | | Adding and modifying CMakeLists.txt files to run unit tests under unittests/Target/* if the directory exists. Adding basic unit test to check that code emitter object can be retrieved. Differential Revision: http://reviews.llvm.org/D5523 Change by: Colin LeMahieu llvm-svn: 218986
* [x86] Teach the new vector shuffle lowering to aggressively form MOVSSChandler Carruth2014-10-032-5/+37
| | | | | | | | | | | | | | | | | | | | | | | | and MOVSD nodes for single element vector inserts. This is particularly important because a number of patterns in the backend detect these patterns and leverage them to simplify things. It also fixes quite a few of the insertion bad code examples. However, it regresses a specific area: when available, blendps and blendpd are *dramatically* faster than movss and movsd respectively. But it doesn't really work to form the blend logic first because the blends *aren't* as crazy efficient when the data is coming from memory anyways, and thus will have a movss or movsd regardless. Also, doing that would block a bunch of the patterns that this is designed to hit. So my plan is to go into the patterns for lowering MOVSS and MOVSD and lower them via blends when available. However that's a pretty invasive restructuring so it will need to be a follow-up patch. I have already gone into the patterns to lower MOVSS and MOVSD from memory using MOVLPD, etc. Without that, several of the test cases I already have regress. llvm-svn: 218985
* Revert 202433 - Provide a target override for the latest regalloc heuristicRenato Golin2014-10-033-8/+1
| | | | | | | | | | | That commit was introduced in order to help investigate a problem in ARM codegen breaking from commit 202304 (Add a limit to the heuristic that register allocates instructions in local order). Recent analysis indicated that the problem no longer exists, so I'm reverting this change. See PR18996. llvm-svn: 218981
* [x86] Refactor the element insertion logic in the new vector shuffleChandler Carruth2014-10-031-19/+21
| | | | | | | | | | | lowering to handle the potential mirroring of 2-element vectors (because we can't reliably sort them one way) in the caller rather than in the insertion logic. This will simplify things considerably as more ways to fail to match the insertion are added because now we have a nice try and retry point. llvm-svn: 218980
* [x86] Significantly improve the ability of the new vector shuffleChandler Carruth2014-10-031-26/+30
| | | | | | | | | | | | | | | | lowering to match VZEXT_MOVL patterns. I hadn't realized that these had sufficient pattern smarts in the backend to lower zext-ing from the low element of a vector without it being a scalar_to_vector node. They do, and this is how to match a bunch of patterns for movq, movss, etc. There is a weird propensity to end up using pshufd to place the element afterward even though it means domain crossing (or rather, to use xorps+movss to zext the element rather than movq) but that's an orthogonal problem with VZEXT_MOVL that someone should probably look at. llvm-svn: 218977
* [x86] Unbreak SSE1 with the new vector shuffle lowering. We can't widenChandler Carruth2014-10-031-4/+8
| | | | | | | | | element types to form illegal vector types. I've added a special SSE1 test case here that makes sure we don't break this going forward. llvm-svn: 218974
* Revert r215343.James Molloy2014-10-031-25/+1
| | | | | | This was contentious and needs investigation. llvm-svn: 218971
* [BasicAA] Revert r218714 - Make better use of zext and sign information.Lang Hames2014-10-031-29/+2
| | | | | | | | | This patch broke 447.dealII on Darwin. I'm currently working on a reduced test-case, but reverting for now to keep the bots happy. <rdar://problem/18530107> llvm-svn: 218944
* constify TargetMachine parameter.Eric Christopher2014-10-034-5/+6
| | | | llvm-svn: 218934
* llvm-readobj: print COFF delay-load import tableRui Ueyama2014-10-031-13/+90
| | | | | | | | | This patch adds another iterator to access the delay-load import table and use it from llvm-readobj. http://reviews.llvm.org/D5594 llvm-svn: 218933
* constify TargetMachine argument.Eric Christopher2014-10-034-4/+4
| | | | llvm-svn: 218930
* We can grab the options struct from the TargetMachine, no need toEric Christopher2014-10-033-5/+4
| | | | | | pass it down in the constructor. llvm-svn: 218929
* [AVX512] Pull pattern for subvector insert into the instruction definitionAdam Nemet2014-10-021-8/+4
| | | | | | | | | | No functional change intended. Very similar to the change I made for subvector extract in r218480. test/CodeGen/X86/avx512-insert-extract.ll covers this. llvm-svn: 218928
* [AVX512] Refactor subvector insertsAdam Nemet2014-10-021-102/+55
| | | | | | | | | | No functional change. Very similar to the extract refactoring I did in r218478. Compared X86.td.expanded before and after. llvm-svn: 218927
* [AVX512] Fix i256mem->f256mem typo in VINSERTF64x4rmAdam Nemet2014-10-021-1/+1
| | | | | | | Just like in the case of extracts, the refactoring is uncovering some typos in the code. llvm-svn: 218926
* [PowerPC] Modern Book-E cores support syncHal Finkel2014-10-024-17/+24
| | | | | | | | | | | | | Older Book-E cores, such as the PPC 440, support only msync (which has the same encoding as sync 0), but not any of the other sync forms. Newer Book-E cores, however, do support sync, and for performance reasons we should allow the use of the more-general form. This refactors msync use into its own feature group so that it applies by default only to older Book-E cores (of the relevant cores, we only have definitions for the PPC440/450 currently). llvm-svn: 218923
* [Power] Improve the expansion of atomic loads/storesRobin Morisset2014-10-023-4/+26
Summary:
Atomic loads and stores of up to the native size (32 bits, or 64 for PPC64) can be lowered to a simple load or store instruction (as the synchronization is already handled by AtomicExpand, and the atomicity is guaranteed thanks to the alignment requirements of atomic accesses). This is exactly what this patch does. Previously, these were implemented by complex load-linked/store-conditional loops, an obvious performance problem.

For example, this patch turns
```
define void @store_i8_unordered(i8* %mem) {
  store atomic i8 42, i8* %mem unordered, align 1
  ret void
}
```
from
```
_store_i8_unordered: ; @store_i8_unordered
; BB#0:
  rlwinm r2, r3, 3, 27, 28
  li r4, 42
  xori r5, r2, 24
  rlwinm r2, r3, 0, 0, 29
  li r3, 255
  slw r4, r4, r5
  slw r3, r3, r5
  and r4, r4, r3
LBB4_1: ; =>This Inner Loop Header: Depth=1
  lwarx r5, 0, r2
  andc r5, r5, r3
  or r5, r4, r5
  stwcx. r5, 0, r2
  bne cr0, LBB4_1
; BB#2:
  blr
```
into
```
_store_i8_unordered: ; @store_i8_unordered
; BB#0:
  li r2, 42
  stb r2, 0(r3)
  blr
```
which looks like a pretty clear win to me.

Test Plan: fixed the tests + new test for indexed accesses + make check-all
Reviewers: jfb, wschmidt, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5587
llvm-svn: 218922
* Fix the threshold added in r186434 (a re-apply of r185393) and updatedChandler Carruth2014-10-022-31/+27
| | | | | | | | | | | | | | | | | to be a ManagedStatic in r218163 to not be a global variable written and read to from within the innards of SpillPlacement. This will fix a really scary race condition for anyone that has two copies of LLVM running spill placement concurrently. Yikes! This will also fix a really significant compile time hit that r218163 caused because the spill placement threshold read is actually in the *very* hot path of this code. The memory fence on each read was showing up as huge compile time regressions when spilling is responsible for most of the compile time. For example, optimizing sanitized code showed over 50% compile time regressions here. =/ llvm-svn: 218921
* [Stackmaps] Make the frame-pointer required for stackmaps.Juergen Ributzka2014-10-022-2/+4
| | | | | | | | | Do not eliminate the frame pointer if there is a stackmap or patchpoint in the function. All stackmap references should be FP relative. This fixes PR21107. llvm-svn: 218920
* Revert "DI: Fold constant arguments into a single MDString"Duncan P. N. Exon Smith2014-10-027-528/+608
| | | | | | This reverts commit r218914 while I investigate some bots. llvm-svn: 218918
* llvm-readobj: print COFF imported symbolsRui Ueyama2014-10-021-0/+90
| | | | | | | | This patch defines a new iterator for the imported symbols. Make a change to COFFDumper to use that iterator to print out imported symbols and their ordinals. llvm-svn: 218915
* DI: Fold constant arguments into a single MDStringDuncan P. N. Exon Smith2014-10-027-608/+528
| | | | | | | | | | | | | This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 218914
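The folding scheme in this commit (stringify the constants, join them with a `\0` separator) can be sketched in a few lines of Python. The helper names and the sample field values are hypothetical, and the sketch assumes that no field contains a NUL character itself; it is not the actual debug-info encoding.

```python
SEPARATOR = "\0"  # NUL character used between fields


def fold_constant_args(*fields):
    """Stringify every field and join them into a single string."""
    return SEPARATOR.join(str(field) for field in fields)


def unfold_constant_args(header):
    """Split a folded header back into its string fields."""
    return header.split(SEPARATOR)


# Hypothetical field values, chosen only for illustration.
header = fold_constant_args(52, "x", 7, 0)
print(repr(header))
print(unfold_constant_args(header))
```

Integers come back as strings after the round trip, so a reader of the folded form has to know which positions hold numbers.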
* [x86] Teach the new vector shuffle lowering to widen floating pointChandler Carruth2014-10-022-8/+19
| | | | | | | | | | | | | | | | | | | elements as well as integer elements in order to form simpler shuffle patterns. This is the primary reason why we were failing to match some of the 2-and-2 floating point shuffles such as PR21140. Even after fixing this we need to support some extra patterns in the backend in order to match the resulting X86ISD::UNPCKL nodes into the correct instructions. This commit should fix PR21140 and includes more comprehensive testing of insertion patterns in v4 shuffles. Not all of the added tests are beautiful. For example, we don't have clever instructions to insert-via-load in the integer domain. There are also some places where we aren't sufficiently cunning with our use of movq and movd, but that's future work. llvm-svn: 218911
* LTO: Document the Boolean argument from r218784Duncan P. N. Exon Smith2014-10-021-1/+2
| | | | llvm-svn: 218907
* Optimize square root squared (PR21126).Sanjay Patel2014-10-021-0/+5
| | | | | | | | | | | When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X. This can happen in the real world when calculating x ** 3/2. This occurs in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c. Differential Revision: http://reviews.llvm.org/D5584 llvm-svn: 218906
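A short Python check illustrates why this fold is only valid under unsafe floating-point math: with IEEE-754 doubles, squaring a rounded square root does not always give back the original value. The function names are illustrative and not taken from the patch.

```python
import math


def sqrt_squared_strict(x):
    """What strict floating-point semantics compute: two rounded operations."""
    root = math.sqrt(x)
    return root * root


def sqrt_squared_folded(x):
    """What the fold produces: the operand itself, with no rounding at all."""
    return x


x = 2.0
print(sqrt_squared_strict(x) == x)                         # results differ for this input
print(abs(sqrt_squared_strict(x) - sqrt_squared_folded(x)))  # the gap is about one ulp
```

Because the two results can differ in the last bit, a compiler may only perform the rewrite when the user has opted into relaxed floating-point behaviour.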
* InstrProf: Avoid linear search in a hot loopJustin Bogner2014-10-021-5/+6
| | | | | | | | Every time we were adding or removing an expression when generating a coverage mapping we were doing a linear search to try and deduplicate the list. The indices in the list are important, so we can't just replace it by a DenseMap entirely, but an auxiliary DenseMap for fast lookup massively improves the performance issues I was seeing here. llvm-svn: 218892
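The idea of keeping the ordered list while adding a side map for lookup can be sketched as follows. This is a Python analogue with a `dict` standing in for the DenseMap; the class and method names are hypothetical, and only the insertion path is shown.

```python
class IndexedExpressionList:
    """Ordered list of expressions plus an auxiliary map for O(1) lookup."""

    def __init__(self):
        self.expressions = []  # position in this list is the expression's index
        self.index_of = {}     # auxiliary map: expression -> index in the list

    def get_or_add(self, expr):
        """Return the index of expr, appending it only if it is new."""
        idx = self.index_of.get(expr)
        if idx is None:
            idx = len(self.expressions)
            self.expressions.append(expr)
            self.index_of[expr] = idx
        return idx


exprs = IndexedExpressionList()
first = exprs.get_or_add(("add", 0, 1))
second = exprs.get_or_add(("sub", 0, 1))
again = exprs.get_or_add(("add", 0, 1))  # deduplicated without scanning the list
print(first, second, again, len(exprs.expressions))
```

The list remains the source of truth for indices, so anything that refers to an expression by position keeps working; the map only replaces the linear scan.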
* This patch adds a new flag "-coff-imports" to llvm-readobj.Rui Ueyama2014-10-021-5/+18
| | | | | | | | | | | | | | When the flag is given, the command prints out the COFF import table. Currently only the import table directory will be printed. I'm going to make another patch to print out the imported symbols. The implementation of import directory entry iterator in COFFObjectFile.cpp was buggy. This patch fixes that too. http://reviews.llvm.org/D5569 llvm-svn: 218891
* Reapply "InstrProf: Don't keep a large sparse list around just to zero it"Justin Bogner2014-10-021-24/+43
| | | | | | | | | | When I was preparing r218879 for commit, I removed an early return that I decided was just noise. It wasn't. This is r218879 no-crash edition. This reverts commit r218881, reapplying r218879. llvm-svn: 218887
* Remove an extra whitespace.Adrian Prantl2014-10-021-1/+1
| | | | llvm-svn: 218886
* Pretty-printer: Paper over an ambiguity between line table entriesAdrian Prantl2014-10-021-1/+3
| | | | | | | | and tagged mdnodes. Fixes http://llvm.org/bugs/show_bug.cgi?id=21131 llvm-svn: 218885
* Revert "InstrProf: Don't keep a large sparse list around just to zero it"Justin Bogner2014-10-021-38/+24
| | | | | | | | This seems to be crashing on some buildbots. Reverting to investigate. This reverts commit r218879. llvm-svn: 218881
* InstrProf: Don't keep a large sparse list around just to zero itJustin Bogner2014-10-021-24/+38
| | | | | | | | | | | | | | | | | | | | The Terms vector here represented a polynomial of all possible counters, and was used to simplify expressions when generating coverage mapping. There are a few problems with this: 1. Keeping the vector as a member is wasteful, since we clear it every time we use it. 2. Most expressions refer to a subset of the counters, so we end up iterating over a large number of zeros doing nothing a lot of the time. This updates the user of the vector to store the terms locally, and uses a sort and combine approach so that we only operate on counters that are actually used in a given expression. For small cases this makes very little difference, but in cases with a very large number of counted regions this is a significant performance fix. llvm-svn: 218879
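The sort-and-combine approach mentioned here can be sketched in Python. The representation of a term as a `(counter_id, factor)` pair and the function name are assumptions made for illustration; the real code may store terms differently.

```python
def simplify_terms(terms):
    """Combine terms that refer to the same counter and drop zero factors.

    terms is a list of (counter_id, factor) pairs in any order. Only the
    counters that actually appear are touched, instead of walking a dense
    vector with one slot per possible counter.
    """
    combined = []
    for counter_id, factor in sorted(terms):  # equal counter ids become adjacent
        if combined and combined[-1][0] == counter_id:
            combined[-1] = (counter_id, combined[-1][1] + factor)
        else:
            combined.append((counter_id, factor))
    return [(cid, f) for cid, f in combined if f != 0]


# (c1 + c3) - c3 + c1 collapses to 2 * c1
print(simplify_terms([(3, 1), (1, 1), (3, -1), (1, 1)]))
```

The cost is proportional to the number of terms in the expression (plus the sort), not to the total number of counters in the function.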
* Use the local variable that other clauses around here are already using.Sanjay Patel2014-10-021-1/+1
| | | | llvm-svn: 218876
* Remove duplicate function names from comments. NFC.Sanjay Patel2014-10-021-43/+35
| | | | llvm-svn: 218875
* [NVPTX] Remove dead code.Tilmann Scheller2014-10-021-9/+3
| | | | | | Found by the Clang static analyzer. llvm-svn: 218874
* Support padding unaligned data in .text.Joerg Sonnenberger2014-10-021-1/+6
| | | | llvm-svn: 218870
* Silence a -Wsign-compare warning. NFC.Aaron Ballman2014-10-021-2/+2
| | | | llvm-svn: 218868
* [BUG][INDVAR] Fix for PR21014: wrong SCEV operands commuting for non-commutative instructionsZinovy Nis2014-10-021-3/+12
| | | | | | | | | | | | My commit rL216160 introduced a bug PR21014: IndVars widens code 'for (i = ; i < ...; i++) arr[ CONST - i]' into 'for (i = ; i < ...; i++) arr[ i - CONST]' thus inverting index expression. This patch fixes it. Thanks to Jörg Sonnenberger for pointing this out. Differential Revision: http://reviews.llvm.org/D5576 llvm-svn: 218867
* InstrProf: Simplify counting a file's regions when writing coverage (NFC)Justin Bogner2014-10-021-34/+24
| | | | | | | | | | | | When writing a coverage mapping we iterate through the mapping regions in order of FileID, but we were then repeatedly searching from the beginning of the list to count the number of regions with a given FileID. It is simpler and more efficient to search forward from the current iterator to find the number of regions. llvm-svn: 218842
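The forward scan described in this commit can be sketched in Python. The input shape (a list of `(file_id, payload)` pairs already ordered by file ID) and the function name are assumptions for illustration.

```python
def regions_per_file(regions):
    """Count consecutive regions for each file ID in one forward pass.

    regions must already be sorted by file_id. Each group is counted by
    scanning forward from the current position, never from the start.
    """
    counts = []
    i = 0
    while i < len(regions):
        file_id = regions[i][0]
        j = i
        while j < len(regions) and regions[j][0] == file_id:
            j += 1  # advance to the first region of the next file
        counts.append((file_id, j - i))
        i = j       # resume where this group ended
    return counts


sample = [(0, "a"), (0, "b"), (1, "c"), (3, "d"), (3, "e"), (3, "f")]
print(regions_per_file(sample))
```

Every region is visited once, so the whole pass is linear in the number of regions.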
* [x86] Improve and correct how the new vector shuffle lowering wasChandler Carruth2014-10-011-8/+32
| | | | | | | | | | | | | | | | | | | matching and lowering 64-bit insertions. The first problem was that we weren't looking through bitcasts to discover that we *could* lower as insertions. Once fixed, we in turn weren't looking through bitcasts to discover that we could fold a load into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR node around the inserted element and instead were passing a scalar to a DAG node that expected a vector. It turns out there are some patterns that will "lower" this into the correct asm, but the rest of the X86 backend is very unhappy with such antics. This should fix a few more edge case regressions I've spotted going through the regression test suite to enable the new vector shuffle lowering. llvm-svn: 218839
* [MCJIT] Don't crash in debugging output for sections that aren't emitted.Lang Hames2014-10-011-0/+5
| | | | llvm-svn: 218836
* constify the TargetMachine argument used in the subtarget andEric Christopher2014-10-014-4/+4
| | | | | | lowering constructors. llvm-svn: 218832
* DIBuilder: Remove duplicated comments, NFCDuncan P. N. Exon Smith2014-10-011-77/+1
| | | | | | | These comments already appear in the header, and some of them are out-of-date anyway. llvm-svn: 218829
* Revert "DIBuilder: Remove dead code"Duncan P. N. Exon Smith2014-10-011-0/+6
| | | | | | | | | This reverts commit r218820. It turns out that Adrian has an outstanding SROA patch that uses this. I've updated it to forward to `createExpression()`. llvm-svn: 218828