summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Do not destroy external linkage when deleting function bodyPetar Jovanovic2014-09-231-1/+1
| | | | | | | | | | | The function deleteBody() converts the linkage to external and thus destroys original linkage type value. Lack of correct linkage type causes wrong relocations to be emitted later. Calling dropAllReferences() instead of deleteBody() will fix the issue. Differential Revision: http://reviews.llvm.org/D5415 llvm-svn: 218302
* [x86] Teach the vector comment parsing and printing to correctly handleChandler Carruth2014-09-235-56/+113
| | | | | | | | | | | | | | | | | undef in the shuffle mask. This shows up when we're printing comments during lowering and we still have an IR-level constant hanging around that models undef. A nice consequence of this is *much* prettier test cases where the undef lanes actually show up as undef rather than as a particular set of values. This also allows us to print shuffle comments in cases that use undef such as the recently added variable VPERMILPS lowering. Now those test cases have nice shuffle comments attached with their details. The shuffle lowering for PSHUFB has been augmented to use undef, and the shuffle combining has been augmented to comprehend it. llvm-svn: 218301
* [x86] Teach the AVX1 path of the new vector shuffle lowering one moreChandler Carruth2014-09-237-25/+78
| | | | | | | | | | | | | | | | | | | | | | trick that I missed. VPERMILPS has a non-immediate memory operand mode that allows it to do asymetric shuffles in the two 128-bit lanes. Use this rather than two shuffles and a blend. However, it turns out the variable shuffle path to VPERMILPS (and VPERMILPD, although that one offers no functional differenc from the immediate operand other than variability) wasn't even plumbed through codegen. Do such plumbing so that we can reasonably emit a variable-masked VPERMILP instruction. Also plumb basic comment parsing and printing through so that the tests are reasonable. There are still a few tests which don't show the shuffle pattern. These are tests with undef lanes. I'll teach the shuffle decoding and printing to handle undef mask entries in a follow-up. I've looked at the masks and they seem reasonable. llvm-svn: 218300
* Windows/DynamicLibrary.inc: Remove 'extern "C"' in ELM_Callback.NAKAMURA Takumi2014-09-231-1/+1
| | | | | | 'extern "C" static' is not accepted by g++-4.7. Rather to tweak, I just removed 'extern "C"', since it doesn't affect the ABI. llvm-svn: 218290
* Converting terminalHasColors mutex to a global ManagedStatic to avoid the ↵Chris Bieneman2014-09-221-2/+4
| | | | | | static destructor. llvm-svn: 218283
* [x86] Rename X86ISD::VPERMILP to X86ISD::VPERMILPI (and the same for theChandler Carruth2014-09-225-30/+30
| | | | | | | | td pattern). Currently we only model the immediate operand variation of VPERMILPS and VPERMILPD, we should make that clear in the pseudos used. Will be adding support for the variable mask variant in my next commit. llvm-svn: 218282
* Fix a "typo" from my previous commit.Kaelyn Takata2014-09-221-1/+1
| | | | llvm-svn: 218281
* Silence unused variable warnings in the new stub functions that occurKaelyn Takata2014-09-221-1/+3
| | | | | | when assertions are disabled. llvm-svn: 218280
* [x86] Stub out the integer lowering of 256-bit vectors with AVX2Chandler Carruth2014-09-221-4/+87
| | | | | | | support. No interesting functionality yet, but this will let me implement one vector type at a time. llvm-svn: 218277
* In this callback ModuleName includes the file path.Yaron Keren2014-09-221-26/+5
| | | | | | | | | | | | | | | Comparing ModuleName to the file names listed will always fail. I wonder how this code ever worked and what its purpose was. Why exclude the msvc runtime DLLs but not exclude all Windows system DLLs? Anyhow, it does not function as intended. clang-formatted as well. llvm-svn: 218276
* [FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left ↵Juergen Ributzka2014-09-221-2/+3
| | | | | | | | | | | for booleans (i1). Shift-left immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. This should fix a bug found by Chad. llvm-svn: 218275
* ms-inline-asm: Fix parsing label names inside bracket expressionsEhsan Akhgari2014-09-221-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes a couple of issues. One is ensuring that AOK_Label rewrite rules have a lower priority than AOK_Skip rules, as AOK_Skip needs to be able to skip the brackets properly. The other part of the fix ensures that we don't overwrite Identifier when looking up the identifier, and that we use the locally available information to generate the AOK_Label rewrite in ParseIntelIdentifier. Doing that in CreateMemForInlineAsm would be problematic since the Start location there may point to the beginning of a bracket expression, and not necessarily the beginning of an identifier. This also means that we don't need to carry around the InternlName field, which helps simplify the code. Test Plan: This will be tested on the clang side. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5445 llvm-svn: 218270
* MC: ReadOnlyWithRel section kinds should map to rdata in COFFDavid Majnemer2014-09-221-4/+4
| | | | | | | Don't consider ReadOnlyWithRel as a writable section in COFF, they really belong in .rdata. llvm-svn: 218268
* Use broadcasts to optimize overall size when loading constant splat vectors ↵Sanjay Patel2014-09-222-7/+33
| | | | | | | | | | | | | | | (x86-64 with AVX or AVX2). We generate broadcast instructions on CPUs with AVX2 to load some constant splat vectors. This patch should preserve all existing behavior with regular optimization levels, but also use splats whenever possible when optimizing for *size* on any CPU with AVX or AVX2. The tradeoff is up to 5 extra instruction bytes for the broadcast instruction to save at least 8 bytes (up to 31 bytes) of constant pool data. Differential Revision: http://reviews.llvm.org/D5347 llvm-svn: 218263
* Revert "R600/SI: Add support for global atomic add"Tom Stellard2014-09-224-111/+3
| | | | | | | | | This reverts commit r218254. The global_atomics.ll test fails with asserts disabled. For some reason, the compiler fails to produce the atomic no return variants. llvm-svn: 218257
* R600/SI: Add support for global atomic addTom Stellard2014-09-224-3/+111
| | | | llvm-svn: 218254
* R600/SI: Remove modifier operands from V_CNDMASK_B32_e64Tom Stellard2014-09-222-8/+3
| | | | | | Modifiers don't work for this instruction. llvm-svn: 218253
* R600: Don't set BypassSlowDiv for 64-bit divisionTom Stellard2014-09-221-3/+0
| | | | | | | | | | | | | BypassSlowDiv is used by codegen prepare to insert a run-time check to see if the operands to a 64-bit division are really 32-bit values and if they are it will do 32-bit division instead. This is not useful for R600, which has predicated control flow since both the 32-bit and 64-bit paths will be executed in most cases. It also increases code size which can lead to more instruction cache misses. llvm-svn: 218252
* R600/SI: Use ISD::MUL instead of ISD::UMULO when lowering divisionTom Stellard2014-09-221-3/+3
| | | | | | | ISD::MUL and ISD:UMULO are the same except that UMULO sets an overflow bit. Since we aren't using the overflow bit, we should use ISD::MUL. llvm-svn: 218251
* R600/SI: Add enums for some hard-coded valuesTom Stellard2014-09-224-26/+87
| | | | llvm-svn: 218250
* [x32] Fix segmented stacks supportPavel Chupin2014-09-227-41/+65
| | | | | | | | | | | | | | | | Summary: Update segmented-stacks*.ll tests with x32 target case and make corresponding changes to make them pass. Test Plan: tests updated with x32 target Reviewers: nadav, rafael, dschuff Subscribers: llvm-commits, zinovy.nis Differential Revision: http://reviews.llvm.org/D5245 llvm-svn: 218247
* [dwarfdump] Dump full filenames as DW_AT_(decl|call)_file attribute valuesFrederic Riss2014-09-222-5/+15
| | | | | | | | | | Reviewers: dblaikie samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5192 llvm-svn: 218246
* Allow DWARFDebugInfoEntryMinimal::getSubroutineName to resolve cross-unit ↵Frederic Riss2014-09-221-4/+16
| | | | | | | | | | | | | | references. Summary: getSubroutineName is currently only used by llvm-symbolizer, thus add a binary test containing a cross-cu inlining example. Reviewers: samsonov, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5394 llvm-svn: 218245
* Fix assert when decoding PSHUFB maskRobert Lougher2014-09-221-6/+4
| | | | | | | | | | The PSHUFB mask decode routine used to assert if the mask index was out of range (<0 or greater than the size of the vector). The problem is, we can legitimately have a PSHUFB with a large index using intrinsics. The instruction only uses the least significant 4 bits. This change removes the assert and masks the index to match the instruction behaviour. llvm-svn: 218242
* Downgrade DWARF2 section limit error to a warningOliver Stannard2014-09-222-5/+7
| | | | | | | | | We currently emit an error when trying to assemble a file with more than one section using DWARF2 debug info. This should be a warning instead, as the resulting file will still be usable, but with a degraded debug illusion. llvm-svn: 218241
* Add two thresholds lvi-overdefined-BB-threshold and lvi-overdefined-thresholdJiangning Liu2014-09-221-2/+32
| | | | | | | | | | for LVI algorithm. For a specific value to be lowered, when the number of basic blocks being checked for overdefined lattice value is larger than lvi-overdefined-BB-threshold, or the times of encountering overdefined value for a single basic block is larger than lvi-overdefined-threshold, the LVI algorithm will stop further lowering the lattice value. llvm-svn: 218231
* ms-inline-asm: Add a sema callback for looking up label namesEhsan Akhgari2014-09-222-8/+36
| | | | | | | | | | | | | | | The implementation of the callback in clang's Sema will return an internal name for labels. Test Plan: Will be tested in clang. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4587 llvm-svn: 218229
* [x86] Back out a bad choice about lowering v4i64 and pave the way forChandler Carruth2014-09-221-44/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a more sane approach to AVX2 support. Fundamentally, there is no useful way to lower integer vectors in AVX. None. We always end up with a VINSERTF128 in the end, so we might as well eagerly switch to the floating point domain and do everything there. This cleans up lots of weird and unlikely to be correct differences between integer and floating point shuffles when we only have AVX1. The other nice consequence is that by doing things this way we will make it much easier to write the integer lowering routines as we won't need to duplicate the logic to check for AVX vs. AVX2 in each one -- if we actually try to lower a 256-bit vector as an integer vector, we have AVX2 and can rely on it. I think this will make the code much simpler and more comprehensible. Currently, I've disabled *all* support for AVX2 so that we always fall back to AVX. This keeps everything working rather than asserting. That will go away with the subsequent series of patches that provide a baseline AVX2 implementation. Please note, I'm going to implement AVX2 *without access to hardware*. That means I cannot correctness test this path. I will be relying on those with access to AVX2 hardware to do correctness testing and fix bugs here, but as a courtesy I'm trying to sketch out the framework for the new-style vector shuffle lowering in the context of the AVX2 ISA. llvm-svn: 218228
* [x86] Teach the new vector shuffle lowering how to cleverly lower singleChandler Carruth2014-09-211-3/+22
| | | | | | | | | | | | | input v8f32 shuffles which are not 128-bit lane crossing but have different shuffle patterns in the low and high lanes. This removes most of the extract/insert traffic that was unnecessary and is particularly good at lowering cases where only one of the two lanes is shuffled at all. I've also added a collection of test cases with undef lanes because this lowering is somewhat more sensitive to undef lanes than others. llvm-svn: 218226
* Fix typoMatt Arsenault2014-09-211-4/+3
| | | | llvm-svn: 218223
* Use llvm_unreachable instead of assert(!)Matt Arsenault2014-09-211-2/+2
| | | | llvm-svn: 218222
* R600/SI: Don't use strings for single charactersMatt Arsenault2014-09-211-18/+18
| | | | llvm-svn: 218221
* Remove redundant if test.Lang Hames2014-09-211-4/+1
| | | | llvm-svn: 218220
* Refactor reciprocal square root estimate into target-independent function; NFC.Sanjay Patel2014-09-213-58/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | This is purely a plumbing patch. No functional changes intended. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . The first step is to add a target hook for RSQRTE, take the already target-independent code selfishly hoarded by PPC, and put it into DAGCombiner. Next steps: The code in DAGCombiner::BuildRSQRTE() should be refactored further; tests that exercise that logic need to be added. Logic in PPCTargetLowering::BuildRSQRTE() should be hoisted into DAGCombiner. X86 and AArch64 overrides for TargetLowering.BuildRSQRTE() should be added. Differential Revision: http://reviews.llvm.org/D5425 llvm-svn: 218219
* mop up: "Don’t duplicate function or class name at the beginning of the ↵Sanjay Patel2014-09-213-59/+45
| | | | | | comment." llvm-svn: 218218
* [x86] With the stronger canonicalization of shuffles added in r218216,Chandler Carruth2014-09-211-4/+0
| | | | | | | the new vector shuffle lowering no longer needs to check both symmetric forms of UNPCK patterns for v4f64. llvm-svn: 218217
* [x86] Teach the new vector shuffle lowering to re-use the SHUFPSChandler Carruth2014-09-211-4/+23
| | | | | | | | | | | | | lowering when it can use a symmetric SHUFPS across both 128-bit lanes. This required making the SHUFPS lowering tolerant of other vector types, and adjusting our canonicalization to canonicalize harder. This is the last of the clever uses of symmetry I've thought of for v8f32. The rest of the tricks I'm aware of here are to work around assymetry in the mask. llvm-svn: 218216
* [x86] Refactor the logic to form SHUFPS instruction patterns to lowerChandler Carruth2014-09-211-89/+108
| | | | | | | | | | a generic vector shuffle mask into a helper that isn't specific to the other things that influence which choice is made or the specific types used with the instruction. No functionality changed. llvm-svn: 218215
* [x86] Teach the new vector shuffle lowering the basics about insertionChandler Carruth2014-09-211-0/+18
| | | | | | | | of a single element into a zero vector for v4f64 and v4i64 in AVX. Ironically, there is less to see here because xor+blend is so crazy fast that we can't really beat that to zero the high 128-bit lane. llvm-svn: 218214
* [x86] Teach the new vector shuffle lowering how to lower to UNPCKLPS andChandler Carruth2014-09-211-2/+6
| | | | | | | | | | | | | UNPCKHPS with AVX vectors by recognizing those patterns when they are repeated for both 128-bit lanes. With this, we now generate the exact same (really nice) code for Quentin's avx_test_case.ll which was the most significant regression reported for the new shuffle lowering. In fact, I'm out of specific test cases for AVX lowering, the rest were AVX2 I think. However, there are a bunch of pretty obvious remaining things to improve with AVX... llvm-svn: 218213
* [x86] Begin teaching the new vector shuffle lowering among the mostChandler Carruth2014-09-211-2/+28
| | | | | | | | | | important bits of cleverness: to detect and lower repeated shuffle patterns between the two 128-bit lanes with a single instruction. This patch just teaches it how to lower single-input shuffles that fit this model using VPERMILPS. =] There is more that needs to happen here. llvm-svn: 218211
* [x86] Explicitly lower to a blend early if it is trivial to do so forChandler Carruth2014-09-211-0/+5
| | | | | | | | | | | | v8f32 shuffles in the new vector shuffle lowering code. This is very cheap to do and makes it much more clear that anything more expensive but overlapping with this lowering should be selected afterward (for example using AVX2's VPERMPS). However, no functionality changed here as without this code we would fall through to create no-op shuffles of each input and a blend. =] llvm-svn: 218209
* [x86] Teach the new vector shuffle lowering of v4f64 to prefer a directChandler Carruth2014-09-211-0/+5
| | | | | | | | | | VBLENDPD over using VSHUFPD. While the 256-bit variant of VBLENDPD slows down to the same speed as VSHUFPD on Sandy Bridge CPUs, it has twice the reciprocal throughput on Ivy Bridge CPUs much like it does everywhere for 128-bits. There isn't a downside, so just eagerly use this instruction when it suffices. llvm-svn: 218208
* [x86] Switch the blend implementation to use a MVT switch rather thanChandler Carruth2014-09-211-18/+25
| | | | | | | | | awkward conditions. The readability improvement of this will be even more important as I generalize it to handle more types. No functionality changed. llvm-svn: 218205
* [x86] Remove some essentially lying comments from the v4f64 path of theChandler Carruth2014-09-211-6/+0
| | | | | | new vector shuffle lowering. llvm-svn: 218204
* [x86] Fix a helper to reflect that what we actually care about isChandler Carruth2014-09-211-9/+12
| | | | | | | | | | 128-bit lane crossings, not 'half' crossings. This came up in code review ages ago, but I hadn't really addresesd it. Also added some documentation for the helper. No functionality changed. llvm-svn: 218203
* [x86] Teach the new vector shuffle lowering the first step toward moreChandler Carruth2014-09-211-1/+40
| | | | | | | | | | | | actual support for complex AVX shuffling tricks. We can do independent blends of the low and high 128-bit lanes of an avx vector, so shuffle the inputs into place and then do the blend at 256 bits. This will in many cases remove one blend instruction. The next step is to permute the low and high halves in-place rather than extracting them and re-inserting them. llvm-svn: 218202
* MC: Support aligned COMMON symbols for COFFDavid Majnemer2014-09-212-5/+16
| | | | | | | | | | | | | | | | | | | | | link.exe: Fuzz testing has shown that COMMON symbols with size > 32 will always have an alignment of at least 32 and all symbols with size < 32 will have an alignment of at least the largest power of 2 less than the size of the symbol. binutils: The BFD linker essentially work like the link.exe behavior but with alignment 4 instead of 32. The BFD linker also supports an extension to COFF which adds an -aligncomm argument to the .drectve section which permits specifying a precise alignment for a variable but MC currently doesn't support editing .drectve in this way. With all of this in mind, we decide to play a little trick: we can ensure that the alignment will be respected by bumping the size of the global to it's alignment. llvm-svn: 218201
* RTDyldMemoryManager::getSymbolAddress(): Make sure to return 0 if symbol ↵NAKAMURA Takumi2014-09-201-0/+2
| | | | | | name is not met. [-Wreturn-type] llvm-svn: 218195
* mop up: "Don’t duplicate function or class name at the beginning of the ↵Sanjay Patel2014-09-201-25/+21
| | | | | | comment." llvm-svn: 218194
OpenPOWER on IntegriCloud