path: root/llvm/lib
Commit message (Author, Age, Files, Lines changed)
...
* AArch64: allow constant expressions for shifted reg literals (Jim Grosbach, 2014-09-23, 1 file, -6/+7)
  e.g., add w1, w2, w3, lsl #(2 - 1)
  This sort of thing comes up in pre-processed assembly playing macro games.
  Still validate that it's an assembly time constant. The early exit error
  check was just a bit overzealous and disallowed a left paren.
  rdar://18430542
  llvm-svn: 218336
* [x86] Teach the rest of the 'target shuffle' machinery about blends and (Chandler Carruth, 2014-09-23, 2 files, -1/+30)
  add VPBLENDD to the InstPrinter's comment generation so we get nice
  comments everywhere.
  Now that we have the nice comments, I can see the bug introduced by a
  silly typo in the commit that enabled VPBLENDD, and have fixed it. Yay
  tests that are easy to inspect.
  llvm-svn: 218335
* R600/SI: Clean up checks for legality of immediate operands (Tom Stellard, 2014-09-23, 8 files, -67/+149)
  There are new register classes VCSrc_* which represent operands that can
  take an SGPR, VGPR or inline constant. The VSrc_* class is now used to
  represent operands that can take an SGPR, VGPR, or a 32-bit immediate.
  This allows us to have more accurate checks for legality of immediates,
  since before we had no way to distinguish between operands that supported
  any 32-bit immediate and operands which could only support inline
  constants.
  llvm-svn: 218334
* [X86] Make wide loads be managed by AtomicExpand (Robin Morisset, 2014-09-23, 2 files, -28/+35)
  Summary:
  AtomicExpand already had logic for expanding wide loads and stores on
  LL/SC architectures, and for expanding wide stores on CmpXchg
  architectures, but not for wide loads on CmpXchg architectures. This
  patch fills this hole, and makes use of this new feature in the X86
  backend.
  Only one functional change: we now lose the SynchScope attribute. It is
  regrettable, but I have another patch that I will submit soon that will
  solve this for all of AtomicExpand (it seemed better to split it apart
  as it is a different concern).
  Test Plan: make check-all (lots of tests for this functionality already exist)
  Reviewers: jfb
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5404
  llvm-svn: 218332
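  As an illustration of the expansion described above (a sketch of the idea,
  not the AtomicExpand code itself): a wide atomic load on a compare-exchange
  architecture can be emulated with a do-nothing cmpxchg. The helper name is
  invented, and it assumes a compiler/target that provides a 16-byte
  compare-and-swap (e.g. x86-64 built with -mcx16).

      // Sketch only: emulate an atomic 128-bit load with a compare-exchange.
      // A cmpxchg whose "expected" and "new" values are both zero never changes
      // the stored value, but it always returns the current contents atomically,
      // which is exactly what a wide atomic load needs on a CmpXchg architecture
      // (note it still performs a write when *p == 0, so the page must be writable).
      using u128 = unsigned __int128;

      u128 atomic_load_wide(u128 *p) {
        return __sync_val_compare_and_swap(p, (u128)0, (u128)0);
      }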
* [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate (Robin Morisset, 2014-09-23, 4 files, -2/+50)
  Summary:
  This patch makes use of AtomicExpandPass in Power for inserting fences
  around atomics, as part of an effort to remove fence insertion from
  SelectionDAGBuilder. As a big bonus, it lets us use sync 1 (lightweight
  sync, often used by the mnemonic lwsync) instead of sync 0 (heavyweight
  sync) in many cases.
  I also added a test, as there was no test for the barriers emitted by
  the Power backend for atomic loads and stores.
  Test Plan: new test + make check-all
  Reviewers: jfb
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5180
  llvm-svn: 218331
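  For context, a minimal sketch of the barrier distinction being described
  (the conventional PowerPC mapping, not the backend's code): sync 0 is the
  heavyweight "sync" mnemonic and sync 1 the lightweight "lwsync". The
  function names below are invented for illustration.

      #if defined(__powerpc__) || defined(__powerpc64__)
      // Release/acquire orderings can be enforced with the lightweight "lwsync",
      // while full sequential consistency needs the heavyweight "sync".
      static inline void release_acquire_fence() {
        __asm__ __volatile__("lwsync" ::: "memory");   // sync 1
      }
      static inline void seq_cst_fence() {
        __asm__ __volatile__("sync" ::: "memory");     // sync 0
      }
      #endif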
* Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder (Robin Morisset, 2014-09-23, 3 files, -54/+86)
  Summary:
  The goal is to eventually remove all the code related to
  getInsertFencesForAtomic in SelectionDAGBuilder as it is wrong (designed
  for ARM, not really portable, works mostly by accident because the
  backends are overly conservative), and repeats the same logic that goes
  in emitLeading/TrailingFence.
  In this patch, I make AtomicExpandPass insert the fences as it knows
  better where to put them. Because this requires getting the fences and
  not just passing an IRBuilder around, I had to change the return type of
  emitLeading/TrailingFence.
  This code only triggers on ARM for now. Because it is earlier in the
  pipeline than SelectionDAGBuilder, it triggers and lowers atomic accesses
  to atomic so SelectionDAGBuilder does not add barriers anymore on ARM.
  If this patch is accepted I plan to implement emitLeading/TrailingFence
  for all backends that setInsertFencesForAtomic(true), which will allow
  both making them less conservative and simplifying SelectionDAGBuilder
  once they are all using this interface.
  This should not cause any functional change so the existing tests are
  used and not modified.
  Test Plan: make check-all, benefits from existing tests of atomics on ARM
  Reviewers: jfb, t.p.northover
  Subscribers: aemerson, llvm-commits
  Differential Revision: http://reviews.llvm.org/D5179
  llvm-svn: 218329
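  As a rough picture of what "bracketing" an instruction with fences means,
  here is a minimal free-standing sketch against a recent IRBuilder API. The
  helper name and signature are assumptions for illustration; this is not the
  actual AtomicExpandPass::bracketInstWithFences implementation.

      #include <utility>
      #include "llvm/IR/IRBuilder.h"
      #include "llvm/IR/Instructions.h"

      using namespace llvm;

      // Hypothetical helper: emit one fence immediately before and one
      // immediately after an atomic instruction, and hand both fences back to
      // the caller (assumes I is not a terminator, so getNextNode() is valid).
      static std::pair<Instruction *, Instruction *>
      bracketWithFences(Instruction *I, AtomicOrdering Ord) {
        IRBuilder<> Builder(I);                        // insert point: before I
        Instruction *Leading = Builder.CreateFence(Ord);
        Builder.SetInsertPoint(I->getNextNode());      // insert point: after I
        Instruction *Trailing = Builder.CreateFence(Ord);
        return {Leading, Trailing};
      }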
* [MCJIT] Fix some more RuntimeDyld debugging output format specifiers. (Lang Hames, 2014-09-23, 1 file, -3/+3)
  llvm-svn: 218328
* [MCJIT] Remove PPCRelocations.h - it's no longer used. (Lang Hames, 2014-09-23, 1 file, -56/+0)
  This was overlooked in r218320, which removed the relocation headers for
  other targets. Thanks to Ulrich Weigand for catching it.
  llvm-svn: 218327
* Just add a fixme about a possibly faster implementation of some atomic loads on some ARM processors (Robin Morisset, 2014-09-23, 1 file, -0/+3)
  llvm-svn: 218326
* Fix typo (Matt Arsenault, 2014-09-23, 1 file, -2/+3)
  llvm-svn: 218324
* [x86] Teach the new shuffle lowering's blend functionality to use AVX2's (Chandler Carruth, 2014-09-23, 1 file, -16/+35)
  VPBLENDD where appropriate even on 128-bit vectors.
  According to Agner's tables, this instruction is significantly higher
  throughput (can execute on any port) on Haswell chips so we should
  aggressively try to form it when available. Sadly, this loses our
  delightful shuffle comments. I'll add those back for VPBLENDD next.
  llvm-svn: 218322
* [MCJIT] Nuke MachineRelocation and MachineCodeEmitter. Now that the old JIT is (Lang Hames, 2014-09-23, 6 files, -226/+0)
  gone they're no longer needed.
  llvm-svn: 218320
* [MCJIT] Remove a few more references to JITMemoryManager that survived r218316. (Lang Hames, 2014-09-23, 1 file, -1/+0)
  llvm-svn: 218318
* [MCJIT] Remove #include of JITMemoryManager that accidentally survived r218316. (Lang Hames, 2014-09-23, 1 file, -1/+0)
  llvm-svn: 218317
* [MCJIT] Delete the JITMemoryManager and associated APIs. (Lang Hames, 2014-09-23, 3 files, -907/+4)
  This patch removes the old JIT memory manager (which does not provide any
  useful functionality now that the old JIT is gone), and migrates the few
  remaining clients over to SectionMemoryManager.
  http://llvm.org/PR20848
  llvm-svn: 218316
* Use SDValue bool operator to reduce code. No functional change. (Sanjay Patel, 2014-09-23, 1 file, -9/+6)
  llvm-svn: 218314
* Fix segfault in AArch64 backend with -g and -mbig-endian (Oliver Stannard, 2014-09-23, 1 file, -2/+2)
  Fix a null pointer dereference when trying to swap the endianness of
  fixups in the .eh_frame section in the AArch64 backend.
  llvm-svn: 218311
* Loop instead of individual def's for each GPR. (Sid Manning, 2014-09-23, 1 file, -32/+3)
  Differential Revision: http://reviews.llvm.org/D5450
  llvm-svn: 218305
* Do not destroy external linkage when deleting function body (Petar Jovanovic, 2014-09-23, 1 file, -1/+1)
  The function deleteBody() converts the linkage to external and thus
  destroys the original linkage type value. Lack of correct linkage type
  causes wrong relocations to be emitted later. Calling dropAllReferences()
  instead of deleteBody() will fix the issue.
  Differential Revision: http://reviews.llvm.org/D5415
  llvm-svn: 218302
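  To make the distinction concrete, a small C++ sketch against the LLVM API
  of the behaviour the message describes; the helper name is invented for
  illustration and this is not the patched code itself.

      #include <cassert>
      #include "llvm/IR/Function.h"
      #include "llvm/IR/GlobalValue.h"

      using namespace llvm;

      // Hypothetical helper: discard a function's body without touching its
      // linkage. Function::deleteBody() is roughly dropAllReferences() followed
      // by setLinkage(ExternalLinkage), so calling it loses e.g. internal
      // linkage; dropAllReferences() alone removes the body and leaves the
      // linkage intact, which keeps later relocations correct.
      static void discardBodyKeepLinkage(Function &F) {
        GlobalValue::LinkageTypes OriginalLinkage = F.getLinkage();
        F.dropAllReferences();                        // body is gone...
        assert(F.getLinkage() == OriginalLinkage &&   // ...linkage is unchanged
               "dropAllReferences must not alter linkage");
        (void)OriginalLinkage;
      }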
* [x86] Teach the vector comment parsing and printing to correctly handle (Chandler Carruth, 2014-09-23, 5 files, -56/+113)
  undef in the shuffle mask.
  This shows up when we're printing comments during lowering and we still
  have an IR-level constant hanging around that models undef.
  A nice consequence of this is *much* prettier test cases where the undef
  lanes actually show up as undef rather than as a particular set of
  values. This also allows us to print shuffle comments in cases that use
  undef such as the recently added variable VPERMILPS lowering. Now those
  test cases have nice shuffle comments attached with their details.
  The shuffle lowering for PSHUFB has been augmented to use undef, and the
  shuffle combining has been augmented to comprehend it.
  llvm-svn: 218301
* [x86] Teach the AVX1 path of the new vector shuffle lowering one more (Chandler Carruth, 2014-09-23, 7 files, -25/+78)
  trick that I missed.
  VPERMILPS has a non-immediate memory operand mode that allows it to do
  asymmetric shuffles in the two 128-bit lanes. Use this rather than two
  shuffles and a blend.
  However, it turns out the variable shuffle path to VPERMILPS (and
  VPERMILPD, although that one offers no functional difference from the
  immediate operand other than variability) wasn't even plumbed through
  codegen. Do such plumbing so that we can reasonably emit a
  variable-masked VPERMILP instruction.
  Also plumb basic comment parsing and printing through so that the tests
  are reasonable. There are still a few tests which don't show the shuffle
  pattern. These are tests with undef lanes. I'll teach the shuffle decoding
  and printing to handle undef mask entries in a follow-up. I've looked at
  the masks and they seem reasonable.
  llvm-svn: 218300
* Windows/DynamicLibrary.inc: Remove 'extern "C"' in ELM_Callback. (NAKAMURA Takumi, 2014-09-23, 1 file, -1/+1)
  'extern "C" static' is not accepted by g++-4.7. Rather than tweak it, I
  just removed 'extern "C"', since it doesn't affect the ABI.
  llvm-svn: 218290
* Converting terminalHasColors mutex to a global ManagedStatic to avoid the static destructor. (Chris Bieneman, 2014-09-22, 1 file, -2/+4)
  llvm-svn: 218283
* [x86] Rename X86ISD::VPERMILP to X86ISD::VPERMILPI (and the same for the (Chandler Carruth, 2014-09-22, 5 files, -30/+30)
  td pattern).
  Currently we only model the immediate operand variation of VPERMILPS and
  VPERMILPD, we should make that clear in the pseudos used. Will be adding
  support for the variable mask variant in my next commit.
  llvm-svn: 218282
* Fix a "typo" from my previous commit. (Kaelyn Takata, 2014-09-22, 1 file, -1/+1)
  llvm-svn: 218281
* Silence unused variable warnings in the new stub functions that occur (Kaelyn Takata, 2014-09-22, 1 file, -1/+3)
  when assertions are disabled.
  llvm-svn: 218280
* [x86] Stub out the integer lowering of 256-bit vectors with AVX2 (Chandler Carruth, 2014-09-22, 1 file, -4/+87)
  support. No interesting functionality yet, but this will let me implement
  one vector type at a time.
  llvm-svn: 218277
* In this callback ModuleName includes the file path. (Yaron Keren, 2014-09-22, 1 file, -26/+5)
  Comparing ModuleName to the file names listed will always fail. I wonder
  how this code ever worked and what its purpose was. Why exclude the msvc
  runtime DLLs but not exclude all Windows system DLLs? Anyhow, it does not
  function as intended. clang-formatted as well.
  llvm-svn: 218276
* [FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left for booleans (i1). (Juergen Ributzka, 2014-09-22, 1 file, -2/+3)
  Shift-left immediate with sign-/zero-extensions also works for boolean
  values. Update the assert and the test cases to reflect that fact.
  This should fix a bug found by Chad.
  llvm-svn: 218275
* ms-inline-asm: Fix parsing label names inside bracket expressions (Ehsan Akhgari, 2014-09-22, 1 file, -9/+8)
  Summary:
  This fixes a couple of issues. One is ensuring that AOK_Label rewrite
  rules have a lower priority than AOK_Skip rules, as AOK_Skip needs to be
  able to skip the brackets properly.
  The other part of the fix ensures that we don't overwrite Identifier when
  looking up the identifier, and that we use the locally available
  information to generate the AOK_Label rewrite in ParseIntelIdentifier.
  Doing that in CreateMemForInlineAsm would be problematic since the Start
  location there may point to the beginning of a bracket expression, and
  not necessarily the beginning of an identifier.
  This also means that we don't need to carry around the InternalName
  field, which helps simplify the code.
  Test Plan: This will be tested on the clang side.
  Reviewers: rnk
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5445
  llvm-svn: 218270
* MC: ReadOnlyWithRel section kinds should map to rdata in COFF (David Majnemer, 2014-09-22, 1 file, -4/+4)
  Don't consider ReadOnlyWithRel as a writable section in COFF, they really
  belong in .rdata.
  llvm-svn: 218268
* Use broadcasts to optimize overall size when loading constant splat vectors (x86-64 with AVX or AVX2). (Sanjay Patel, 2014-09-22, 2 files, -7/+33)
  We generate broadcast instructions on CPUs with AVX2 to load some
  constant splat vectors. This patch should preserve all existing behavior
  with regular optimization levels, but also use splats whenever possible
  when optimizing for *size* on any CPU with AVX or AVX2.
  The tradeoff is up to 5 extra instruction bytes for the broadcast
  instruction to save at least 8 bytes (up to 31 bytes) of constant pool
  data.
  Differential Revision: http://reviews.llvm.org/D5347
  llvm-svn: 218263
* Revert "R600/SI: Add support for global atomic add" (Tom Stellard, 2014-09-22, 4 files, -111/+3)
  This reverts commit r218254.
  The global_atomics.ll test fails with asserts disabled. For some reason,
  the compiler fails to produce the atomic no return variants.
  llvm-svn: 218257
* R600/SI: Add support for global atomic add (Tom Stellard, 2014-09-22, 4 files, -3/+111)
  llvm-svn: 218254
* R600/SI: Remove modifier operands from V_CNDMASK_B32_e64 (Tom Stellard, 2014-09-22, 2 files, -8/+3)
  Modifiers don't work for this instruction.
  llvm-svn: 218253
* R600: Don't set BypassSlowDiv for 64-bit division (Tom Stellard, 2014-09-22, 1 file, -3/+0)
  BypassSlowDiv is used by codegen prepare to insert a run-time check to
  see if the operands to a 64-bit division are really 32-bit values and if
  they are it will do 32-bit division instead.
  This is not useful for R600, which has predicated control flow since both
  the 32-bit and 64-bit paths will be executed in most cases. It also
  increases code size which can lead to more instruction cache misses.
  llvm-svn: 218252
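  The transformation being disabled here is easy to picture. The fragment
  below is only an illustrative C++ sketch of the kind of run-time bypass
  codegen prepare inserts, not the pass's actual output.

      #include <cstdint>

      // "Bypass slow division" idea: if both 64-bit operands really fit in
      // 32 bits, take the cheaper 32-bit divide at run time. On a GPU with
      // predicated control flow both arms tend to execute anyway, which is
      // why R600 turns this off.
      uint64_t div_with_bypass(uint64_t a, uint64_t b) {
        if (((a | b) >> 32) == 0)                        // both fit in 32 bits
          return uint64_t(uint32_t(a) / uint32_t(b));    // fast 32-bit divide
        return a / b;                                    // full 64-bit divide
      }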
* R600/SI: Use ISD::MUL instead of ISD::UMULO when lowering division (Tom Stellard, 2014-09-22, 1 file, -3/+3)
  ISD::MUL and ISD::UMULO are the same except that UMULO sets an overflow
  bit. Since we aren't using the overflow bit, we should use ISD::MUL.
  llvm-svn: 218251
* R600/SI: Add enums for some hard-coded values (Tom Stellard, 2014-09-22, 4 files, -26/+87)
  llvm-svn: 218250
* [x32] Fix segmented stacks support (Pavel Chupin, 2014-09-22, 7 files, -41/+65)
  Summary:
  Update segmented-stacks*.ll tests with x32 target case and make
  corresponding changes to make them pass.
  Test Plan: tests updated with x32 target
  Reviewers: nadav, rafael, dschuff
  Subscribers: llvm-commits, zinovy.nis
  Differential Revision: http://reviews.llvm.org/D5245
  llvm-svn: 218247
* [dwarfdump] Dump full filenames as DW_AT_(decl|call)_file attribute values (Frederic Riss, 2014-09-22, 2 files, -5/+15)
  Reviewers: dblaikie, samsonov
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5192
  llvm-svn: 218246
* Allow DWARFDebugInfoEntryMinimal::getSubroutineName to resolve cross-unit references. (Frederic Riss, 2014-09-22, 1 file, -4/+16)
  Summary:
  getSubroutineName is currently only used by llvm-symbolizer, thus add a
  binary test containing a cross-cu inlining example.
  Reviewers: samsonov, dblaikie
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5394
  llvm-svn: 218245
* Fix assert when decoding PSHUFB mask (Robert Lougher, 2014-09-22, 1 file, -6/+4)
  The PSHUFB mask decode routine used to assert if the mask index was out
  of range (<0 or greater than the size of the vector). The problem is, we
  can legitimately have a PSHUFB with a large index using intrinsics. The
  instruction only uses the least significant 4 bits. This change removes
  the assert and masks the index to match the instruction behaviour.
  llvm-svn: 218242
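  For reference, a small C++ model of the per-byte PSHUFB behaviour the
  decoder now follows; this is a sketch of the instruction's documented
  semantics for illustration, not the LLVM decode routine.

      #include <array>
      #include <cstdint>

      // Per-byte PSHUFB semantics (128-bit form): if bit 7 of the mask byte is
      // set the result byte is zeroed; otherwise only the low 4 bits select the
      // source byte, so an out-of-range index simply wraps instead of being
      // treated as invalid.
      std::array<uint8_t, 16> pshufb(const std::array<uint8_t, 16> &src,
                                     const std::array<uint8_t, 16> &mask) {
        std::array<uint8_t, 16> dst{};
        for (int i = 0; i < 16; ++i)
          dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
        return dst;
      }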
* Downgrade DWARF2 section limit error to a warning (Oliver Stannard, 2014-09-22, 2 files, -5/+7)
  We currently emit an error when trying to assemble a file with more than
  one section using DWARF2 debug info. This should be a warning instead, as
  the resulting file will still be usable, but with a degraded debug
  illusion.
  llvm-svn: 218241
* Add two thresholds lvi-overdefined-BB-threshold and lvi-overdefined-threshold (Jiangning Liu, 2014-09-22, 1 file, -2/+32)
  for LVI algorithm.
  For a specific value to be lowered, when the number of basic blocks being
  checked for overdefined lattice value is larger than
  lvi-overdefined-BB-threshold, or the times of encountering overdefined
  value for a single basic block is larger than lvi-overdefined-threshold,
  the LVI algorithm will stop further lowering the lattice value.
  llvm-svn: 218231
* ms-inline-asm: Add a sema callback for looking up label names (Ehsan Akhgari, 2014-09-22, 2 files, -8/+36)
  The implementation of the callback in clang's Sema will return an
  internal name for labels.
  Test Plan: Will be tested in clang.
  Reviewers: rnk
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D4587
  llvm-svn: 218229
* [x86] Back out a bad choice about lowering v4i64 and pave the way for (Chandler Carruth, 2014-09-22, 1 file, -44/+27)
  a more sane approach to AVX2 support.
  Fundamentally, there is no useful way to lower integer vectors in AVX.
  None. We always end up with a VINSERTF128 in the end, so we might as well
  eagerly switch to the floating point domain and do everything there. This
  cleans up lots of weird and unlikely to be correct differences between
  integer and floating point shuffles when we only have AVX1.
  The other nice consequence is that by doing things this way we will make
  it much easier to write the integer lowering routines as we won't need to
  duplicate the logic to check for AVX vs. AVX2 in each one -- if we
  actually try to lower a 256-bit vector as an integer vector, we have AVX2
  and can rely on it. I think this will make the code much simpler and more
  comprehensible.
  Currently, I've disabled *all* support for AVX2 so that we always fall
  back to AVX. This keeps everything working rather than asserting. That
  will go away with the subsequent series of patches that provide a
  baseline AVX2 implementation.
  Please note, I'm going to implement AVX2 *without access to hardware*.
  That means I cannot correctness test this path. I will be relying on
  those with access to AVX2 hardware to do correctness testing and fix bugs
  here, but as a courtesy I'm trying to sketch out the framework for the
  new-style vector shuffle lowering in the context of the AVX2 ISA.
  llvm-svn: 218228
* [x86] Teach the new vector shuffle lowering how to cleverly lower single (Chandler Carruth, 2014-09-21, 1 file, -3/+22)
  input v8f32 shuffles which are not 128-bit lane crossing but have
  different shuffle patterns in the low and high lanes. This removes most
  of the extract/insert traffic that was unnecessary and is particularly
  good at lowering cases where only one of the two lanes is shuffled at
  all.
  I've also added a collection of test cases with undef lanes because this
  lowering is somewhat more sensitive to undef lanes than others.
  llvm-svn: 218226
* Fix typo (Matt Arsenault, 2014-09-21, 1 file, -4/+3)
  llvm-svn: 218223
* Use llvm_unreachable instead of assert(!) (Matt Arsenault, 2014-09-21, 1 file, -2/+2)
  llvm-svn: 218222
* R600/SI: Don't use strings for single characters (Matt Arsenault, 2014-09-21, 1 file, -18/+18)
  llvm-svn: 218221