path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* R600/SI: Make argument loads invariantMatt Arsenault2014-07-281-9/+17
| | | | llvm-svn: 214101
* [SKX] Enabling mask logic instructions: encoding, loweringRobert Khasanov2014-07-281-12/+19
| | | | | | | | Instructions: KAND{BWDQ}, KANDN{BWDQ}, KOR{BWDQ}, KXOR{BWDQ}, KXNOR{BWDQ} Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 214081
* [PowerPC] Support ELFv1/ELFv2 ABI selection via featuresUlrich Weigand2014-07-283-4/+28
| | | | | | | | | | | | | | | | | | | | While LLVM now supports both ELFv1 and ELFv2 ABIs, their use is currently hard-coded via the target triple: powerpc64-linux is always ELFv1, while powerpc64le-linux is always ELFv2. These are of course the most common scenarios, but in principle it is possible to support the ELFv2 ABI on big-endian or the ELFv1 ABI on little-endian systems (and GCC does support that), and there are some special use cases for that (e.g. certain Linux kernel versions could only be built using ELFv1 on LE). This patch implements the LLVM side of supporting this. As precedent on other platforms suggests, ABI options are passed to the back-end as features. Thus, this patch implements two features "elfv1" and "elfv2" that select the desired ABI if present. (If not, LLVM uses the same default rules as now.) llvm-svn: 214072
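The selection rule described in this message can be sketched roughly as follows. This is an illustrative model only, not the code from the patch: the names `PPCABI` and `selectABI` are hypothetical, features are modeled as bare strings, and letting the first matching feature win is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names; this only models the rule stated in the commit message.
enum class PPCABI { ELFv1, ELFv2 };

// An explicit "elfv1"/"elfv2" feature selects the ABI. Without one, fall back
// to the triple-based default: powerpc64le -> ELFv2, otherwise ELFv1.
inline PPCABI selectABI(const std::string &Triple,
                        const std::vector<std::string> &Features) {
  for (const std::string &F : Features) {
    if (F == "elfv1")
      return PPCABI::ELFv1;
    if (F == "elfv2")
      return PPCABI::ELFv2;
  }
  // rfind(Prefix, 0) == 0 is a "starts with" check.
  return Triple.rfind("powerpc64le", 0) == 0 ? PPCABI::ELFv2 : PPCABI::ELFv1;
}
```

With this model, `selectABI("powerpc64le-linux", {"elfv1"})` yields ELFv1, which is the little-endian ELFv1 combination the message mentions.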
* ARM: correct handling of features in arch_extensionSaleem Abdulrasool2014-07-271-11/+12
| | | | | | | | | | | | | | | | | | | | | | The subtarget information is the ultimate source of truth for the feature set that is enabled at this point. We would previously not propagate the feature information to the subtarget. While this worked for the most part (features would be enabled/disabled as requested), if another operation that changed the feature bits was encountered (such as a mode switch via a .arm or .thumb directive), we would end up resetting the behaviour of the architectural extensions. Handling this properly requires a slightly more complicated handling. We need to check if the feature is now being toggled. If so, only then do we toggle the features. In return, we no longer have to calculate the feature bits ourselves. The test changes are mostly to the diagnosis, which is now more uniform (a nice side effect!). Add an additional test to ensure that we handle this case properly. Thanks to Nico Weber for alerting me to this issue! llvm-svn: 214057
* ARM: convert loop to range basedSaleem Abdulrasool2014-07-271-14/+14
| | | | | | | Convert a loop to use range based iteration. Rename structure members to help naming, and make structure definition anonymous. NFC. llvm-svn: 214056
* Add alignment value to allowsUnalignedMemoryAccessMatt Arsenault2014-07-2721-63/+91
| | | | | | | | | | Rename to allowsMisalignedMemoryAccess. On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment, and don't need to be split into multiple accesses. Vector loads with an alignment of the element type are not uncommon in OpenCL code. llvm-svn: 214055
* AArch64: fix conversion of 'J' inline asm constraints.Tim Northover2014-07-271-1/+3
| | | | | | | | | | | 'J' represents a negative number suitable for an add/sub alias instruction, but while preparing it to become an int64_t we were mangling the sign extension. So "i32 -1" became 0xffffffffLL, for example. Should fix one half of PR20456. llvm-svn: 214052
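The symptom described here (an i32 -1 turning into 0xffffffffLL) is what zero-extension produces where sign-extension was intended. A minimal sketch of the difference, using hypothetical helper names that do not come from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Zero-extension: the 32-bit pattern is widened without copying the sign bit,
// so -1 turns into 0x00000000ffffffff.
constexpr int64_t widenZeroExtended(int32_t V) {
  return static_cast<int64_t>(static_cast<uint32_t>(V));
}

// Sign-extension: the numeric value is preserved, so -1 stays -1.
constexpr int64_t widenSignExtended(int32_t V) {
  return static_cast<int64_t>(V);
}

static_assert(widenZeroExtended(-1) == 0xffffffffLL, "zero-extended pattern");
static_assert(widenSignExtended(-1) == -1, "sign-extended value");
```

Only the sign-extended form is still a negative number, which is what a 'J' constraint is meant to carry.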
* [x86] Sink a variable only used by asserts into the asserts. Should fixChandler Carruth2014-07-271-3/+3
| | | | | | some -Werror bots, sorry for the noise. llvm-svn: 214043
* [x86] Add a much more powerful framework for combining x86 shuffleChandler Carruth2014-07-271-0/+270
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | instructions in the legalized DAG, and leverage it to combine long sequences of instructions to PSHUFB. Eventually, the other x86-instruction-specific shuffle combines will probably all be driven out of this routine. But the real motivation is to detect after we have fully legalized and optimized a shuffle to the minimal number of x86 instructions whether it is profitable to replace the chain with a fully generic PSHUFB instruction even though doing so requires either a load from a constant pool or tying up a register with the mask. While the Intel manuals claim it should be used when it replaces 5 or more instructions (!!!!) my experience is that it is actually very fast on modern chips, and so I've gone with a much more aggressive model of replacing any sequence of 3 or more instructions. I've also taught it to do some basic canonicalization to special-purpose instructions which have smaller encodings than their generic counterparts. There are still quite a few FIXMEs here, and I've not yet implemented support for lowering blends with PSHUFB (where its power really shines due to being able to zero out lanes), but this starts implementing real PSHUFB support even when using the new, fancy shuffle lowering. =] llvm-svn: 214042
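For readers unfamiliar with the instruction, the following scalar model shows what a 128-bit PSHUFB computes, including the lane-zeroing behavior mentioned above. It is an illustrative sketch with hypothetical names (`Bytes16`, `pshufb`), not code from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;

// Scalar model of the 128-bit PSHUFB: for each result byte, a mask byte with
// its top bit set produces zero; otherwise the low four bits of the mask byte
// select one of the 16 source bytes.
inline Bytes16 pshufb(const Bytes16 &Src, const Bytes16 &Mask) {
  Bytes16 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I)
    Result[I] = (Mask[I] & 0x80) ? uint8_t{0} : Src[Mask[I] & 0x0F];
  return Result;
}
```

Because one mask can express any byte permutation plus zeroing, a single PSHUFB can stand in for a chain of narrower shuffles, at the cost of materializing the mask.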
* R600: Move intrinsic lowering to separate functionsMatt Arsenault2014-07-262-109/+126
| | | | llvm-svn: 214023
* [SDAG] Add an assert that we don't mess up the number of values whenChandler Carruth2014-07-261-0/+3
| | | | | | | | replacing nodes in the legalizer. This caught a number of bugs for me during development. llvm-svn: 214022
* [SDAG] Simplify the code for handling single-value nodes and addChandler Carruth2014-07-261-8/+12
| | | | | | a missing transfer of debug information (without which tests fail). llvm-svn: 214021
* [SDAG] When performing post-legalize DAG combining, run the legalizerChandler Carruth2014-07-262-61/+107
| | | | | | | | | | | | | | | | | | | | | | over each node in the worklist prior to combining. This allows the combiner to produce new nodes which need to go back through legalization. This is particularly useful when generating operands to target specific nodes in a post-legalize DAG combine where the operands are significantly easier to express as pre-legalized operations. My immediate use case will be PSHUFB formation where we need to build a constant shuffle mask with a build_vector node. This also refactors the relevant functionality in the legalizer to support this, and updates relevant tests. I've spoken to the R600 folks and these changes look like improvements to them. The avx512 change needs to be investigated, I suspect there is a disagreement between the legalizer and the DAG combiner there, but it seems a minor issue so leaving it to be re-evaluated after this patch. Differential Revision: http://reviews.llvm.org/D4564 llvm-svn: 214020
* Fix broken assert.Nick Lewycky2014-07-261-1/+1
| | | | llvm-svn: 214019
* X86ShuffleDecode.cpp: Silence a warning. [-Wunused-variable]NAKAMURA Takumi2014-07-261-2/+2
| | | | llvm-svn: 214016
* [x86] Fix PR20355 (for real). There are many layers to this bug.Chandler Carruth2014-07-261-20/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tale starts with r212808 which attempted to fix inversion of the low and high bits when lowering MUL_LOHI. Sadly, that commit did not include any positive test cases, and just removed some operations from a test case where the actual logic being changed isn't fully visible from the test. What this commit did was two things. First, it reversed the low and high results in the formation of the MERGE_VALUES node for the multiple results. This is entirely correct. Second, it changed the shuffles for extracting the low and high components from the i64 results of the multiplies to extract them assuming a big-endian-style encoding of the multiply results. This second change is wrong. There is no big-endian encoding in x86, the results of the multiplies are normal v2i64s: when cast to v4i32, the low i32s are at offsets 0 and 2, and the high i32s are at offsets 1 and 3. However, the first change wasn't enough to actually fix the bug, which is (I assume) why the second change was also made. There was another bug in the MERGE_VALUES formation: we weren't using a VTList, and so were getting a single result node! When grabbing the *second* result from the node, we got... well... it could be anything. I think this *appeared* to invert things, but had to be causing other problems as well. Fortunately, I fixed the MERGE_VALUES issue in r213931, so we should have been fine, right? NOOOPE! Because the core bug was never addressed, the test in vector-idiv failed when I fixed the MERGE_VALUES node. Because there are essentially no docs for this node, I had to guess at how to fix it and tried swapping the operands, restoring the order of the original code before r212808. While this "fixed" the test case (in that we produced the right instructions) we were still extracting the wrong elements of the i64s, and thus PR20355 was still broken.
This commit essentially reverts the big-endian-style extraction part of r212808 and goes back to the original masks which were correct. Now that the MERGE_VALUES node formation is also correct, everything works. I've also included a more detailed test from PR20355 to make sure this stays fixed. llvm-svn: 214011
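The lane layout this message relies on (low i32s at offsets 0 and 2, high i32s at offsets 1 and 3) can be modeled arithmetically. The helper name `asV4I32` is hypothetical, and the sketch describes only the little-endian lane order, not the backend code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Model a v2i64 reinterpreted as v4i32 on a little-endian target such as x86:
// each 64-bit element contributes its low 32 bits first, then its high 32 bits.
inline std::array<uint32_t, 4> asV4I32(uint64_t E0, uint64_t E1) {
  return {{static_cast<uint32_t>(E0), static_cast<uint32_t>(E0 >> 32),
           static_cast<uint32_t>(E1), static_cast<uint32_t>(E1 >> 32)}};
}
```

Extracting lanes 0 and 2 therefore gives the low halves of the two products, and lanes 1 and 3 give the high halves, which matches the masks the commit restores.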
* [x86] Revert r214007: Fix PR20355 ...Chandler Carruth2014-07-261-5/+3
| | | | | | | | | | | The clever way to implement signed multiplication with unsigned *is already implemented* and tested and working correctly. The bug is somewhere else. Re-investigating. This will teach me to not scroll far enough to read the code that did what I thought needed to be done. llvm-svn: 214009
* [x86] Fix PR20355 (and dups) by not using unsigned multiplication whenChandler Carruth2014-07-261-3/+5
| | | | | | | | | | | | | | signed multiplication is requested. While there is not a difference in the *low* half of the result, the *high* half (used specifically to implement the signed division by these constants) certainly is used. The test case I've nuked was actively asserting wrong code. There is a delightful solution to doing signed multiplication even when we don't have it that Richard Smith has crafted, but I'll add the machinery back and implement that in a follow-up patch. This at least restores correctness. llvm-svn: 214007
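The arithmetic claim in this message (same low half, different high half) can be checked with a small sketch. The helper names are hypothetical, and the sketch shows only the arithmetic fact, not the lowering code.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t lowHalf(uint64_t P) { return static_cast<uint32_t>(P); }
constexpr uint32_t highHalf(uint64_t P) {
  return static_cast<uint32_t>(P >> 32);
}

// Full 64-bit product with both 32-bit operands interpreted as signed.
constexpr uint64_t mulSigned(int32_t A, int32_t B) {
  return static_cast<uint64_t>(static_cast<int64_t>(A) * B);
}

// Full 64-bit product with the same bit patterns interpreted as unsigned.
constexpr uint64_t mulUnsigned(int32_t A, int32_t B) {
  return static_cast<uint64_t>(static_cast<uint32_t>(A)) *
         static_cast<uint32_t>(B);
}

// -1 * 3: the low halves agree, the high halves do not.
static_assert(lowHalf(mulSigned(-1, 3)) == lowHalf(mulUnsigned(-1, 3)), "");
static_assert(highHalf(mulSigned(-1, 3)) == 0xFFFFFFFFu, "");
static_assert(highHalf(mulUnsigned(-1, 3)) == 2u, "");
```

Division by a constant consumes the high half of such a product, which is why mixing up the two interpretations yields wrong quotients.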
* Update X86/Utils/LLVMBuild.txt corresponding to r213986. "Core" has been ↵NAKAMURA Takumi2014-07-261-1/+1
| | | | | | introduced. llvm-svn: 213995
* [x86] Fix unused variable warning in no-asserts build.Chandler Carruth2014-07-261-0/+1
| | | | llvm-svn: 213989
* [x86] Teach the X86 backend to print shuffle comments for PSHUFBChandler Carruth2014-07-253-0/+116
| | | | | | | | | | | | | | | | | | | | instructions which happen to have a constant mask. Currently, this only handles a very narrow set of cases, but those happen to be the cases that I care about for testing shuffles sanely. This is a bit trickier than other shuffle instructions because we're decoding constants out of the constant pool. The current MC layer makes it completely impossible to inspect a constant pool entry, so we have to do it at the MI level and attach the comment to the streamer on its way out. So no joy for disassembling, but it does make test cases and asm dumps *much* nicer. Sorry for no test cases, but it didn't really seem that valuable to go trolling through existing old test cases and updating them. I'll have lots of testing of this in the upcoming patch for SSSE3 emission in the new vector shuffle lowering code paths. llvm-svn: 213986
* R600/SI: Allow partial unrolling and increase thresholds.Matt Arsenault2014-07-251-1/+7
| | | | llvm-svn: 213985
* Move R600 subtarget dependent variables onto the subtarget.Eric Christopher2014-07-256-83/+83
| | | | | | No functional change. llvm-svn: 213982
* coverage: remove empty mapping regionsAlex Lorenz2014-07-252-8/+0
| | | | | | | | This patch removes the empty coverage mapping regions. Those regions were produced by clang's old mapping region generation algorithm, but the new algorithm doesn't generate them. llvm-svn: 213981
* Canonicalization for @llvm.assumeHal Finkel2014-07-251-0/+17
| | | | | | | | | Adds simple logical canonicalization of assumption intrinsics to instcombine, currently: - invariant(a && b) -> invariant(a); invariant(b) - invariant(!(a || b)) -> invariant(!a); invariant(!b) llvm-svn: 213977
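The two rewrites listed above can be illustrated on a toy expression tree. This is a sketch under assumptions: `Expr`, `splitAssume`, and the other names are hypothetical, nothing here uses the LLVM API, and applying the rules recursively is a choice made for the example rather than something the message states.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy boolean expression tree (hypothetical, unrelated to LLVM's IR classes).
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;
struct Expr {
  enum Kind { Var, Not, And, Or } K;
  std::string Name; // used only when K == Var
  ExprPtr LHS, RHS; // Not uses LHS only
};

inline ExprPtr make(Expr::Kind K, std::string N, ExprPtr L, ExprPtr R) {
  return std::make_shared<Expr>(Expr{K, std::move(N), std::move(L), std::move(R)});
}
inline ExprPtr var(std::string N) { return make(Expr::Var, std::move(N), nullptr, nullptr); }
inline ExprPtr makeNot(ExprPtr A) { return make(Expr::Not, "", std::move(A), nullptr); }
inline ExprPtr makeAnd(ExprPtr A, ExprPtr B) { return make(Expr::And, "", std::move(A), std::move(B)); }
inline ExprPtr makeOr(ExprPtr A, ExprPtr B) { return make(Expr::Or, "", std::move(A), std::move(B)); }

inline std::string toString(const ExprPtr &E) {
  switch (E->K) {
  case Expr::Var: return E->Name;
  case Expr::Not: return "!" + toString(E->LHS);
  case Expr::And: return "(" + toString(E->LHS) + " && " + toString(E->RHS) + ")";
  case Expr::Or:  return "(" + toString(E->LHS) + " || " + toString(E->RHS) + ")";
  }
  return "";
}

// Split one assumed condition into simpler assumed conditions:
//   assume(a && b)    -> assume(a); assume(b)
//   assume(!(a || b)) -> assume(!a); assume(!b)
inline void splitAssume(const ExprPtr &C, std::vector<std::string> &Out) {
  if (C->K == Expr::And) {
    splitAssume(C->LHS, Out);
    splitAssume(C->RHS, Out);
  } else if (C->K == Expr::Not && C->LHS->K == Expr::Or) {
    splitAssume(makeNot(C->LHS->LHS), Out);
    splitAssume(makeNot(C->LHS->RHS), Out);
  } else {
    Out.push_back(toString(C));
  }
}
```

Splitting a compound condition this way gives later passes separate, simpler facts to match against.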
* Wrap to 80 columns, no behavior change.Nico Weber2014-07-251-2/+4
| | | | llvm-svn: 213975
* Add @llvm.assume, lowering, and some basic propertiesHal Finkel2014-07-256-7/+63
| | | | | | | | | | | | | | | | | This is the first commit in a series that adds an @llvm.assume intrinsic which can be used to provide the optimizer with a condition it may assume to be true (when the control flow would hit the intrinsic call). Some basic properties are added here: - llvm.invariant(true) is dead. - llvm.invariant(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), and so is llvm.invariant(undef). The intrinsic is tagged as writing arbitrarily, in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query so that we don't unnecessarily block code motion. llvm-svn: 213973
* [stack protector] Fix a potential security bug in stack protector where theAkira Hatanaka2014-07-2519-7/+237
| | | | | | | | | | | | | | address of the stack guard was being spilled to the stack. Previously the address of the stack guard would get spilled to the stack if it was impossible to keep it in a register. This patch introduces a new target independent node and pseudo instruction which gets expanded post-RA to a sequence of instructions that load the stack guard value. Register allocator can now just remat the value when it can't keep it in a register. <rdar://problem/12475629> llvm-svn: 213967
* Remove dead code.Rafael Espindola2014-07-251-28/+0
| | | | llvm-svn: 213963
* [PowerPC] Support TLS on PPC32/ELFHal Finkel2014-07-257-62/+211
| | | | | | Patch by Justin Hibbits! llvm-svn: 213960
* [FastISel][AArch64] Add support for frameaddress intrinsic.Juergen Ributzka2014-07-251-2/+28
| | | | | | | | | | | | This commit implements the frameaddress intrinsic for the AArch64 architecture in FastISel. There were two test cases that pretty much tested the same, so I combined them to a single test case. Fixes <rdar://problem/17811834> llvm-svn: 213959
* Move -verify-use-list-order into llvm-uselistorderDuncan P. N. Exon Smith2014-07-253-372/+0
| | | | | | | | | | | | | | | | | Ugh. Turns out not even transformation passes link in how to read IR. I sincerely believe the buildbots will finally agree with my system after this though. (I don't really understand why all of this has been working on my system, but not on all the buildbots.) Create a new tool called llvm-uselistorder to use for verifying use-list order. For now, just dump everything from the (now defunct) -verify-use-list-order pass into the tool. This might be a better way to test use-list order anyway. Part of PR5680. llvm-svn: 213957
* Reapply "DebugInfo: Don't put fission type units in comdat sections."David Blaikie2014-07-254-19/+26
| | | | | | | | | | | | | | | | This recommits r208930, r208933, and r208975 (by reverting r209338) and reverts r209529 (the FIXME to readd this functionality once the tools were fixed) now that DWP has been fixed to cope with a single section for all fission type units. Original commit message: "Since type units in the dwo file are handled by a debug aware tool, they don't need to leverage the ELF comdat grouping to implement deduplication. Avoid creating all the .group sections for these as a space optimization." llvm-svn: 213956
* Fix MSVC2012 build error in UseListOrder.cppHans Wennborg2014-07-251-3/+4
| | | | | | | | | I think the compiler got confused by the nested DEBUG macros. It was failing with: UseListOrder.cpp(80) : error C2059: syntax error : '}' llvm-svn: 213954
* Bitcode: Don't optimize constants when preserving use-list orderDuncan P. N. Exon Smith2014-07-251-0/+6
| | | | | | | | | | | | | | | | | | | | `ValueEnumerator::OptimizeConstants()` creates forward references within the constant pools, which makes predicting constants' use-list order difficult. For now, just disable the optimization. This can be re-enabled in the future in one of two ways: - Enable a limited version of this optimization that doesn't create forward references. One idea is to categorize constants by their "height" and make that the top-level sort. - Enable it entirely. This requires predicting how many times each constant will be recreated as its operands' and operands' operands' (etc.) forward references get resolved. This is part of PR5680. llvm-svn: 213953
* Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for ↵David Blaikie2014-07-254-4/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first instance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. This was then reverted again in r213581 as it caused PR20367. The root cause of this was the early exit in LiveDebugVariables meant that spurious DBG_VALUE intrinsics that referenced dead variables were not removed, causing an assertion/crash later on. The fix is to have LiveDebugVariables strip all DBG_VALUE intrinsics in functions without debug info as they're not needed anyway. Test case added to cover this situation (that occurs when a debug-having function is inlined into a nodebug function) in test/DebugInfo/X86/nodebug_with_debug_loc.ll Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc.
While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 213952
* Convert noalias parameter attributes into noalias metadata during inliningHal Finkel2014-07-251-0/+174
| | | | | | | | | | | | | | | | | | | | | | | This functionality is currently turned off by default. Part of the motivation for introducing scoped-noalias metadata is to enable the preservation of noalias parameter attribute information after inlining. Sometimes this can be inferred from the code in the caller after inlining, but often we simply lose valuable information. The overall process is fairly simple: 1. Create a new unique scope domain. 2. For each (used) noalias parameter, create a new alias scope. 3. For each pointer, collect the underlying objects. Add a noalias scope for each noalias parameter from which we're not derived (and has not been captured prior to that point). 4. Add an alias.scope for each noalias parameter from which we might be derived (or has been captured before that point). Note that the capture checks apply only if one of the underlying objects is not an identified function-local object. llvm-svn: 213949
* Simplify and improve scoped-noalias metadata semanticsHal Finkel2014-07-252-52/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the process of fixing the noalias parameter -> metadata conversion process that will take place during inlining (which will be committed soon, but not turned on by default), I have come to realize that the semantics provided by yesterday's commit are not really what we want. Here's why: void foo(noalias a, noalias b, noalias c, bool x) { *q = x ? a : b; *c = *q; } Generically, we know that *c does not alias with *a and with *b (so there is an 'and' in what we know we're not), and we know that *q might be derived from *a or from *b (so there is an 'or' in what we know that we are). So we do not want the semantics currently, where any noalias scope matching any alias.scope causes a NoAlias return. What we want to know is that the noalias scopes form a superset of the alias.scope list (meaning that all the things we know we're not is a superset of all of the things the other instruction might be). Making that change, however, introduces a composability problem. If we inline once, adding the noalias metadata, and then inline again adding more, we append new scopes onto the noalias and alias.scope lists each time. But, this means that we could change what was a NoAlias result previously into a MayAlias result because we appended an additional scope onto one of the alias.scope lists. So, instead of giving scopes the ability to have parents (which I had borrowed from the TBAA implementation, but seems increasingly unlikely to be useful in practice), I've given them domains. The subset/superset condition now applies within each domain independently, and we only need it to hold in one domain. Each time we inline, we add the new scopes in a new scope domain, and everything now composes nicely. In addition, this simplifies the implementation. llvm-svn: 213948
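One way to read the per-domain superset rule is sketched below. This is an interpretation of the message, not the implementation from the commit: scopes are modeled as (domain, name) string pairs, and `provesNoAlias` is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// A scope is modeled as a (domain, name) pair; both are plain strings here.
using Scope = std::pair<std::string, std::string>;
using ScopeList = std::set<Scope>;

// Returns true if, in at least one domain, every scope on one access's
// alias.scope list also appears on the other access's noalias list.
inline bool provesNoAlias(const ScopeList &AliasScopes,
                          const ScopeList &NoAliasScopes) {
  // Group the alias.scope entries by domain.
  std::map<std::string, std::set<std::string>> ByDomain;
  for (const Scope &S : AliasScopes)
    ByDomain[S.first].insert(S.second);

  for (const auto &Domain : ByDomain) {
    bool AllCovered = true;
    for (const std::string &Name : Domain.second) {
      if (NoAliasScopes.count(Scope(Domain.first, Name)) == 0) {
        AllCovered = false;
        break;
      }
    }
    if (AllCovered)
      return true; // the superset condition holds in this domain
  }
  return false;
}
```

In this model, scopes added in a fresh domain by a later inlining step cannot invalidate a result that already held in an older domain, which is the composition property the message describes.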
* Try to fix a layering violation introduced by r213945Duncan P. N. Exon Smith2014-07-252-303/+318
| | | | | | | | | | | The dragonegg buildbot (and others?) started failing after r213945/r213946 because `llvm-as` wasn't linking in the bitcode reader. I think moving the verify functions to the same file as the verify pass should fix the build. Adding a command-line option for maintaining use-list order in assembly as a drive-by to prevent warnings about unused static functions. llvm-svn: 213947
* Fix -Werror build after r213945Duncan P. N. Exon Smith2014-07-251-0/+1
| | | | llvm-svn: 213946
* IPO: Add use-list-order verifierDuncan P. N. Exon Smith2014-07-256-7/+489
| | | | | | | | | | | | | | | | | | | | Add a -verify-use-list-order pass, which shuffles use-list order, writes to bitcode, reads back, and verifies that the (shuffled) order matches. - The utility functions live in lib/IR/UseListOrder.cpp. - Moved (and renamed) the command-line option to enable writing use-lists, so that this pass can return early if the use-list orders aren't being serialized. It's not clear that this pass is the right direction long-term (perhaps a separate tool instead?), but short-term it's a great way to test the use-list order prototype. I've added an XFAIL-ed testcase that I'm hoping to get working pretty quickly. This is part of PR5680. llvm-svn: 213945
* [ARM] Emit ABI_PCS_R9_use build attribute.Amara Emerson2014-07-251-0/+11
| | | | | | | | Patch by Ben Foster! Differential Revision: http://reviews.llvm.org/D4657 llvm-svn: 213944
* Run sort_includes.py on the AArch64 backend.Benjamin Kramer2014-07-2520-43/+43
| | | | | | No functionality change. llvm-svn: 213938
* [SDAG] Enable the new assert for out-of-range result numbers inChandler Carruth2014-07-252-5/+6
| | | | | | | | | | | | | | | | SDValues, fixing the two bugs left in the regression suite. The key for both of these was the use of a single value type rather than a VTList which caused an unintentionally single-result merge-value node. Fix this by getting the appropriate VTList in place. Doing this exposed that the comments in x86's code about how MUL_LOHI operands are handled are wrong. The bug with the use of out-of-range result numbers was hiding the bug about the order of operands here (as best I can tell). There are more places where the code appears to get this backwards still... llvm-svn: 213931
* [SDAG] Don't insert the VRBase into a mapping from SDValues when the defChandler Carruth2014-07-251-6/+10
| | | | | | | doesn't actually correspond to an SDValue at all. Fixes most of the remaining asserts on out-of-range SDValue result numbers. llvm-svn: 213930
* Store nodes only have 1 result.Matt Arsenault2014-07-251-1/+1
| | | | llvm-svn: 213928
* [SDAG] Start plumbing an assert into SDValues that we don't form oneChandler Carruth2014-07-251-1/+1
| | | | | | | | | | | | | with a result number outside the range of results for the node. I don't know how we managed to not really check this very basic invariant for so long, but the code is *very* broken at this point. I have over 270 test failures with the assert enabled. I'm committing it disabled so that others can join in the cleanup effort and reproduce the issues. I've also included one of the obvious fixes that I already found. More fixes to come. llvm-svn: 213926
* [ARM] In thumb mode, emit directive ".code 16" before file level inlineAkira Hatanaka2014-07-251-0/+3
| | | | | | | | | | | | assembly instructions. This is necessary to ensure ARM assembler switches to Thumb mode before it starts assembling the file level inline assembly instructions at the beginning of a .s file. <rdar://problem/17757232> llvm-svn: 213924
* Fix a warning in CoverageMappingReader.cppEhsan Akhgari2014-07-251-1/+1
| | | | llvm-svn: 213920
* [X86] Clarify some stackmap shadow optimization code as based on reviewLang Hames2014-07-252-7/+19
| | | | | | | | feedback from Eric Christopher. No functional change. llvm-svn: 213917