summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] Add more rules to combine shuffle vector dag nodes.Andrea Di Biagio2014-07-141-0/+44
| | | | | | | | | | | | | | | | This patch teaches the DAGCombiner how to fold a pair of shuffles according to rules: 1. shuffle(shuffle A, B, M0), B, M1) -> shuffle(A, B, M2) 2. shuffle(shuffle A, B, M0), A, M1) -> shuffle(A, B, M3) The new rules would only trigger if the resulting shuffle has legal type and legal mask. Added test 'combine-vec-shuffle-3.ll' to verify that DAGCombiner correctly folds shuffles on x86 when the resulting mask is legal. Also added some negative cases to verify that we avoid introducing illegal shuffles. llvm-svn: 213001
* CodeGen: Add a getSectionKind method to MachineConstantPoolEntryDavid Majnemer2014-07-142-15/+32
| | | | | | This is just a helper routine, no functionality has changed. llvm-svn: 212993
* Unify the lowering of arguments during SjLj prepare.Bill Wendling2014-07-141-28/+10
| | | | | | | The 'select true, %arg, undef' instruction can be used for both aggregate and non-aggregate arguments. llvm-svn: 212967
* fixed typoSanjay Patel2014-07-141-1/+1
| | | | llvm-svn: 212966
* CodeGen: add missing includeSaleem Abdulrasool2014-07-141-0/+1
| | | | | | | Found during windows unwinding work. This header is indirectly included through a chain leading through Support/Win64EH.h. Explicitly include the header. NFC. llvm-svn: 212955
* Support lowering of empty aggregates.Bill Wendling2014-07-141-11/+11
| | | | | | | | | | | | | This crash was pretty common while compiling Rust for iOS (armv7). Reason - SjLj preparation step was lowering aggregate arguments as ExtractValue + InsertValue. ExtractValue has assertion which checks that there is some data in value, which is not true in case of empty (no fields) structures. Rust uses them quite extensively so this patch uses a 'select true, %val, undef' instruction to lower the argument. Patch by Valerii Hiora. llvm-svn: 212922
* [DAGCombiner] Fix a crash caused by a missing check for legal type when ↵Andrea Di Biagio2014-07-131-1/+1
| | | | | | | | | | | | | | | | | | | | trying to fold shuffles. Verify that DAGCombiner does not crash when trying to fold a pair of shuffles according to rule (added at r212539): (shuffle (shuffle A, Undef, M0), Undef, M1) -> (shuffle A, Undef, M2) The DAGCombiner avoids folding shuffles if the resulting shuffle dag node is not legal for the target. That means, the resulting shuffle must have legal type and legal mask. Before, the DAGCombiner only called method 'TargetLowering::isShuffleMaskLegal' to check if it was "safe" to fold according to the above-mentioned rule. However, this caused a crash in the x86 backend since method 'isShuffleMaskLegal' always expects to be called on a legal vector type. llvm-svn: 212915
* Templatify DominanceFrontier.Matt Arsenault2014-07-122-0/+55
| | | | | | Theoretically this should now work for MachineBasicBlocks. llvm-svn: 212885
* Avoid a warning from MSVC on "*/" in this code by inserting a spaceReid Kleckner2014-07-121-1/+1
| | | | llvm-svn: 212862
* [FastISel] Add target-independent patchpoint intrinsic support. WIP.Juergen Ributzka2014-07-111-0/+169
| | | | | | | | | | This implements the target-independent lowering for the patchpoint intrinsic. Targets have to implement the FastLowerCall hook to support this intrinsic. Related to <rdar://problem/17427052> llvm-svn: 212849
* [FastISel] Add basic infrastructure to support a target-independent call ↵Juergen Ributzka2014-07-111-2/+208
| | | | | | | | | | | | | | | lowering hook in FastISel. WIP The infrastructure mimics the call lowering we have already in place for SelectionDAG, but with limitations. For example structure return demotion and non-simple types are not supported (yet). Currently every backend has its own implementation and duplicated code for call lowering. There is also no specified interface that could be called from target-independent code. The target-hook is opt-in and doesn't affect current implementations. llvm-svn: 212848
* [FastISel] Make isInTailCallPosition independent of SelectionDAG.Juergen Ributzka2014-07-112-6/+5
| | | | | | | Break out the arguemnts required from SelectionDAG, so that this function can also be used by FastISel. llvm-svn: 212844
* [FastISel] Breakout intrinsic lowering into a separate function and add a ↵Juergen Ributzka2014-07-111-34/+39
| | | | | | | | | | | target-hook. Create a separate helper function for target-independent intrinsic lowering. Also add an target-hook that allows to directly call into a target-sepcific intrinsic lowering method. Currently the implementation is opt-in and doesn't affect existing target implementations. llvm-svn: 212843
* ARM: Allow __fp16 as a function arg or return type for AArch64Oliver Stannard2014-07-111-1/+1
| | | | | | | ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. llvm-svn: 212812
* Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from ↵David Blaikie2014-07-112-8/+4
| | | | | | | | | | | | instructions within a function, lead to the function itself."" This reverts commit r212776. Nope, still seems to be failing on the sanitizer bots... but hey, not the msan self-host anymore, it's failing in asan now. I'll start looking there next. llvm-svn: 212793
* Reapply "DebugInfo: Ensure that all debug location scope chains from ↵David Blaikie2014-07-102-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | instructions within a function, lead to the function itself." Committed in r212205 and reverted in r212226 due to msan self-hosting failure, I believe I've got that fixed by r212761 to Clang. Original commit message: "Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions." llvm-svn: 212776
* SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizerJan Vesely2014-07-102-58/+65
| | | | | | | | | | | Move the code to a helper function to allow calls from TypeLegalizer. No functionality change intended Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Owen Anderson <resistor@mac.com> llvm-svn: 212772
* Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), ↵Matt Arsenault2014-07-101-0/+13
| | | | | | | | (trunc b) combine."" Don't try to convert the select condition type. llvm-svn: 212750
* [DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles ↵Andrea Di Biagio2014-07-101-14/+51
| | | | | | | | | | | | | | | | | | | | | into a single shuffle if the resulting mask is legal. This patch teaches the DAGCombiner how to fold shuffles according to the following new rules: 1. shuffle(shuffle(x, y), undef) -> x 2. shuffle(shuffle(x, y), undef) -> y 3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef) 4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef) The backend avoids to combine shuffles according to rules 3. and 4. if the resulting shuffle does not have a legal mask. This is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes during vector legalization. Added test case combine-vec-shuffle-2.ll to verify that we correctly triggers the new rules when combining shuffles. llvm-svn: 212748
* [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogousChandler Carruth2014-07-106-9/+113
| | | | | | | | | | | | | | | | | | | | to the zero-extend-vector-inreg node introduced previously for the same purpose: manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit. This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the the shuffle legalization is scalarizing. Funny thing, I've been working on that. Some initial experiments with this and SSE2 scenarios is showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward. llvm-svn: 212714
* Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) ↵NAKAMURA Takumi2014-07-101-14/+0
| | | | | | | | combine." This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations. llvm-svn: 212708
* Make it possible for ints/floats to return different values from ↵Daniel Sanders2014-07-1010-50/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 llvm-svn: 212697
* [AArch64]Fix an assertion failure in DAG Combiner about concating 2 ↵Hao Liu2014-07-101-4/+18
| | | | | | build_vector. llvm-svn: 212677
* [SDAG] Make the new zext-vector-inreg node default to expand so targetsChandler Carruth2014-07-091-1/+4
| | | | | | | | | | | don't need to set it manually. This is based on feedback from Tom who pointed out that if every target needs to handle this we need to reach out to those maintainers. In fact, it doesn't make sense to duplicate everything when anything other than expand seems unlikely at this stage. llvm-svn: 212661
* Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for ↵David Blaikie2014-07-094-4/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first intance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212649
* Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.Matt Arsenault2014-07-091-0/+14
| | | | | | Do this if the truncate is free and the select is legal. llvm-svn: 212640
* [x86] Fix a bug in my new zext-vector-inreg DAG trickery where we wereChandler Carruth2014-07-092-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | not widening the input type to the node sufficiently to let the ext take place in a register. This would in turn result in a mysterious bitcast assertion failure downstream. First change here is to add back the helpful assert I had in an earlier version of the code to catch this immediately. Next change is to add support to the type legalization to detect when we have widened the operand either too little or too much (for whatever reason) and find a size-matched legal vector type to convert it to first. This can also fail so we get a new fallback path, but that seems OK. With this, we no longer crash on vec_cast2.ll when using widening. I've also added the CHECK lines for the zero-extend cases here. We still need to support sign-extend and trunc (or something) to get plausible code for the other two thirds of this test which is one of the regression tests that showed the most scalarization when widening was force-enabled. Slowly closing in on widening being a viable legalization strategy without it resorting to scalarization at every turn. =] llvm-svn: 212614
* Sink two variables only used in an assert into the assert itself. ShouldChandler Carruth2014-07-091-3/+3
| | | | | | fix the release builds with Werror. llvm-svn: 212612
* [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when wideningChandler Carruth2014-07-095-1/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 llvm-svn: 212610
* [SDAG] At the suggestion of Hal, switch to an output parameter thatChandler Carruth2014-07-093-22/+27
| | | | | | | | | | tracks which elements of the build vector are in fact undef. This should make actually inpsecting them (likely in my next patch) reasonably pretty. Also makes the output parameter optional as it is clear now that *most* users are happy with undefs in their splats. llvm-svn: 212581
* [DAG] Teach how to combine a pair of shuffles into a single shuffle if the ↵Andrea Di Biagio2014-07-081-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | resulting mask is legal. This patch teaches how to fold a shuffle according to rule: shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2) We do this only if the resulting mask M2 is legal; this is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes. This patch has the advantage of being target independent, since it works on ISD nodes. Therefore, all targets (not only x86) can take advantage of this rule. The idea behind this patch is that most shuffle pairs can be safely combined before we run the legalizer on vector operations. This allows us to combine/simplify dag nodes earlier in the process and not only immediately before instruction selection stage. That said. This patch is not meant to replace any existing target specific combine rules; backends might still introduce new shuffles during legalization stage. Also, this rule is very simple and avoids to aggressively optimize shuffles. llvm-svn: 212539
* Fix some Twine locals.Benjamin Kramer2014-07-081-9/+7
| | | | | | Two of those are use after frees. Found by clang-tidy, fixed by me. llvm-svn: 212537
* [x86,SDAG] Sink the logic for folding shuffles of splats moreChandler Carruth2014-07-081-5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could *make* N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. llvm-svn: 212517
* [SDAG] Actually check for a non-constant splat and clarify commentsChandler Carruth2014-07-081-4/+8
| | | | | | | | | | | | | | around the handling of UNDEF lanes in boolean vector content analysis. The code before my changes here also failed to check for non-constant splats in a buildvector. I have no idea how to trigger this, I just spotted by inspection when trying to understand the code. It seems extremely unlikely to be worth the trouble to teach the only caller of this code (DAG combining setcc patterns) how to cleverly handle undef lanes, so I've just commented more thoroughly that we're giving up there. llvm-svn: 212515
* [SDAG] Build up a more rich set of APIs for querying build-vector SDAGChandler Carruth2014-07-083-13/+47
| | | | | | | | | | | | | | | | | | | | nodes about whether they are splats. This is factored out and improved from r212324 which got reverted as it was far too aggressive. The new API should help more conservatively handle buildvectors that are a mixture of splatted and undef values. No functionality change at this point. The hope is to slowly re-introduce the undef-tolerant optimization of splats, but each time being forced to make a concious decision about how to handle the undefs in a way that doesn't lead to contradicting assumptions about the collapsed value. Hal has pointed out in discussions that this may not end up being the desired API and instead it may be more convenient to get a mask of the undef elements or something similar. I'm starting simple and will expand the API as I adapt actual callers and see exactly what they need. llvm-svn: 212514
* [x86] Revert r212324 which was too aggressive w.r.t. allowing undefChandler Carruth2014-07-073-44/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lanes in vector splats. The core problem here is that undef lanes can't *unilaterally* be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is *dramatically* improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212475
* Make helper functions static.Benjamin Kramer2014-07-071-1/+1
| | | | llvm-svn: 212460
* CodeGen: it turns out that NAND is not the same thing as BIC. At all.Tim Northover2014-07-071-1/+1
| | | | | | | | | | | We've been performing the wrong operation on ARM for "atomicrmw nand" for years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b". This bled over into the generic expansion pass. So I assume no-one has ever actually tried to do an atomic nand in the real world. Oh well. llvm-svn: 212443
* [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work forChandler Carruth2014-07-043-31/+44
| | | | | | | | | | | | | | | | | | | | | | | | any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is *dramatically* improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324
* Move function dependent resetting of a subtarget variable out of theEric Christopher2014-07-041-0/+1
| | | | | | | | | | subtarget. This involved having the movt predicate take the current function - since we care about size in instruction selection for whether or not to use movw/movt take the function so we can check the attributes. This required adding the current MachineFunction to FastISel and propagating through. llvm-svn: 212309
* Temporarily revert "Don't try to construct debug LexicalScopes hierarchy for ↵Eric Christopher2014-07-034-26/+4
| | | | | | | | functions that do not have top level debug information." as it appears to be breaking some LTO constructs. This reverts commit r212203. llvm-svn: 212298
* bug fix for PR20020: anti-dependency-breaker causes miscompilationSanjay Patel2014-07-031-0/+31
| | | | | | | | | | | | | This patch sets the 'KeepReg' bit for any tied and live registers during the PrescanInstruction() phase of the dependency breaking algorithm. It then checks those 'KeepReg' bits during the ScanInstruction() phase to avoid changing any tied registers. For more details, please see comments in: http://llvm.org/bugs/show_bug.cgi?id=20020 I added two FIXME comments for code that I think can be removed by using register iterators that include self. I don't want to include those code changes with this patch, however, to keep things as small as possible. The test case is larger than I'd like, but I don't know how to reduce it further and still produce the failing asm. Differential Revision: http://reviews.llvm.org/D4351 llvm-svn: 212275
* Fix ppcf128 component access on little-endian systemsUlrich Weigand2014-07-034-11/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274
* [x86] Fix the completely broken vector widening legalization of bswap.Chandler Carruth2014-07-031-1/+1
| | | | | | | | | | | | This operation was classified as a binary operation in the widening logic for some reason (clearly, untested). It is in fact a unary operation. Add a RUN line to a test to exercise this for x86. Note that again the vector widening strategy doesn't regress anything and in one case removes a totally unecessary instruction that we couldn't avoid when promoting the element type. llvm-svn: 212257
* [codegen,aarch64] Add a target hook to the code generator to controlChandler Carruth2014-07-031-40/+48
| | | | | | | | | | | | | | | | | | | | | vector type legalization strategies in a more fine grained manner, and change the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64. This fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32. This also provides a foundation for other targets to have more granular control over how vector types are legalized. Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =] http://reviews.llvm.org/D4322 llvm-svn: 212242
* Revert "DebugInfo: Ensure that all debug location scope chains from ↵David Blaikie2014-07-022-8/+4
| | | | | | | | | | | | instructions within a function, lead to the function itself." This reverts commit r212205. Reverting this again, still seeing crashes when building compiler-rt... Sorry for the continued noise, not sure why I'm failing to reproduce this locally. llvm-svn: 212226
* DebugInfo: Ensure that all debug location scope chains from instructions ↵David Blaikie2014-07-022-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212205
* [RegAllocGreedy] Provide a subtarget hook to disable the local reassignmentQuentin Colombet2014-07-021-4/+14
| | | | | | | | | | | | heuristic. By default, no functionality change. This is a follow-up of r212099. This hook provides a finer grain to control the optimization. <rdar://problem/17444599> llvm-svn: 212204
* Don't try to construct debug LexicalScopes hierarchy for functions that do ↵David Blaikie2014-07-024-4/+26
| | | | | | | | | | | | | | | | | | | | | | | | | not have top level debug information. If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212203
* Revert "Revert "MachineScheduler: better book-keeping for asserts.""Chad Rosier2014-07-021-7/+11
| | | | | | | | | | | | | This reverts commit r212109, which reverted r212088. However, disable the assert as it's not necessary for correctness. There are several corner cases that the assert needed to handle better for in-order scheduling, but none of them are incorrect scheduler behavior. The assert is mainly there to collect good unit tests like this and ensure that the target-independent scheduler is working as expected with the various machine models. llvm-svn: 212187
OpenPOWER on IntegriCloud