path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [PowerPC] Fix invalid displacement created by LocalStackAlloc | Ulrich Weigand | 2014-07-11 | 1 | -10/+10
  This commit fixes a bug in PPCRegisterInfo::isFrameOffsetLegal that could result in the LocalStackAlloc pass creating an MI instruction with an out-of-range displacement:
    %vreg17<def> = LD 33184, %vreg31; mem:LD8[%g](align=32) %G8RC:%vreg17 G8RC_and_G8RC_NOX0:%vreg31
  (In final assembler output the top bits are stripped off, resulting in a negative offset loading from below the stack pointer.)
  Common code expects the isFrameOffsetLegal routine to verify whether adding a given offset to the offset already present in the instruction results in a valid displacement. However, on PowerPC the routine did not take the already present instruction offset into account.
  This commit fixes isFrameOffsetLegal to add the instruction offset, and updates a local caller (needsFrameBaseReg) to no longer add the instruction offset itself before calling isFrameOffsetLegal.
  Reviewed by Hal Finkel.
  llvm-svn: 212832
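  The range check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PPCRegisterInfo code: the helper names are hypothetical, and it assumes a signed 16-bit displacement field. The value 33184 comes from the commit message; the split into 32000 and 1184 is made up for the example.

```cpp
#include <cstdint>

// Hypothetical model of the check, not the actual PPCRegisterInfo code.
// InstrOffset is the displacement already encoded in the instruction;
// ExtraOffset is the offset that would be added on top of it.
// Assumption: the displacement field is a signed 16-bit immediate.
constexpr bool isCombinedOffsetLegal(int64_t InstrOffset, int64_t ExtraOffset) {
  const int64_t Total = InstrOffset + ExtraOffset;
  return Total >= INT16_MIN && Total <= INT16_MAX;
}

// Models what is left when only the low 16 bits of a displacement are kept
// and then interpreted as a signed value.
constexpr int64_t keepLow16Signed(int64_t Disp) {
  const int64_t Low = Disp & 0xFFFF;
  return Low >= 0x8000 ? Low - 0x10000 : Low;
}
```

  Each of the two offsets passes the check on its own, but their sum (33184) does not, which is why the instruction offset has to be included. Truncating 33184 to 16 bits gives -32352, matching the "negative offset" symptom in the message.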
* R600/SI: Use i32 vectors for resources and samplers | Marek Olsak | 2014-07-11 | 2 | -5/+5
  This affects new intrinsics only. What surprises me is that v32i8 still works.
  llvm-svn: 212831
* R600/SI: add sample and image intrinsics exposing all instruction fields | Marek Olsak | 2014-07-11 | 2 | -48/+192
  We need the intrinsics with offsets, so why not just add them all. The R128 parameter will also be useful for reducing SGPR usage. GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent", so Mesa will probably translate those to slc, glc, etc.
  When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics.
  llvm-svn: 212830
* R600/SI: fix shadow mapping for 1D and 2D array textures | Marek Olsak | 2014-07-11 | 1 | -1/+1
  It was conflicting with def TEX_SHADOW_ARRAY, which also handles them.
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 212829
* raw_svector_ostream: grow and reserve atomically | Alp Toker | 2014-07-11 | 1 | -15/+17
  Including the scratch buffer size in the initial reservation eliminates the subsequent malloc+move operation and offers a healthier constant growth with less memory wastage.
  When doing this, take care to avoid invalidating the source buffer.
  llvm-svn: 212816
* ARM: Allow __fp16 as a function arg or return type for AArch64 | Oliver Stannard | 2014-07-11 | 3 | -4/+12
  ACLE 2.0 allows __fp16 to be used as a function argument or return type. This commit enables that for AArch64.
  llvm-svn: 212812
* [X86] Fix the inversion of low and high bits for the lowering of MUL_LOHI. | Quentin Colombet | 2014-07-11 | 1 | -9/+41
  Also add a few comments.
  <rdar://problem/17581756>
  llvm-svn: 212808
* Fixup PHIs in LowerSwitch when a Leaf node is not emitted. | Marcello Maggioni | 2014-07-11 | 1 | -10/+31
  This commit fixes bug http://llvm.org/bugs/show_bug.cgi?id=20103.
  Thanks to Qwertyuiop for the report and the proposed fix.
  llvm-svn: 212802
* [X86] AVX512: Improve readability of isCDisp8 | Adam Nemet | 2014-07-11 | 1 | -3/+12
  No functional change. As I was trying to understand this function, I found that variables were reused with confusing names and the broadcast case was a bit too implicit. Hopefully, this is an improvement.
  llvm-svn: 212795
* [X86] AVX512: Simplify logic in isCDisp8 | Adam Nemet | 2014-07-11 | 1 | -6/+6
  It was computing the VL/n case as:
    MemObjSize = VectorByteSize / ElemByteSize / Divider * ElemByteSize
  ElemByteSize not only falls out but VectorByteSize/Divider now actually matches the definition of VL/n.
  Also some formatting fixes.
  llvm-svn: 212794
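  The simplification can be checked with a small sketch. The function names are hypothetical, and the example assumes the sizes divide exactly (power-of-two byte sizes); the concrete numbers are illustrative only.

```cpp
// Old form quoted in the commit message.
constexpr unsigned memObjSizeOld(unsigned VectorByteSize, unsigned ElemByteSize,
                                 unsigned Divider) {
  return VectorByteSize / ElemByteSize / Divider * ElemByteSize;
}

// Simplified form: ElemByteSize cancels out, leaving VL/n.
constexpr unsigned memObjSizeNew(unsigned VectorByteSize, unsigned Divider) {
  return VectorByteSize / Divider;
}

// Illustrative values: a 64-byte vector of 4-byte elements with n = 2.
static_assert(memObjSizeOld(64, 4, 2) == memObjSizeNew(64, 2),
              "both forms agree when the divisions are exact");
```

  With exact divisions the two expressions agree, so dropping ElemByteSize does not change the result.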
* Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."" | David Blaikie | 2014-07-11 | 3 | -34/+4
  This reverts commit r212776.
  Nope, still seems to be failing on the sanitizer bots... but hey, not the msan self-host anymore, it's failing in asan now. I'll start looking there next.
  llvm-svn: 212793
* Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop | Mark Heffernan | 2014-07-10 | 1 | -7/+17
  llvm-svn: 212782
* [RuntimeDyld] Improve error diagnostic in RuntimeDyldChecker. | Lang Hames | 2014-07-10 | 1 | -4/+15
  The compiler often emits assembler-local labels (beginning with 'L') for use in relocation expressions; however, these aren't included in the object files. Teach RuntimeDyldChecker to warn the user if they try to use one of these in an expression, since it will never work.
  llvm-svn: 212777
* Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." | David Blaikie | 2014-07-10 | 3 | -4/+34
  Committed in r212205 and reverted in r212226 due to msan self-hosting failure. I believe I've got that fixed by r212761 to Clang.
  Original commit message:
  "Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info.
  And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens...).
  Original commit message:
  PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.
  This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred).
  This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug.
  I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions."
  llvm-svn: 212776
* R600: Implement float to long/ulong | Jan Vesely | 2014-07-10 | 3 | -2/+18
  Use alg. from LegalizeDAG.cpp
  Move Expand setting to SIISelLowering
  v2: Extend existing tests instead of creating new ones
  v3: use separate LowerFPTOSINT function
  v4: use TargetLowering::expandFP_TO_SINT
      add comment about using FP_TO_SINT for uints
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 212773
* SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer | Jan Vesely | 2014-07-10 | 2 | -58/+65
  Move the code to a helper function to allow calls from TypeLegalizer.
  No functionality change intended.
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  Reviewed-by: Owen Anderson <resistor@mac.com>
  llvm-svn: 212772
* Use the integrated assembler by default on OpenBSD. | Brad Smith | 2014-07-10 | 1 | -1/+2
  llvm-svn: 212771
* [mips] Emit two CFI offset directives per double precision SDC1/LDC1 instead of just one for FR=1 registers | Zoran Jovanovic | 2014-07-10 | 2 | -4/+21
  Differential Revision: http://reviews.llvm.org/D4310
  llvm-svn: 212769
* Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."" | Matt Arsenault | 2014-07-10 | 1 | -0/+13
  Don't try to convert the select condition type.
  llvm-svn: 212750
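  As a scalar illustration of the combine named in the subject, the two forms below compute the same value. This is only a model in plain C++ with hypothetical function names; it does not reflect how DAGCombiner represents nodes, and it says nothing about the condition-type issue mentioned above.

```cpp
#include <cstdint>

// trunc (select c, a, b): select first, then truncate to 8 bits.
constexpr uint8_t truncOfSelect(bool C, uint32_t A, uint32_t B) {
  return static_cast<uint8_t>(C ? A : B);
}

// select c, (trunc a), (trunc b): truncate both operands, then select.
constexpr uint8_t selectOfTrunc(bool C, uint32_t A, uint32_t B) {
  return C ? static_cast<uint8_t>(A) : static_cast<uint8_t>(B);
}
```

  The condition itself is left untouched in both forms, which is consistent with the note about not converting the select condition type.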
* [DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles into a single shuffle if the resulting mask is legal. | Andrea Di Biagio | 2014-07-10 | 1 | -14/+51
  This patch teaches the DAGCombiner how to fold shuffles according to the following new rules:
    1. shuffle(shuffle(x, y), undef) -> x
    2. shuffle(shuffle(x, y), undef) -> y
    3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef)
    4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef)
  The backend avoids combining shuffles according to rules 3 and 4 if the resulting shuffle does not have a legal mask. This is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target-specific DAG nodes during vector legalization.
  Added test case combine-vec-shuffle-2.ll to verify that we correctly trigger the new rules when combining shuffles.
  llvm-svn: 212748
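  One way to picture the four rules is to compose the two shuffle masks and see which input survives. The sketch below is a simplified model with hypothetical names, not the DAGCombiner implementation: masks are plain integer vectors, -1 stands for an undef lane, both masks are assumed to have the same length, and the mask-legality check for rules 3 and 4 is not modelled.

```cpp
#include <cstddef>
#include <vector>

enum class Fold { None, X, Y, ShuffleX, ShuffleY };

struct FoldResult {
  Fold Kind;
  std::vector<int> Mask; // only meaningful for ShuffleX / ShuffleY
};

// Inner is the mask of shuffle(x, y): 0..N-1 pick from x, N..2N-1 from y.
// Outer is the mask of shuffle(inner, undef): 0..N-1 pick from the inner
// result, anything else is treated as undef.
FoldResult foldShufflePair(const std::vector<int> &Inner,
                           const std::vector<int> &Outer) {
  const int N = static_cast<int>(Inner.size());
  std::vector<int> Combined(Outer.size(), -1);
  bool UsesX = false, UsesY = false, Identity = true;
  for (std::size_t I = 0; I < Outer.size(); ++I) {
    const int O = Outer[I];
    if (O < 0 || O >= N)
      continue; // lane reads undef
    const int Src = Inner[O];
    if (Src < 0)
      continue; // inner lane was undef
    const bool FromY = Src >= N;
    if (FromY)
      UsesY = true;
    else
      UsesX = true;
    const int Lane = FromY ? Src - N : Src;
    Combined[I] = Lane;
    if (Lane != static_cast<int>(I))
      Identity = false;
  }
  if (UsesX == UsesY)
    return {Fold::None, {}}; // mixes x and y, or uses neither
  if (Identity)
    return {UsesX ? Fold::X : Fold::Y, {}};             // rules 1 and 2
  return {UsesX ? Fold::ShuffleX : Fold::ShuffleY, Combined}; // rules 3 and 4
}
```

  For example, with an inner mask {0, 5, 2, 7} and an outer mask {0, -1, 2, -1}, every defined lane reads x in place, so the pair folds to x. An outer mask {1, -1, 3, -1} reads only y but out of place, giving shuffle(y, undef) with mask {1, -1, 3, -1}.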
* [X86] Mark pseudo instruction TEST8ri_NOEREX as hasSideEffects=0. | Akira Hatanaka | 2014-07-10 | 2 | -2/+5
  Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable macro-fusion.
  <rdar://problem/15680770>
  llvm-svn: 212747
* Add the CSR company and the Kalimba DSP processor to Triple. | Eric Christopher | 2014-07-10 | 1 | -0/+9
  Patch by Matthew Gardiner with fixes by me.
  llvm-svn: 212745
* Make it possible for the Subtarget to change between function passes in the mips back end. | Eric Christopher | 2014-07-10 | 9 | -56/+57
  This, unfortunately, required a bit of churn in the various predicates to use a pointer rather than a reference.
  llvm-svn: 212744
* InstCombine: Fix a crash in Descale for multiply-by-zero | Duncan P. N. Exon Smith | 2014-07-10 | 1 | -0/+6
  Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets created as an argument to a GEP partway through an iteration, causing -instcombine to optimize the GEP before the multiply.
  rdar://problem/17615671
  llvm-svn: 212742
* IR: Aliases don't belong to an explicit comdat | David Majnemer | 2014-07-10 | 1 | -5/+0
  Aliases inherit their comdat from their aliasee; they don't have an explicit comdat.
  This fixes PR20279.
  llvm-svn: 212732
* Feeding isSafeToSpeculativelyExecute its DataLayout pointer (in Sink) | Hal Finkel | 2014-07-10 | 1 | -1/+5
  This is the one remaining place I see where passing isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for loads) -- I think I got the others in r212720. Most of the other remaining callers of isSafeToSpeculativelyExecute only use it for call sites (or otherwise exclude loads).
  llvm-svn: 212730
* Mips: Silence a -Wcovered-switch-default | David Majnemer | 2014-07-10 | 1 | -2/+2
  Remove a default label which covered no enumerators and replace it with an llvm_unreachable.
  No functionality changed.
  llvm-svn: 212729
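  The general pattern behind this kind of warning fix looks roughly like the sketch below. The enum and function are hypothetical and unrelated to the Mips code, and std::abort() stands in for llvm_unreachable so the example compiles without LLVM headers.

```cpp
#include <cstdlib>

// Hypothetical enum, for illustration only.
enum class Width { Byte, Half, Word };

unsigned sizeInBytes(Width W) {
  // Every enumerator has its own case, so a `default:` label here would
  // cover nothing and trigger -Wcovered-switch-default.
  switch (W) {
  case Width::Byte:
    return 1;
  case Width::Half:
    return 2;
  case Width::Word:
    return 4;
  }
  // Stand-in for llvm_unreachable("...") placed after the switch.
  std::abort();
}
```

  Leaving out the default label also means the compiler can warn if a new enumerator is added later without a matching case.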
* [mips] Added FPXX modeless calling convention. | Zoran Jovanovic | 2014-07-10 | 5 | -1/+18
  Differential Revision: http://reviews.llvm.org/D4293
  llvm-svn: 212726
* [AArch64] Add logical alias instructions to MC AsmParser | Arnaud A. de Grandmaison | 2014-07-10 | 3 | -14/+76
  This patch teaches the AsmParser to accept some logical+immediate instructions and convert them as shown:
    bic  Rd, Rn, #imm -> and  Rd, Rn, #~imm
    bics Rd, Rn, #imm -> ands Rd, Rn, #~imm
    orn  Rd, Rn, #imm -> orr  Rd, Rn, #~imm
    eon  Rd, Rn, #imm -> eor  Rd, Rn, #~imm
  Those instructions are an alternate syntax available to assembly coders, and are needed in order to support code already compiling with some other assemblers. For example, the bic construct is used by the linux kernel.
  llvm-svn: 212722
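  The conversion table above can be modelled with a short sketch. The struct and function names are hypothetical and this is not the AsmParser code; in particular it treats the immediate as a plain 64-bit value and does not check whether the inverted immediate is encodable.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified view of a parsed logical+immediate instruction.
struct LogicalImmInst {
  std::string Mnemonic;
  uint64_t Imm;
};

// Rewrites an alias mnemonic to its base instruction and inverts the
// immediate. Returns false (leaving Inst unchanged) for other mnemonics.
bool convertLogicalAlias(LogicalImmInst &Inst) {
  static const struct {
    const char *Alias;
    const char *Base;
  } Table[] = {
      {"bic", "and"}, {"bics", "ands"}, {"orn", "orr"}, {"eon", "eor"}};
  for (const auto &Entry : Table) {
    if (Inst.Mnemonic == Entry.Alias) {
      Inst.Mnemonic = Entry.Base;
      Inst.Imm = ~Inst.Imm;
      return true;
    }
  }
  return false;
}
```

  For instance, an instruction parsed as bic with immediate 0xFF would come out as and with immediate ~0xFF, while a plain and is left as is.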
* Feeding isSafeToSpeculativelyExecute its DataLayout pointer | Hal Finkel | 2014-07-10 | 5 | -34/+48
  isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the past, this was mainly used to make better decisions regarding divisions known not to trap, and so was not all that important for users concerned with "cheap" instructions. However, now it also helps look through bitcasts for dereferenceable loads, and will also be important if/when we add a dereferenceable pointer attribute.
  This is some initial work to feed a DataLayout pointer through to callers of isSafeToSpeculativelyExecute, generally where one was already available.
  llvm-svn: 212720
* AArch64: correctly fast-isel i8 & i16 multiplies | Tim Northover | 2014-07-10 | 1 | -0/+1
  We were asking for a register for type i8 or i16 which caused an assert.
  rdar://problem/17620015
  llvm-svn: 212718
* [mips] Add support for -modd-spreg/-mno-odd-spreg | Daniel Sanders | 2014-07-10 | 11 | -98/+250
  Summary:
  When -mno-odd-spreg is in effect, 32-bit floating point values are not permitted in odd FPU registers. The option also prohibits 32-bit and 64-bit floating point comparison results from being written to odd registers.
  This option has three purposes:
    * It allows support for certain MIPS implementations such as loongson-3a that do not allow the use of odd registers for single precision arithmetic.
    * When using -mfpxx, -mno-odd-spreg is the default and this allows us to statically check that code is compliant with the O32 FPXX ABI since mtc1/mfc1 instructions to/from odd registers are guaranteed not to appear for any reason. Once this has been established, the user can then re-enable -modd-spreg to regain the use of all 32 single-precision registers.
    * When using -mfp64 and -mno-odd-spreg together, an O32 extension named O32 FP64A is used as the ABI. This is intended to provide almost all functionality of an FR=1 processor but can also be executed on a FR=0 core with the assistance of a hardware compatibility mode which emulates FR=0 behaviour on an FR=1 processor.
  Other changes:
    * Added '.module oddspreg' and '.module nooddspreg', each of which updates the .MIPS.abiflags section appropriately.
    * Moved setFpABI() call inside emitDirectiveModuleFP() so that the caller doesn't have to remember to do it.
    * MipsABIFlags now calculates the flags1 and flags2 members on demand rather than trying to maintain them in the same format they will be emitted in.
  There is one portion of the -mfp64 and -mno-odd-spreg combination that is not implemented yet. Moves to/from odd-numbered double-precision registers must not use mtc1. I will fix this in a follow-up.
  Differential Revision: http://reviews.llvm.org/D4383
  llvm-svn: 212717
* [x32] Add AsmBackend for X32 which uses ELF32 with x86_64 (the author is Pavel Chupin). | Zinovy Nis | 2014-07-10 | 1 | -0/+14
  This is the minimal backend change required to have "hello world" compiled and working on the x32 target (x86_64-linux-gnux32). More patches for x32 will follow.
  Differential Revision: http://reviews.llvm.org/D4181
  llvm-svn: 212716
* [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous to the zero-extend-vector-inreg node introduced previously for the same purpose: manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. | Chandler Carruth | 2014-07-10 | 6 | -9/+113
  I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit. This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the shuffle legalization is scalarizing. Funny thing, I've been working on that.
  Some initial experiments with this and SSE2 scenarios are showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward.
  llvm-svn: 212714
* [SystemZ] Use SystemZCallingConv.td to define callee-saved registers | Richard Sandiford | 2014-07-10 | 5 | -16/+23
  Just a clean-up. No behavioral change intended.
  llvm-svn: 212711
* Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine." | NAKAMURA Takumi | 2014-07-10 | 1 | -14/+0
  This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations.
  llvm-svn: 212708
* [SystemZ] Tweak instruction format classifications | Richard Sandiford | 2014-07-10 | 2 | -53/+43
  There's no real need to have Shift as a separate format type from Binary. The comments for other format types were too specific and in some cases no longer accurate.
  Just a clean-up, no behavioral change intended.
  llvm-svn: 212707
* [x86] Add another combine that is particularly useful for the new vector shuffle lowering: match shuffle patterns equivalent to an unpcklwd or unpckhwd instruction. | Chandler Carruth | 2014-07-10 | 1 | -0/+41
  This allows us to use generic lowering code for v8i16 shuffles and match the unpack pattern late.
  llvm-svn: 212705
* [SystemZ] Add MC support for LEDBRA, LEXBRA and LDXBRA | Richard Sandiford | 2014-07-10 | 1 | -0/+7
  These instructions aren't used for codegen since the original L*DB instructions are suitable for fround.
  llvm-svn: 212703
* [SystemZ] Avoid using i8 constants for immediate fields | Richard Sandiford | 2014-07-10 | 5 | -58/+54
  Immediate fields that have no natural MVT type tended to use i8 if the field was small enough. This was a bit confusing since i8 isn't a legal type for the target. Fields for short immediates in a 32-bit or 64-bit operation use i32 or i64 instead, so it would be better to do the same for all fields.
  No behavioral change intended.
  llvm-svn: 212702
* [SystemZ] Fix FPR dwarf numbering | Richard Sandiford | 2014-07-10 | 1 | -1/+24
  The dwarf FPR numbers are supposed to have the order F0, F2, F4, F6, F1, F3, F5, F7, F8, etc., which matches the pairing of registers for long doubles. E.g. a long double stored in F0 is paired with F2.
  llvm-svn: 212701
* Make it possible for ints/floats to return different values from getBooleanContents() | Daniel Sanders | 2014-07-10 | 11 | -50/+100
  Summary:
  On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1.
  Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ:
    - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node).
    - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too.
  Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison.
  Reviewers: hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D4389
  llvm-svn: 212697
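  Why the (select C, 0, 1) -> (xor C, 1) simplification depends on the boolean representation can be shown with a scalar model. The enum and helper names below are simplified stand-ins chosen for this sketch, not the names used in TargetLowering.

```cpp
#include <cstdint>

// Simplified stand-in for the two boolean representations in the message.
enum class BoolContents { ZeroOrOne, ZeroOrNegativeOne };

constexpr int32_t trueValue(BoolContents BC) {
  return BC == BoolContents::ZeroOrOne ? 1 : -1;
}

// (select C, 0, 1): yields 0 when C is true, 1 when C is false.
constexpr int32_t selectZeroOne(int32_t C) { return C != 0 ? 0 : 1; }

// (xor C, 1): the proposed replacement.
constexpr int32_t xorWithOne(int32_t C) { return C ^ 1; }

// The fold is only sound if both forms agree for "false" and "true".
constexpr bool xorFoldIsValid(BoolContents BC) {
  return selectZeroOne(0) == xorWithOne(0) &&
         selectZeroOne(trueValue(BC)) == xorWithOne(trueValue(BC));
}

// Adding a comparison result to a counter: +1 per true with 0/1 booleans,
// but -1 per true with 0/-1 booleans.
constexpr int32_t addCompareResult(int32_t Counter, BoolContents BC) {
  return Counter + trueValue(BC);
}
```

  With 0/1 booleans the two forms agree, but with 0/-1 booleans the xor yields -2 for a true condition, so the fold has to be skipped. The last helper mirrors the 'low' counter symptom: adding a 0/-1 comparison result counts downward.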
* [x86] Expand the target DAG combining for PSHUFD nodes to be able to combine into half-shuffles through unpack instructions that expand the half to a whole vector without messing with the dword lanes. | Chandler Carruth | 2014-07-10 | 1 | -1/+34
  This fixes some redundant instructions in splat-like lowerings for v16i8, which are now getting to be *really* nice.
  llvm-svn: 212695
* [x86] Tweak the v16i8 single input special case lowering for shuffles that splat i8s into i16s. | Chandler Carruth | 2014-07-10 | 1 | -34/+44
  Previously, we would try much too hard to arrange a sequence of i8s in one half of the input such that we could unpack them into i16s and shuffle those into place. This isn't always going to be a cheaper i8 shuffle than our other strategies. The case where it is always going to be cheaper is when we can arrange all the necessary inputs into one half using just i16 shuffles. It happens that viewing the problem this way also makes it much easier to produce an efficient set of shuffles to move the inputs into one half and then unpack them.
  With this, our splat code gets one step closer to being not terrible with the new experimental lowering strategy. It also exposes two combines missing which I will add next.
  llvm-svn: 212692
* Fix isDereferenceablePointer not to try to take the size of an unsized type. | Hal Finkel | 2014-07-10 | 1 | -1/+2
  I'll add a test-case shortly.
  llvm-svn: 212687
* Allow isDereferenceablePointer to look through some bitcasts | Hal Finkel | 2014-07-10 | 6 | -20/+41
  isDereferenceablePointer should not give up upon encountering any bitcast. If we're casting from a pointer to a larger type to a pointer to a smaller type, we can continue by examining the bitcast's operand. This missing capability was noted in a comment in the function.
  In order for this to work, isDereferenceablePointer now takes an optional DataLayout pointer (essentially all callers already had such a pointer available). Most code uses isDereferenceablePointer through isSafeToSpeculativelyExecute (which already took an optional DataLayout pointer), and to enable the LICM test case, LICM needs to actually provide its DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously).
  llvm-svn: 212686
* MC: modernise for loop | Saleem Abdulrasool | 2014-07-10 | 1 | -13/+9
  Convert a for loop to range-based form. NFC.
  llvm-svn: 212684
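  As a generic illustration of this kind of change (the functions below are made up and unrelated to the MC code that was touched), an iterator-based loop and its range-based equivalent look like this:

```cpp
#include <vector>

// Older style: explicit begin/end iterators.
int sumWithIterators(const std::vector<int> &Values) {
  int Sum = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I)
    Sum += *I;
  return Sum;
}

// Range-based form: same traversal, less boilerplate.
int sumWithRangeFor(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```

  Both functions visit the elements in the same order, which is why such a conversion can be marked NFC (no functional change).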
* MC: add and use an accessor for WinCFI | Saleem Abdulrasool | 2014-07-10 | 1 | -14/+14
  This adds a utility method to access the WinCFI information in bulk and uses that to iterate rather than requesting the count and individually iterating them.
  This is in preparation for restructuring WinCFI handling to enable clearer sharing across architectures, which in turn enables unwind information emission for Windows on ARM.
  llvm-svn: 212683
* Remove move assignment operator to appease older GCCs. | Peter Collingbourne | 2014-07-10 | 1 | -5/+0
  llvm-svn: 212682
* [x86] Initial improvements to the new shuffle lowering for v16i8 shuffles specifically for cases where a small subset of the elements in the input vector are actually used. | Chandler Carruth | 2014-07-10 | 1 | -10/+36
  This is specifically targeted at improving the shuffles generated for trunc operations, but also helps out splat-like operations.
  There is still some really low-hanging fruit here that I want to address but this is a huge step in the right direction.
  llvm-svn: 212680