summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Fix indentation. No functional change.Craig Topper2013-07-111-8/+8
| | | | llvm-svn: 186065
* Fix a warning.Nadav Rotem2013-07-111-2/+1
| | | | llvm-svn: 186064
* SLPVectorizer: refactor the code that places extracts. Place the code that ↵Nadav Rotem2013-07-111-41/+131
| | | | | | decides where to put extracts in the build-tree phase. This allows us to take the cost of the extracts into account. llvm-svn: 186058
* Teach TailRecursionElimination to handle certain cases of nocapture escaping ↵Michael Gottesman2013-07-111-64/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | allocas. Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE *or* mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute. Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (*NOTE* B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since: 1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments. 2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A. With the changes introduced by this patch: 1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early. 2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments: i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm. ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. *NOTE* This excludes functions with escaping nocapture allocas. 3. If we do not see any escaping allocas at all (whether captured or not): i. If we do not have usage of setjmp, mark all callsites with the tail marker. ii. If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function. Based off of a patch by Nick Lewycky. rdar://14324281. llvm-svn: 186057
* Don't assert if we can't constant fold extract/insertvalueHal Finkel2013-07-101-8/+21
| | | | | | | | | | | | | | | A non-constant-foldable static initializer expression containing insertvalue or extractvalue had been causing an assert: Constants.cpp:1971: Assertion `FC && "ExtractValue constant expr couldn't be folded!"' failed. Now we report a more-sensible "Unsupported expression in static initializer" error instead. Fixes PR15417. llvm-svn: 186044
* Find the symbol table on archives created on OS X.Rafael Espindola2013-07-101-3/+14
| | | | llvm-svn: 186041
* Put ELF COMDAT relocations into the relevant COMDAT group.Tim Northover2013-07-101-2/+9
| | | | | | | | Patch from Игорь Пашев (I do hope we support utf-8 commit messages; I also hope he'll forgive me for transliterating it as Igor Pashev in case things go horribly wrong). llvm-svn: 186034
* Remove trailing whitespacStephen Lin2013-07-101-2/+2
| | | | llvm-svn: 186032
* Don't crash in 'llvm -s' when an archive has no symtab.Rafael Espindola2013-07-101-1/+7
| | | | llvm-svn: 186029
* [objc-arc] Changed 'mode: c++' => 'C++' at Nick Lewycky's suggestion. Also ↵Michael Gottesman2013-07-107-7/+7
| | | | | | removed unnecessary mode: c++ lines from .cpp files. llvm-svn: 186026
* MemoryBuffer::getFile handles zero sized files, no need to duplicate the test.Rafael Espindola2013-07-101-21/+2
| | | | llvm-svn: 186018
* Replacing an empty switch with its moral equivalent. No functional changes ↵Aaron Ballman2013-07-102-8/+2
| | | | | | intended. llvm-svn: 186017
* Use status to implement file_size.Rafael Espindola2013-07-102-35/+1
| | | | | | | | | | The status function is already using a syscall that returns the file size. Remember it and implement file_size as a simple wrapper. No functionally change, but clients that already use status now can avoid calling file_size. llvm-svn: 186016
* Use the appropriate unsigned int type for the offset.Adrian Prantl2013-07-101-2/+3
| | | | llvm-svn: 186015
* Safeguard DBG_VALUE handling. Unbreaks the ASAN buildbot.Adrian Prantl2013-07-101-1/+2
| | | | llvm-svn: 186014
* Simplify code.Craig Topper2013-07-101-6/+2
| | | | llvm-svn: 186013
* R600/SI: Initial local memory supportMichel Danzer2013-07-106-3/+34
| | | | | | | Enough for the radeonsi driver to use it for calculating derivatives. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186012
* R600/SI: Add pattern for the AMDGPU.barrier.local intrinsicMichel Danzer2013-07-101-1/+10
| | | | | | | lit test coverage to follow in the next commit. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186011
* R600/SI: Add intrinsic for retrieving the current thread IDMichel Danzer2013-07-102-2/+9
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186010
* R600/SI: Initial support for LDS/GDS instructionsMichel Danzer2013-07-105-0/+68
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186009
* R600/SI: Add intrinsics for texture sampling with user derivativesMichel Danzer2013-07-102-1/+7
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186008
* PPC: Add a better comment about the i64 FI fixupHal Finkel2013-07-101-2/+13
| | | | | | | | In discussing this change with Bill Schmidt, it was decided that the original comment about negative FIs was incorrect. We'll still exclude them for now, but now with a more-accurate explanation. llvm-svn: 186005
* Reverting commit r185999 due to buildboot failure.Vladimir Medic2013-07-102-49/+0
| | | | llvm-svn: 186000
* Add support for Mips break and syscall insructions. The corresponding test ↵Vladimir Medic2013-07-102-0/+49
| | | | | | cases are added. llvm-svn: 185999
* Fix typoStephen Lin2013-07-101-1/+1
| | | | llvm-svn: 185995
* Explicitly define ARMISelLowering::isFMAFasterThanFMulAndFAdd. No ↵Stephen Lin2013-07-101-0/+11
| | | | | | | | | | functionality change. Currently ARM is the only backend that supports FMA instructions (for at least some subtargets) but does not implement this virtual, so FMAs are never generated except from explicit fma intrinsic calls. Apparently this is due to the fact that it supports both fused (one rounding step) and unfused (two rounding step) multiply + add instructions. This patch clarifies that this the case without changing behavior by implementing the virtual function to simply return false, as the default TargetLoweringBase version does. It is possible that some cpus perform the fused version faster than the unfused version and vice-versa, so the function implementation should be revisited if hard data is found. llvm-svn: 185994
* Un-break the buildbot by tweaking the indirection flag.Adrian Prantl2013-07-101-2/+8
| | | | | | Pulled in a testcase from the debuginfo-test suite. llvm-svn: 185993
* Document a known limitation of the status quo.Adrian Prantl2013-07-101-1/+3
| | | | llvm-svn: 185992
* Fix comment.Eric Christopher2013-07-091-1/+1
| | | | llvm-svn: 185984
* ARM: Fix incorrect pack pattern for thumb2Jim Grosbach2013-07-091-1/+6
| | | | | | | | | | | | | | | Propagate the fix from r185712 to Thumb2 codegen as well. Original commit message applies here as well: A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and logic shift are only equivalent in this context if the shift amount is 16. We would be shifting in ones into the bottom 16bits instead of zeros if "y" is negative. rdar://14338767 llvm-svn: 185982
* Implement categories for special case lists.Peter Collingbourne2013-07-092-27/+93
| | | | | | | | | | | | | | | | | | | | | | | | A special case list can now specify categories for specific globals, which can be used to instruct an instrumentation pass to treat certain functions or global variables in a specific way, such as by omitting certain aspects of instrumentation while keeping others, or informing the instrumentation pass that a specific uninstrumentable function has certain semantics, thus allowing the pass to instrument callers according to those semantics. For example, AddressSanitizer now uses the "init" category instead of global-init prefixes for globals whose initializers should not be instrumented, but which in all other respects should be instrumented. The motivating use case is DataFlowSanitizer, which will have a number of different categories for uninstrumentable functions, such as "functional" which specifies that a function has pure functional semantics, or "discard" which indicates that a function's return value should not be labelled. Differential Revision: http://llvm-reviews.chandlerc.com/D1092 llvm-svn: 185978
* Introduce a SpecialCaseList ctor which takes a MemoryBuffer to makePeter Collingbourne2013-07-091-1/+9
| | | | | | | | it more unit testable, and fix memory leak in the other ctor. Differential Revision: http://llvm-reviews.chandlerc.com/D1090 llvm-svn: 185976
* Rename BlackList class to SpecialCaseList and move it to Transforms/Utils.Peter Collingbourne2013-07-096-21/+21
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D1089 llvm-svn: 185975
* InstSimplify: X >> X -> 0David Majnemer2013-07-091-0/+8
| | | | llvm-svn: 185973
* Typo.Adrian Prantl2013-07-091-1/+1
| | | | llvm-svn: 185971
* Fix PR16571, which is a bug in the code that checks that all of the types in ↵Nadav Rotem2013-07-091-1/+3
| | | | | | the bundle are uniform. llvm-svn: 185970
* Reapply an improved version of r180816/180817.Adrian Prantl2013-07-099-47/+75
| | | | | | | | | | | | | | | Change the informal convention of DBG_VALUE machine instructions so that we can express a register-indirect address with an offset of 0. The old convention was that a DBG_VALUE is a register-indirect value if the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE is register-indirect if the first operand is a register and the second operand is an immediate. For plain register values the combination reg, reg is used. MachineInstrBuilder::BuildMI knows how to build the new DBG_VALUES. rdar://problem/13658587 llvm-svn: 185966
* WidenVecRes_BUILD_VECTOR must use the first operand's typeHal Finkel2013-07-091-1/+4
| | | | | | | | | | | Because integer BUILD_VECTOR operands may have a larger type than the result's vector element type, and all operands must have the same type, when widening a BUILD_VECTOR node by adding UNDEFs, we cannot use the vector element type, but rather must use the type of the existing operands. Another bug found by llvm-stress. llvm-svn: 185960
* [PowerPC] Better fix for PR16556.Bill Schmidt2013-07-091-9/+3
| | | | | | | | | | | | | | | | | A more complete example of the bug in PR16556 was recently provided, showing that the previous fix was not sufficient. The previous fix is reverted herein. The real problem is that ReplaceNodeResults() uses LowerFP_TO_INT as custom lowering for FP_TO_SINT during type legalization, without checking whether the input type is handled by that routine. LowerFP_TO_INT requires the input to be f32 or f64, so we fail when the input is ppcf128. I'm leaving the test case from the initial fix (r185821) in place, and adding the new test as another crash-only check. llvm-svn: 185959
* AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and allStephen Lin2013-07-0910-29/+86
| | | | | | | | | | | | | | | | | | | | | | | in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics: 1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations. 2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation. 3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs. The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples. llvm-svn: 185956
* Don't crash in SE dealing with ashr x, -1Hal Finkel2013-07-091-1/+1
| | | | | | | | | | | | | | ScalarEvolution::getSignedRange uses ComputeNumSignBits from ValueTracking on ashr instructions. ComputeNumSignBits can return zero, but this case was not handled correctly by the code in getSignedRange which was calling: APInt::getSignedMinValue(BitWidth).ashr(NS - 1) with NS = 0, resulting in an assertion failure in APInt::ashr. Now, we just return the conservative result (as with NS == 1). Another bug found by llvm-stress. llvm-svn: 185955
* ValueTracking: Fix bugs in isKnownToBeAPowerOfTwoDavid Majnemer2013-07-091-6/+4
| | | | | | | (add nsw x, (and x, y)) isn't a power of two if x is zero, it's zero (add nsw x, (xor x, y)) isn't a power of two if y has bits set that aren't set in x llvm-svn: 185954
* Set the default insert point to the first instruction, and not to end()Nadav Rotem2013-07-091-1/+1
| | | | llvm-svn: 185953
* DAGCombine tryFoldToZero cannot create illegal types after type legalizationHal Finkel2013-07-091-4/+11
| | | | | | | | | | | When folding sub x, x (and other similar constructs), where x is a vector, the result is a vector of zeros. After type legalization, make sure that the input zero elements have a legal type. This type may be larger than the result's vector element type. This was another bug found by llvm-stress. llvm-svn: 185949
* [PowerPC] Revert r185476 and fix up TLS variant kindsUlrich Weigand2013-07-095-4/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the commit message to r185476 I wrote: >The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD >correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD. >This causes some confusion with the asm parser, since VK_PPC_TLSGD >is output as @tlsgd, which is then read back in as VK_TLSGD. > >To avoid this confusion, this patch removes the PowerPC-specific >modifiers and uses the generic modifiers throughout. (The only >drawback is that the generic modifiers are printed in upper case >while the usual convention on PowerPC is to use lower-case modifiers. >But this is just a cosmetic issue.) This was unfortunately incorrect, there is is fact another, serious drawback to using the default VK_TLSLD/VK_TLSGD variant kinds: using these causes ELFObjectWriter::RelocNeedsGOT to return true, which in turn causes the ELFObjectWriter to emit an undefined reference to _GLOBAL_OFFSET_TABLE_. This is a problem on powerpc64, because it uses the TOC instead of the GOT, and the linker does not provide _GLOBAL_OFFSET_TABLE_, so the symbol remains undefined. This means shared libraries using TLS built with the integrated assembler are currently broken. While the whole RelocNeedsGOT / _GLOBAL_OFFSET_TABLE_ situation probably ought to be properly fixed at some point, for now I'm simply reverting the r185476 commit. Now this in turn exposes the breakage of handling @tlsgd/@tlsld in the asm parser that this check-in was originally intended to fix. To avoid this regression, I'm also adding a different fix for this problem: while common code now parses @tlsgd as VK_TLSGD, a special hack in the asm parser translates this code to the platform-specific VK_PPC_TLSGD that the back-end now expects. While this is not really pretty, it's self-contained and shouldn't hurt anything else for now. One the underlying problem is fixed, this hack can be reverted again. llvm-svn: 185945
* R600: Do not predicated basic block with multiple alu clauseVincent Lejeune2013-07-096-7/+64
| | | | | | | | | Test is not included as it is several 1000 lines long. To test this functionnality, a test case must generate at least 2 ALU clauses, where an ALU clause is ~110 instructions long. NOTE: This is a candidate for the stable branch. llvm-svn: 185943
* R600: Fix a rare bug where swizzle optimization returns wrong valuesVincent Lejeune2013-07-091-2/+3
| | | | llvm-svn: 185942
* R600: Fix wrong export reswizzlingVincent Lejeune2013-07-091-4/+0
| | | | llvm-svn: 185941
* R600: Use DAG lowering pass to handle fcos/fsinVincent Lejeune2013-07-094-23/+52
| | | | | NOTE: This is a candidate for the stable branch. llvm-svn: 185940
* R600: Print Export SwizzleVincent Lejeune2013-07-091-2/+2
| | | | llvm-svn: 185939
OpenPOWER on IntegriCloud