summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Test commitOlivier Sallenave2015-01-071-0/+1
| | | | llvm-svn: 225368
* [X86] Fix 512->256 typo in comments. NFC.Ahmed Bougacha2015-01-071-2/+2
| | | | llvm-svn: 225367
* Add a missing file from 225365Philip Reames2015-01-071-0/+54
| | | | llvm-svn: 225366
* Introduce an example statepoint GC strategyPhilip Reames2015-01-074-0/+50
| | | | | | | | | | | | | | This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.) Most of the validation logic added here as proof of concept will soon move in to the Verifier. That move is dependent on http://reviews.llvm.org/D6811 There was discussion in the review thread about addrspace(1) being reserved for something. I'm going to follow up on a seperate llvmdev thread. If needed, I'll update all the code at once. Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others. Differential Revision: http://reviews.llvm.org/D6808 llvm-svn: 225365
* X86: Allow the stack probe size to be configurable per functionDavid Majnemer2015-01-071-3/+9
| | | | | | | | | | | | | | | | | | | LLVM emits stack probes on Windows targets to ensure that the stack is correctly accessed. However, the amount of stack allocated before emitting such a probe is hardcoded to 4096. It is desirable to have this be configurable so that a function might opt-out of stack probes. Our level of granularity is at the function level instead of, say, the module level to permit proper generation of code after LTO. Patch by Andrew H! N.B. The inliner needs to be updated to properly consider what happens after inlining a function with a specific stack-probe-size into another function with a different stack-probe-size. llvm-svn: 225360
* R600/SI: Refactor SIFoldOperands to simplify immediate foldingTom Stellard2015-01-071-25/+54
| | | | | | This will make a future patch much less intrusive. llvm-svn: 225358
* [X86] Teach FCOPYSIGN lowering to recognize constant magnitudes.Ahmed Bougacha2015-01-071-6/+19
| | | | | | | | | | | | | | | | | | | | | For code like: float foo(float x) { return copysign(1.0, x); } We used to generate: andps <-0.000000e+00,0,0,0>, %xmm0 movss <1.000000e+00>, %xmm1 andps <nan>, %xmm1 orps %xmm0, %xmm1 Basically doing an abs(1.0f) in the two middle instructions. We now generate: andps <-0.000000e+00,0,0,0>, %xmm0 orps <1.000000e+00,0,0,0>, %xmm0 Builds on cleanups r223415, r223542. rdar://19049548 Differential Revision: http://reviews.llvm.org/D6555 llvm-svn: 225357
* New method SDep::isNormalMemoryOrBarrier() in ScheduleDAGInstrs.cpp.Jonas Paulsson2015-01-071-5/+6
| | | | | | | | | | | | | | | | | | | | Used to iterate over previously added memory dependencies in adjustChainDeps() and iterateChainSucc(). SDep::isCtrl() was previously used in these places, that also gave anti and output edges. The code may be worse if these are followed, because MisNeedChainEdge() will conservatively return true since a non-memory instruction has no memory operands, and a false chain dep will be added. It is also unnecessary since all memory accesses of interest will be reached by memory dependencies, and there is a budget limit for the number of edges traversed. This problem was found on an out-of-tree target with enabled alias analysis. No test case for an in-tree target has been found. Reviewed by Hal Finkel. llvm-svn: 225351
* Fix typos in comment and option help texts.Jonas Paulsson2015-01-071-3/+3
| | | | | | For -enable-aa-sched-mi and -use-tbaa-in-sched-mi. llvm-svn: 225350
* Fix regression in r225266.Asiri Rathnayake2015-01-071-1/+1
| | | | | | | | The change in r225266 was reviewed under D6722. But the commit r225266 has a typo, causing some MCHammer failures. This patch fixes it. Change-Id: I573efcff25003af7478ac02548ebbe929fc7f5fd llvm-svn: 225347
* [X86] Merge a switch statement inside a default case of another switch ↵Craig Topper2015-01-071-160/+155
| | | | | | statement on the same variable. There was no additional code in the default so this should be no functional change. llvm-svn: 225345
* [X86] Don't mark the shift by 1 instructions as isConvertibleToThreeAddress. ↵Craig Topper2015-01-071-1/+1
| | | | | | There is no handling for them. llvm-svn: 225344
* [X86] Remove some unused TYPE enums from the disassembler.Craig Topper2015-01-073-18/+1
| | | | llvm-svn: 225343
* Revert r225165 and r225169Karthik Bhat2015-01-071-39/+0
| | | | | | | | Even thouh gcc produces simialr instructions as Owen pointed out the two patterns aren’t equivalent in the case where the original subtraction could have caused an overflow. Reverting the same. llvm-svn: 225341
* [PM] Fix a pretty nasty bug where the new pass manager would invalidateChandler Carruth2015-01-072-18/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | passes too many time. I think this is actually the issue that someone raised with me at the developer's meeting and in an email, but that we never really got to the bottom of. Having all the testing utilities made it much easier to dig down and uncover the core issue. When a pass manager is running many passes over a single function, we need it to invalidate the analyses between each run so that they can be re-computed as needed. We also need to track the intersection of preserved higher-level analyses across all the passes that we run (for example, if there is one module analysis which all the function analyses preserve, we want to track that and propagate it). Unfortunately, this interacted poorly with any enclosing pass adaptor between two IR units. It would see the intersection of preserved analyses, and need to invalidate any other analyses, but some of the un-preserved analyses might have already been invalidated *and recomputed*! We would fail to propagate the fact that the analysis had already been invalidated. The solution to this struck me as really strange at first, but the more I thought about it, the more natural it seemed. After a nice discussion with Duncan about it on IRC, it seemed even nicer. The idea is that invalidating an analysis *causes* it to be preserved! Preserving the lack of result is trivial. If it is recomputed, great. Until something *else* invalidates it again, we're good. The consequence of this is that the invalidate methods on the analysis manager which operate over many passes now consume their PreservedAnalyses object, update it to "preserve" every analysis pass to which it delivers an invalidation (regardless of whether the pass chooses to be removed, or handles the invalidation itself by updating itself). Then we return this augmented set from the invalidate routine, letting the pass manager take the result and use the intersection of *that* across each pass run to compute the final preserved set. This accounts for all the places where the early invalidation of an analysis has already "preserved" it for a future run. I've beefed up the testing and adjusted the assertions to show that we no longer repeatedly invalidate or compute the analyses across nested pass managers. llvm-svn: 225333
* R600/SI: Add check for amdgcn triple forgotten in r225276.Tom Stellard2015-01-071-2/+3
| | | | llvm-svn: 225331
* Analysis: Reformulate WillNotOverflowUnsignedAdd for reusabilityDavid Majnemer2015-01-074-45/+41
| | | | | | | | WillNotOverflowUnsignedAdd's smarts will live in ValueTracking as computeOverflowForUnsignedAdd. It now returns a tri-state result: never overflows, always overflows and sometimes overflows. llvm-svn: 225329
* InstCombine: Just a small tidy-upDavid Majnemer2015-01-071-3/+2
| | | | llvm-svn: 225328
* [PowerPC] Transform a README.txt entry into a FIXMEHal Finkel2015-01-072-14/+9
| | | | | | | | | | Remove the README.txt entry regarding register allocation of CR logical ops, and replace it with a FIXME in PPCInstrInfo.td. The text in the README.txt was not really accurate, and thanks goes to Pat Haugen (and Bill Schmidt) from IBM for clarifying what was intended and highlighting the relevant text in the ISA specification. llvm-svn: 225325
* Revert r224935 "Refactor duplicated code. No intended functionality change."Lang Hames2015-01-0610-67/+87
| | | | | | | | This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin. Reverting to get the bots green while I track down the source of the changed behavior. llvm-svn: 225311
* R600/SI: Add combine for isinfinite patternMatt Arsenault2015-01-062-0/+57
| | | | llvm-svn: 225310
* R600/SI: Pattern match isinf to v_cmp_class instructionsMatt Arsenault2015-01-062-0/+34
| | | | llvm-svn: 225307
* R600/SI: Add basic DAG combines for fp_classMatt Arsenault2015-01-062-1/+50
| | | | llvm-svn: 225306
* R600/SI: Add class intrinsicMatt Arsenault2015-01-067-5/+82
| | | | llvm-svn: 225305
* Change the .ll syntax for comdats and add a syntactic sugar.Rafael Espindola2015-01-063-18/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to make comdats always explicit in the IR, we decided to make the syntax a bit more compact for the case of a GlobalObject in a comdat with the same name. Just dropping the $name causes problems for @foo = globabl i32 0, comdat $bar = comdat ... and declare void @foo() comdat $bar = comdat ... So the syntax is changed to @g1 = globabl i32 0, comdat($c1) @g2 = globabl i32 0, comdat and declare void @foo() comdat($c1) declare void @foo() comdat llvm-svn: 225302
* [PowerPC] Reuse a load operand in int->fp conversionsHal Finkel2015-01-063-41/+142
| | | | | | | | | | | | | | | | | | | | | | | int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. llvm-svn: 225301
* [Hexagon] Adding compound jump encodings.Colin LeMahieu2015-01-062-0/+266
| | | | llvm-svn: 225291
* R600/SI: Insert s_waitcnt before s_barrier instructions.Tom Stellard2015-01-061-1/+5
| | | | | | | This ensures that all memory operations are complete when all threads reach the barrier. llvm-svn: 225290
* R600/SI: Fix dependency calculation for DS writes instructions in SIInsertWaitsTom Stellard2015-01-061-0/+23
| | | | | | | | | | | | In DS write instructions, the address operand comes before the value operand(s) which is reversed from every other instruction type. The SIInsertWait assumed that the first use for each instruction was the value, so for DS write it was protecting the address operand with s_waitcnt instructions when it should have been protecting the value operand. llvm-svn: 225289
* Revert "Reapply: Teach SROA how to update debug info for fragmented variables."Adrian Prantl2015-01-061-60/+8
| | | | | | | | | because of a tsan buildbot failure. This reverts commit 225272. Fix should be coming soon. llvm-svn: 225288
* [Hexagon] Adding encoding for misc v4 instructions: boundscheck, tlbmatch, ↵Colin LeMahieu2015-01-063-1/+101
| | | | | | dcfetch. llvm-svn: 225283
* This patch teaches IndVarSimplify to add nuw and nsw to certain kindsSanjoy Das2015-01-061-0/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of operations that provably don't overflow. For example, we can prove %civ.inc below does not sign-overflow. With this change, IndVarSimplify changes %civ.inc to an add nsw. define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) { entry: %length = load i32* %length_ptr, !range !0 %len.sub.1 = sub i32 %length, 1 %upper = icmp slt i32 %init, %len.sub.1 br i1 %upper, label %loop, label %exit loop: %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ] %civ.inc = add i32 %civ, 1 %cmp = icmp slt i32 %civ.inc, %length br i1 %cmp, label %latch, label %break latch: store i32 0, i32* %array %check = icmp slt i32 %civ.inc, %len.sub.1 br i1 %check, label %loop, label %break break: ret i32 %civ.inc exit: ret i32 42 } Differential Revision: http://reviews.llvm.org/D6748 llvm-svn: 225282
* [Hexagon] Adding encoding information for absolute address loads.Colin LeMahieu2015-01-061-124/+186
| | | | llvm-svn: 225279
* SelectionDAGBuilder: move constant initialization out of loopMehdi Amini2015-01-061-15/+19
| | | | | | | | | | No semantic change intended. Reviewers: resistor Differential Revision: http://reviews.llvm.org/D6834 llvm-svn: 225278
* R600/SI: Add a stub GCNTargetMachineTom Stellard2015-01-068-1/+46
| | | | | | | | | | | | This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommened that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. llvm-svn: 225277
* Triple: Add amdgcn tripleTom Stellard2015-01-061-1/+8
| | | | | | | This will be used for AMD GPUs with the Graphics Core Next architecture, which are currently using by the r600 triple. llvm-svn: 225276
* R600/SI: Remove MachineFunction dump from AsmPrinterTom Stellard2015-01-061-17/+12
| | | | | | | The dump was dependent on a feature string, which meant that it couldn't be disabled or enable on a per compile basis. llvm-svn: 225275
* [CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz.Andrea Di Biagio2015-01-061-6/+35
| | | | | | | | | | | | | | | | | | | | | This patch improves the logic added at revision 224899 (see review D6728) that teaches the backend when it is profitable to speculate calls to cttz/ctlz. The original algorithm conservatively avoided speculating more than one instruction from a basic block in a control flow grap modelling an if-statement. In particular, the only allowed instruction (excluding the terminator) was a call to cttz/ctlz. However, there are cases where we could be less conservative and still be able to speculate a call to cttz/ctlz. With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the result is zero extended/truncated in the same basic block, and the zext/trunc instruction is "free" for the target. Added new test cases to CodeGen/X86/cttz-ctlz.ll Differential Revision: http://reviews.llvm.org/D6853 llvm-svn: 225274
* Reapply: Teach SROA how to update debug info for fragmented variables.Adrian Prantl2015-01-061-8/+60
| | | | | | | | | | This also rolls in the changes discussed in http://reviews.llvm.org/D6766. Defers migrating the debug info for new allocas until after all partitions are created. Thanks to Chandler for reviewing! llvm-svn: 225272
* Don't loop endlessly for MachO files with 0 ncmdsFilipe Cabecinhas2015-01-061-0/+3
| | | | llvm-svn: 225271
* [Hexagon] Fix 225267. GP register is not yet fully implemented. Removing ↵Colin LeMahieu2015-01-061-2/+2
| | | | | | Uses [GP] maintains existing behavior. llvm-svn: 225270
* Implement a very basic colored syntax highlighting for llvm-dwarfdump.Adrian Prantl2015-01-065-21/+112
| | | | | | | | | | The color scheme is the same as the one used by the colorize dwarfdump script on Darwin. A new --color option can be used to forcibly turn color on or off. http://reviews.llvm.org/D6852 llvm-svn: 225269
* [Hexagon] Adding dealloc_return encoding and absolute address stores.Colin LeMahieu2015-01-065-239/+347
| | | | llvm-svn: 225267
* [ARM] Cleanup so_imm* tblgen defintionsAsiri Rathnayake2015-01-062-109/+43
| | | | | | | | | | | No functional changes. Support for ARM's modified immediate syntax was added in r223113 and r223115 (review: D6408). That patch introduced the mod_imm* tblegen definitions which renders the existing so_imm* definitions redundant. This patch gets rid of them completely. Reviewed as: D6722 llvm-svn: 225266
* Convert fcmp with 0.0 from casted integers to icmpMatt Arsenault2015-01-061-4/+34
| | | | | | | | | | | | | | | | | | | This is already handled in general when it is known the conversion can't lose bits with smaller integer types casted into wider floating point types. This pattern happens somewhat often in GPU programs that cast workitem intrinsics to float, which are often compared with 0. Specifically handle the special case of compares with zero which should also be known to not lose information. I had a more general version of this which allows equality compares if the casted float is exactly representable in the integer, but I'm not 100% confident that is always correct. Also fold cases that aren't integers to true / false. llvm-svn: 225265
* [X86] Add OpSize32 to XBEGIN_4. Add XBEGIN_2 with OpSize16.Craig Topper2015-01-064-7/+40
| | | | | | Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building. llvm-svn: 225256
* InstCombine: Bitcast call arguments from/to pointer/integer typeDavid Majnemer2015-01-061-4/+13
| | | | | | | Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing arguments and return values when DataLayout says it is safe to do so. llvm-svn: 225254
* [X86] Make isel select the 2-byte register form of INC/DEC even in ↵Craig Topper2015-01-065-126/+78
| | | | | | | | non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering. Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness. llvm-svn: 225252
* [PowerPC] Remove old README.txt entry regarding struct passingHal Finkel2015-01-061-8/+0
| | | | | | | Because of how Clang represents structs as arrays (at least on non-Darwin platforms), and what SROA does, etc. this is no longer a problem. llvm-svn: 225251
* X86: Don't make illegal GOTTPOFF relocationsDavid Majnemer2015-01-062-0/+17
| | | | | | | | | | | | | "ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF relocation target a movq or addq instruction. Prohibit the truncation of such loads to movl or addl. This fixes PR22083. Differential Revision: http://reviews.llvm.org/D6839 llvm-svn: 225250
OpenPOWER on IntegriCloud