path: root/llvm/test
Commit message | Author | Age | Files | Lines
* Rejected r156374 (ordinary PR1255 patch) due to clang-x86_64-debian-fnt buildbot failure. | Stepan Dyatkovskiy | 2012-05-08 | 1 | -33/+0
| llvm-svn: 156377
* Remove 256-bit AVX non-temporal store intrinsics. The same was previously done for 128-bit. | Craig Topper | 2012-05-08 | 1 | -0/+24
| llvm-svn: 156375
* Ordinary patch for PR1255. | Stepan Dyatkovskiy | 2012-05-08 | 1 | -0/+33
| Added new case-range-oriented methods for adding/removing cases in SwitchInst. After this patch, cases are represented internally as ConstantArray-s instead of ConstantInt; externally, cases are wrapped in the ConstantRangesSet object.
| The old SwitchInst methods still work, but are marked as deprecated. So at this stage there are no side effects, except that support for case ranges was added to BitcodeReader/Writer; a Bitcode test is added as well. The old "switch" format is also supported.
| llvm-svn: 156374
* Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. | Owen Anderson | 2012-05-07 | 1 | -0/+18
| llvm-svn: 156324
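Why this fold is gated on unsafe FP math can be seen from a few IEEE-754 special values. The following is a minimal sketch in Python, not LLVM code; the helper name `fold_is_exact` is hypothetical and exists only for illustration.

```python
import math


def fold_is_exact(x: float) -> bool:
    """Return True if replacing `x - x` with 0.0 preserves the result for x."""
    actual = x - x
    # NaN never compares equal to anything, so report it explicitly.
    if math.isnan(actual):
        return False
    return actual == 0.0


# Finite values fold exactly; infinities and NaN do not,
# because inf - inf and nan - nan both produce NaN.
for value in (1.5, -0.0, 1e308, math.inf, -math.inf, math.nan):
    print(value, fold_is_exact(value))
```

For finite inputs the fold is exact, so the transformation only changes behaviour when an infinity or NaN reaches the subtraction.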
* Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. | Owen Anderson | 2012-05-07 | 1 | -0/+16
| This allows for greater CSE opportunities.
| llvm-svn: 156323
* Fix a regression from r147481. This combine should only happen if there is a single use. | Chad Rosier | 2012-05-07 | 1 | -1/+2
| rdar://11360370
| llvm-svn: 156316
* X86: optimization for -(x != 0) | Manman Ren | 2012-05-07 | 1 | -0/+30
| This patch will optimize -(x != 0) on X86
| FROM
|   cmpl $0x01,%edi
|   sbbl %eax,%eax
|   notl %eax
| TO
|   negl %edi
|   sbbl %eax,%eax
| In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
|   def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;
| rdar: 10961709
| llvm-svn: 156312
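The equivalence of the two instruction sequences can be checked with a small model of the carry flag. This is a sketch in Python under the assumption that only the carry-flag behaviour of CMP, NEG, and SBB matters here; the function names are hypothetical and nothing below is taken from the patch itself.

```python
MASK32 = 0xFFFFFFFF


def cmp_sbb_not(x: int) -> int:
    """Model `cmpl $1, %edi; sbbl %eax, %eax; notl %eax` on a 32-bit value."""
    x &= MASK32
    carry = 1 if x < 1 else 0        # CMP sets CF on unsigned borrow (x < 1)
    eax = (0 - carry) & MASK32       # SBB reg, reg computes 0 - CF
    return ~eax & MASK32             # NOT flips every bit


def neg_sbb(x: int) -> int:
    """Model `negl %edi; sbbl %eax, %eax` on a 32-bit value."""
    x &= MASK32
    carry = 1 if x != 0 else 0       # NEG sets CF unless its operand is zero
    return (0 - carry) & MASK32      # SBB reg, reg computes 0 - CF


def reference(x: int) -> int:
    """-(x != 0) as an unsigned 32-bit bit pattern."""
    return MASK32 if (x & MASK32) != 0 else 0


for x in (0, 1, 2, 0x80000000, 0xFFFFFFFF):
    print(hex(x), hex(cmp_sbb_not(x)), hex(neg_sbb(x)), hex(reference(x)))
```

Both models agree with the reference, and the second sequence gets there without the trailing NOT because NEG already leaves the carry flag set exactly when x is nonzero.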
* Add support for the 'l' constraint. | Eric Christopher | 2012-05-07 | 1 | -0/+11
| Patch by Jack Carter.
| llvm-svn: 156294
* Add support for the 'c' constraint. | Eric Christopher | 2012-05-07 | 1 | -1/+7
| Patch by Jack Carter.
| llvm-svn: 156293
* Add support for the 'P' constraint. | Eric Christopher | 2012-05-07 | 2 | -0/+22
| Patch by Jack Carter.
| llvm-svn: 156292
* Add support for the 'O' constraint. | Eric Christopher | 2012-05-07 | 2 | -0/+22
| Patch by Jack Carter.
| llvm-svn: 156285
* Add support for the 'N' inline asm constraint. | Eric Christopher | 2012-05-07 | 2 | -0/+23
| Patch by Jack Carter.
| llvm-svn: 156284
* Add support for the 'L' inline asm constraint. | Eric Christopher | 2012-05-07 | 2 | -1/+22
| Patch by Jack Carter.
| llvm-svn: 156283
* Add support for the inline asm constraint 'K'. | Eric Christopher | 2012-05-07 | 2 | -0/+22
| llvm-svn: 156282
* Add SSE4A MOVNTSS/MOVNTSD instructions. | Craig Topper | 2012-05-07 | 1 | -0/+19
| llvm-svn: 156281
* Support the 'J' constraint. | Eric Christopher | 2012-05-07 | 2 | -0/+22
| Patch by Jack Carter.
| llvm-svn: 156280
* Add support for the 'I' inline asm constraint. Also add tests from the previous 2 patches. | Eric Christopher | 2012-05-07 | 5 | -0/+99
| Patch by Jack Carter.
| llvm-svn: 156279
* Switch the select to branch transformation on by default. | Benjamin Kramer | 2012-05-06 | 2 | -2/+2
| The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know!
| llvm-svn: 156258
* CodeGenPrepare: Add a transform to turn selects into branches in some cases. | Benjamin Kramer | 2012-05-05 | 1 | -0/+63
| This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%:
|   ucomisd (%rdi), %xmm0
|   cmovbel %edx, %esi
| cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice.
| This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit either; those are included in the "predictableSelectIsExpensive" flag.
| It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which performs the opposite of this transform.
| Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the microarchitecture used. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator.
| Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :)
| llvm-svn: 156234
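The heuristic described in this commit message can be restated as a small predicate. The sketch below is Python, not the CodeGenPrepare implementation; `SelectCandidate`, its fields, and `should_form_branch` are hypothetical names that only mirror the conditions stated in the message (single use, direct memory operand, not -Os, target flag set).

```python
from dataclasses import dataclass


@dataclass
class SelectCandidate:
    """Hypothetical summary of a select/cmov, following the commit text."""
    has_single_use: bool             # "cmovs that have one use"
    has_direct_memory_operand: bool  # "and a direct memory operand"


def should_form_branch(candidate: SelectCandidate,
                       optimize_for_size: bool,
                       predictable_select_is_expensive: bool) -> bool:
    """Return True if the select should be turned into a branch."""
    # cmovs usually save code size, so the transform is skipped in -Os mode.
    if optimize_for_size:
        return False
    # Targets that do not set the flag (e.g. in-order cores) are left alone.
    if not predictable_select_is_expensive:
        return False
    return candidate.has_single_use and candidate.has_direct_memory_operand


hot_loop_cmov = SelectCandidate(has_single_use=True,
                                has_direct_memory_operand=True)
print(should_form_branch(hot_loop_cmov, False, True))
print(should_form_branch(hot_loop_cmov, True, True))
```

Keeping the size and target checks first makes the two opt-outs from the message explicit before any per-instruction property is examined.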
* Small fix in InstCombineCasts.cpp. Restored the "alloca + bitcast" reduction for the case when the alloca's size is calculated within an "add/sub/... nsw". | Stepan Dyatkovskiy | 2012-05-05 | 1 | -2/+5
| Also added a fix to the 2011-06-13-nsw-alloca.ll test.
| llvm-svn: 156231
* This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. | Justin Holewinski | 2012-05-04 | 17 | -0/+1994
| This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
| The new target machines are:
|   nvptx (old ptx32) => 32-bit PTX
|   nvptx64 (old ptx64) => 64-bit PTX
| The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides.
| NV_CONTRIB
| llvm-svn: 156196
* Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits the 16-bit encoding of CMN instructions. | Sebastian Pop | 2012-05-04 | 1 | -4/+14
| llvm-svn: 156195
* Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles. | Craig Topper | 2012-05-04 | 1 | -0/+8
| llvm-svn: 156156
* Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits for the assembler and disassembler, which were not being set/read correctly for offsets greater than 22 bits in some cases. | Kevin Enderby | 2012-05-03 | 3 | -2/+29
| Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!
| llvm-svn: 156118
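For context on the J1 and J2 bits: in the Thumb-2 BL encoding, as described in the ARM architecture manual, the branch offset is the sign-extended field S:I1:I2:imm10:imm11:0 with I1 = NOT(J1 XOR S) and I2 = NOT(J2 XOR S). The following Python sketch models only that field split; it is not the ARMAsmBackend.cpp code, and the function names are hypothetical.

```python
def encode_bl_offset(offset: int):
    """Split a Thumb-2 BL byte offset into (S, J1, J2, imm10, imm11)."""
    if offset % 2 or not -(1 << 24) <= offset < (1 << 24):
        raise ValueError("offset must be even and fit in 25 signed bits")
    bits = offset & 0x1FFFFFF           # 25-bit two's complement
    s = (bits >> 24) & 1
    i1 = (bits >> 23) & 1
    i2 = (bits >> 22) & 1
    imm10 = (bits >> 12) & 0x3FF
    imm11 = (bits >> 1) & 0x7FF
    j1 = (i1 ^ s) ^ 1                   # J1 = NOT(I1 XOR S)
    j2 = (i2 ^ s) ^ 1                   # J2 = NOT(I2 XOR S)
    return s, j1, j2, imm10, imm11


def decode_bl_offset(s: int, j1: int, j2: int, imm10: int, imm11: int) -> int:
    """Rebuild the signed byte offset from the encoded fields."""
    i1 = (j1 ^ s) ^ 1                   # I1 = NOT(J1 XOR S)
    i2 = (j2 ^ s) ^ 1                   # I2 = NOT(J2 XOR S)
    bits = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    return bits - (1 << 25) if s else bits


for offset in (4, -4, 1 << 22, -(1 << 23)):
    print(offset, encode_bl_offset(offset))
```

For small offsets both J bits come out as 1, which is why a mistake in handling them only shows up once the offset needs the extra high bits.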
* remove calls to calloc if the allocated memory is not used (it was already being done for malloc) | Nuno Lopes | 2012-05-03 | 1 | -2/+2
| fix a few typos found by Chad in my previous commit
| llvm-svn: 156110
* Support for target-dependent Hexagon VLIW packetizer. | Sirish Pande | 2012-05-03 | 4 | -0/+69
| This patch creates and optimizes packets as per Hexagon ISA rules.
| llvm-svn: 156109
* add support for calloc to objectsize lowering | Nuno Lopes | 2012-05-03 | 1 | -0/+20
| llvm-svn: 156102
* Fixed disassembler for vstm/vldm ARM VFP instructions. | Silviu Baranga | 2012-05-03 | 1 | -0/+27
| llvm-svn: 156077
* Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. | Craig Topper | 2012-05-03 | 1 | -1/+1
| Missed in r155982.
| llvm-svn: 156059
* Fix two-address pass's aggressive instruction commuting heuristics. | Evan Cheng | 2012-05-03 | 2 | -2/+12
| It's meant to catch cases like:
|   %reg1024<def> = MOV r1
|   %reg1025<def> = MOV r0
|   %reg1026<def> = ADD %reg1024, %reg1025
|   r0 = MOV %reg1026
| By commuting the ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in:
|   %reg1024<def> = MOV r0
|   %reg1025<def> = MOV 0
|   %reg1026<def> = ADD %reg1024, %reg1025
|   r0 = MOV %reg1026
| That brought no benefit and instead ensured that the last MOV would not be coalesced.
| rdar://11355268
| llvm-svn: 156048
* Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. | Owen Anderson | 2012-05-02 | 1 | -0/+9
| llvm-svn: 156029
* Teach DAG combine that multiplication by 1.0 can always be constant folded. | Owen Anderson | 2012-05-02 | 1 | -0/+9
| llvm-svn: 156023
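The word "always" holds because multiplying by 1.0 returns its operand unchanged for every IEEE-754 double, including signed zeros and infinities, so no unsafe-math flag is needed. A minimal Python sketch of that check follows; the helper names are hypothetical, and NaN payload and signaling-NaN details are deliberately left out of scope.

```python
import math
import struct


def bits(x: float) -> bytes:
    """Raw IEEE-754 double encoding, so that 0.0 and -0.0 are distinguishable."""
    return struct.pack("<d", x)


def mul_by_one_is_identity(x: float) -> bool:
    """Return True if x * 1.0 yields the same value as x."""
    y = x * 1.0
    if math.isnan(x):
        return math.isnan(y)     # NaN stays NaN; payload bits are not compared
    return bits(y) == bits(x)


for value in (0.0, -0.0, 5e-324, 1.5, math.inf, -math.inf, math.nan):
    print(value, mul_by_one_is_identity(value))
```

By contrast, adding 0.0 is not an identity in the same sense, since -0.0 + 0.0 evaluates to +0.0.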
* ARM: Add missing two-operand VBIC aliases. | Jim Grosbach | 2012-05-02 | 1 | -0/+5
| llvm-svn: 156019
* Revert r155853 | Manman Ren | 2012-05-02 | 1 | -21/+0
| The commit was intended to fix rdar://10961709, but it is the root cause of PR12720. Revert it for now.
| llvm-svn: 155992
* The value held in the vector may be RAUW'ed by some of the canonicalization methods. | Bill Wendling | 2012-05-02 | 1 | -0/+50
| Use a weak value handle to keep up with this.
| PR12245
| llvm-svn: 155984
* Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures. | Richard Barton | 2012-05-02 | 4 | -29/+18
| llvm-svn: 155983
* Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter. | Craig Topper | 2012-05-02 | 1 | -0/+14
| llvm-svn: 155982
* Strip the pointer casts off of allocas so that the selection DAG can find them. | Bill Wendling | 2012-05-01 | 1 | -0/+17
| PR10799
| llvm-svn: 155954
* ARM: Add a few missing add->sub aliases w/ 'w' suffix. | Jim Grosbach | 2012-05-01 | 1 | -0/+12
| Aliases for adding a negative immediate when using an explicit 'w' suffix. E.g.,
|   adds.w r2, #-16
|   adds.w r2, r2, #-16
|   addw r2, #-16
|   addw r2, #-16
|   addw r2, r2, #-16
| rdar://11330769
| llvm-svn: 155946
* ARM: allow vanilla expressions for movw/movt. | Jim Grosbach | 2012-05-01 | 1 | -0/+5
| Expressions for movw/movt don't always have an :upper16: or :lower16: on them and that's ok. When they don't, it's just a plain [0-65535] immediate result, effectively the same as a :lower16: variant kind.
| rdar://10550147
| llvm-svn: 155941
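As background, :lower16: and :upper16: select the two 16-bit halves of a 32-bit value so that a movw/movt pair can materialize it. The Python sketch below illustrates the split and the plain-immediate case; the function names are hypothetical, and the range check reflects the 16-bit immediate field rather than any specific LLVM diagnostic.

```python
def lower16(value: int) -> int:
    """Bits 15:0 of a 32-bit value, as :lower16: would select for movw."""
    return value & 0xFFFF


def upper16(value: int) -> int:
    """Bits 31:16 of a 32-bit value, as :upper16: would select for movt."""
    return (value >> 16) & 0xFFFF


def plain_immediate(value: int) -> int:
    """An expression without a modifier must already fit the 16-bit field."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    return value


def materialize(movw_imm: int, movt_imm: int) -> int:
    """Register contents after movw (low half) followed by movt (high half)."""
    return (movt_imm << 16) | movw_imm


address = 0x12345678
print(hex(lower16(address)), hex(upper16(address)))
print(hex(materialize(lower16(address), upper16(address))))
```

A plain immediate that passes the range check is equal to its own :lower16: half, which matches the "effectively the same" wording in the message.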
* MC: Unknown assembler directives are now hard errors. | Jim Grosbach | 2012-05-01 | 2 | -3/+3
| Previously, an unsupported/unknown assembler directive issued a warning. That's generally unsafe, and inconsistent with the behaviour of pretty much every system assembler. Now that the MC assemblers are mature enough to be the default on multiple targets, it's reasonable to issue errors for these.
| For target or platform directives that need to stay warnings, we should add explicit handlers for them in, e.g., ELFAsmParser.cpp, DarwinAsmParser.cpp, et al., and issue the warning there.
| rdar://9246275
| llvm-svn: 155926
* X86: optimization for max-like struct | Manman Ren | 2012-05-01 | 1 | -0/+42
| This patch will optimize the following cases on X86
|   (a > b) ? (a-b) : 0
|   (a >= b) ? (a-b) : 0
|   (b < a) ? (a-b) : 0
|   (b <= a) ? (a-b) : 0
| FROM
|   movl %edi, %ecx
|   subl %esi, %ecx
|   cmpl %edi, %esi
|   movl $0, %eax
|   cmovll %ecx, %eax
| TO
|   xorl %eax, %eax
|   subl %esi, %edi
|   cmovll %eax, %edi
|   movl %edi, %eax
| rdar: 10734411
| llvm-svn: 155919
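The shorter sequence works because the subtraction already sets the flags that the separate compare used to produce. The Python sketch below models the signed "less than" condition (SF != OF) after a 32-bit SUB, assuming a is in %edi and b is in %esi as in the listing; all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def sub_with_flags(a: int, b: int):
    """32-bit a - b, returning (result, SF, OF) as x86 SUB would set them."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    sign_flag = result >> 31
    # Signed overflow: the operands differ in sign and the result's sign differs from a's.
    overflow_flag = ((a ^ b) & (a ^ result)) >> 31
    return result, sign_flag, overflow_flag


def max_like_new_sequence(a: int, b: int) -> int:
    """Model `xorl; subl %esi, %edi; cmovll %eax, %edi; movl %edi, %eax`."""
    eax = 0                                                 # xorl %eax, %eax
    edi, sign_flag, overflow_flag = sub_with_flags(a, b)    # subl %esi, %edi
    if sign_flag != overflow_flag:                          # cmovl: taken when a < b (signed)
        edi = eax
    return edi                                              # movl %edi, %eax


def max_like_reference(a: int, b: int) -> int:
    """(a > b) ? (a - b) : 0 on signed 32-bit inputs, as a 32-bit pattern."""
    sa, sb = to_signed(a), to_signed(b)
    return (sa - sb) & MASK32 if sa > sb else 0


for a, b in ((5, 3), (3, 5), (3, 3), (0x7FFFFFFF, 0xFFFFFFFF)):
    print(a, b, hex(max_like_new_sequence(a, b)), hex(max_like_reference(a, b)))
```

The a == b case needs no special handling, since a - b is already 0 there, which is why the >= and <= variants in the message reduce to the same code.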
* X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment | Alexey Samsonov | 2012-05-01 | 1 | -0/+42
| llvm-svn: 155917
* Regression test for PR2960. | Jay Foad | 2012-05-01 | 1 | -0/+13
| llvm-svn: 155912
* An instruction in a loop is not guaranteed to be executed just because the loop has no exit blocks. | Nick Lewycky | 2012-05-01 | 1 | -0/+22
| Fixes PR12706!
| llvm-svn: 155884
* Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes <rdar://problem/11291436>. | Lang Hames | 2012-05-01 | 1 | -0/+68
| This is a second attempt at a fix for this; the first was r155468. Thanks to Chandler, Bob and others for the feedback that helped me improve this.
| llvm-svn: 155866
* X86: optimization for -(x != 0) | Manman Ren | 2012-04-30 | 1 | -0/+21
| This patch will optimize -(x != 0) on X86
| FROM
|   cmpl $0x01,%edi
|   sbbl %eax,%eax
|   notl %eax
| TO
|   negl %edi
|   sbbl %eax,%eax
| llvm-svn: 155853
* test/CodeGen/X86/select.ll: remove spaces | Manman Ren | 2012-04-30 | 1 | -1/+1
| llvm-svn: 155840
* Fix fastcc structure return with fast-isel on x86-32 | Derek Schuff | 2012-04-30 | 1 | -0/+14
| On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However, if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention in X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall.
| (this time, actually commit what was reviewed!)
| llvm-svn: 155825
* Don't introduce illegal types when creating vmull operations. <rdar://11324364> | Bob Wilson | 2012-04-30 | 1 | -0/+74
| ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type.
| llvm-svn: 155824