summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* X86: optimization for -(x != 0)Manman Ren2012-05-072-0/+22
| | | | | | | | | | | | | | | | | This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td: def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>; rdar: 10961709 llvm-svn: 156312
* Add support for the 'x' constraint.Eric Christopher2012-05-071-1/+8
| | | | | | Patch by Jack Carter. llvm-svn: 156295
* Add support for the 'l' constraint.Eric Christopher2012-05-071-0/+7
| | | | | | Patch by Jack Carter. llvm-svn: 156294
* Add support for the 'c' constraint.Eric Christopher2012-05-071-1/+13
| | | | | | Patch by Jack Carter. llvm-svn: 156293
* Add support for the 'P' constraint.Eric Christopher2012-05-071-0/+11
| | | | | | Patch by Jack Carter. llvm-svn: 156292
* Fix some issues in the f16c instructions.Craig Topper2012-05-071-11/+9
| | | | llvm-svn: 156287
* Add support for the 'O' constraint.Eric Christopher2012-05-071-0/+11
| | | | | | Patch by Jack Carter. llvm-svn: 156285
* Add support for the 'N' inline asm constraint.Eric Christopher2012-05-071-0/+11
| | | | | | Patch by Jack Carter. llvm-svn: 156284
* Add support for the 'L' inline asm constraint.Eric Christopher2012-05-071-0/+11
| | | | | | Patch by Jack Carter. llvm-svn: 156283
* Add support for the inline asm constraint 'K'.Eric Christopher2012-05-071-0/+11
| | | | llvm-svn: 156282
* Add SSE4A MOVNTSS/MOVNTSD instructions.Craig Topper2012-05-071-0/+14
| | | | llvm-svn: 156281
* Support the 'J' constraint.Eric Christopher2012-05-071-0/+11
| | | | | | Patch by Jack Carter. llvm-svn: 156280
* Add support for the 'I' inline asm constraint. Also add testsEric Christopher2012-05-072-0/+48
| | | | | | | | from the previous 2 patches. Patch by Jack Carter. llvm-svn: 156279
* Allow 64 bit integer values in gpu registers if arch and abi are 64 bit.Eric Christopher2012-05-071-2/+4
| | | | | | Patch by Jack Carter. llvm-svn: 156278
* When using inline asm constraints representingEric Christopher2012-05-071-1/+1
| | | | | | | | | non-floating point general registers allow 8 and 16-bit elements. Patch by Jack Carter. llvm-svn: 156277
* Use MVT instead of EVT as the argument to all the shuffle decode functions. ↵Craig Topper2012-05-063-35/+33
| | | | | | Simplify some of the decode functions. llvm-svn: 156268
* Add VPERMQ/VPERMPD to the list of target specific shuffles that can be ↵Craig Topper2012-05-061-0/+6
| | | | | | looked through for DAG combine purposes. llvm-svn: 156266
* Add shuffle decode support for VPERMQ/VPERMPD.Craig Topper2012-05-063-0/+22
| | | | llvm-svn: 156265
* make SourceMgr tolerate empty SMLoc()'s better.Chris Lattner2012-05-061-45/+54
| | | | llvm-svn: 156260
* Switch the select to branch transformation on by default.Benjamin Kramer2012-05-061-3/+4
| | | | | | | | | The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258
* Remove trailing spaces.Jakub Staszak2012-05-061-60/+60
| | | | llvm-svn: 156257
* Unix/Process.inc: Give more useful random seed to srand. Workaround for PR12743.NAKAMURA Takumi2012-05-061-1/+14
| | | | llvm-svn: 156252
* Support/Process: Move llvm::sys::Process::GetRandomNumber() from Process.cpp ↵NAKAMURA Takumi2012-05-062-10/+9
| | | | | | | to Unix/Process.inc. FIXME: GetRandomNumber() is not implemented in Win32. llvm-svn: 156251
* reapply my patch, with a fix for an off-by-one error. Turned out to be a lotChris Lattner2012-05-051-12/+18
| | | | | | of work for a drive-by fix :) llvm-svn: 156246
* revert my patches, which are causing problems.Chris Lattner2012-05-051-18/+12
| | | | llvm-svn: 156245
* refactor some code to expose column numbers more and make diagnostic ↵Chris Lattner2012-05-051-12/+18
| | | | | | printing slightly more efficient. llvm-svn: 156243
* Nuke a few dead remnants of the CBE.Jim Grosbach2012-05-053-46/+0
| | | | llvm-svn: 156241
* [Support] Add missing include.Daniel Dunbar2012-05-051-1/+2
| | | | llvm-svn: 156240
* [Support] Fix up comments.Daniel Dunbar2012-05-051-5/+3
| | | | llvm-svn: 156239
* [Support] Rewrite sys::fs::unique_file to not be stupid with /dev/urandom.Daniel Dunbar2012-05-051-19/+5
| | | | | | | | - Just use sys::Process::GetRandomNumber instead of having two poor implementations. - This is ~70 times (!) faster on my OS X machine. llvm-svn: 156238
* [Support] Add sys::Process::GetRandomNumber().Daniel Dunbar2012-05-051-0/+9
| | | | | | - Primitive API, but we rarely have need for random numbers. llvm-svn: 156237
* CodeGenPrepare: Add a transform to turn selects into branches in some cases.Benjamin Kramer2012-05-051-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic, it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-Order architectures are unlikely to benefit as well, those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234
* Add a new target hook "predictableSelectIsExpensive".Benjamin Kramer2012-05-053-0/+7
| | | | | | | | | | | This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted. Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM. I'm not entirely happy with the name of this flag, suggestions welcome ;) llvm-svn: 156233
* NVPTX: Initialize the UseF32FTZ flag.Benjamin Kramer2012-05-051-0/+2
| | | | llvm-svn: 156232
* Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for ↵Stepan Dyatkovskiy2012-05-051-1/+1
| | | | | | | | case when alloca's size is calculated within the "add/sub/... nsw". Also added fix to 2011-06-13-nsw-alloca.ll test. llvm-svn: 156231
* Typo.Eric Christopher2012-05-051-1/+1
| | | | llvm-svn: 156226
* Make sure findRepresentativeClass picks the widest super-register.Jakob Stoklund Olesen2012-05-041-6/+10
| | | | | | | | We want the representative register class to contain the largest super-registers available. This makes the function less sensitive to the register class numbering. llvm-svn: 156220
* Remove extra comma in debug output.Jakob Stoklund Olesen2012-05-041-1/+1
| | | | llvm-svn: 156219
* Fix warnings in release build.David Blaikie2012-05-042-1/+2
| | | | | | | | | This fixes a couple of Clang warnings in release builds of LLVM: * Missing return in ISelLowering * Unused variable in NVPTXutil.cpp llvm-svn: 156216
* Tweak to the fix in r156212, as with the change in removing the shift theKevin Enderby2012-05-041-1/+1
| | | | | | SignExtend32<22>(Val<<1) also needs to change to SignExtend32<21>(Val) . llvm-svn: 156213
* Fix a bug in the ARM disassembler for wide branch conditional instructionsKevin Enderby2012-05-041-1/+1
| | | | | | | where the symbolic operand's displacement was incorrectly shifted left by 1. rdar://11387046 llvm-svn: 156212
* Fix a Clang warning in the new NVPTX backend:Chandler Carruth2012-05-041-1/+1
| | | | | | | | | | | | | | In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53: ../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default] default: assert(0 && "Unknown condition code"); ^ 1 warning generated. The prevailing pattern in LLVM is to not use a default label, and instead to use llvm_unreachable to denote that the switch in fact covers all return paths from the function. llvm-svn: 156209
* Teach the code extractor how to extract a sequence of blocks fromChandler Carruth2012-05-041-7/+32
| | | | | | | RegionInfo's RegionNode. This mirrors the logic for automating the extraction from a Loop. llvm-svn: 156208
* Rename the Region::block_iterator to Region::block_node_iterator, andChandler Carruth2012-05-043-12/+29
| | | | | | | | | | | | | | | | | | | | | | | | add a new Region::block_iterator which actually iterates over the basic blocks of the region. The old iterator, now call 'block_node_iterator' iterates over RegionNodes which contain a single basic block. This works well with the GraphTraits-based iterator design, however most users actually want an iterator over the BasicBlocks inside these RegionNodes. Now the 'block_iterator' is a wrapper which exposes exactly this interface. Internally it uses the block_node_iterator to walk all nodes which are single basic blocks, but transparently unwraps the basic block to make user code simpler. While this patch is a bit of a wash, most of the updates are to internal users, not external users of the RegionInfo. I have an accompanying patch to Polly that is a strict simplification of every user of this interface, and I'm working on a pass that also wants the same simplified interface. This patch alone should have no functional impact. llvm-svn: 156202
* This patch adds a new NVPTX back-end to LLVM which supports code generation ↵Justin Holewinski2012-05-0461-1/+23002
| | | | | | | | | | | | | | | | | for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX nvptx64 (old ptx64) => 64-bit PTX The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides. NV_CONTRIB llvm-svn: 156196
* Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits ↵Sebastian Pop2012-05-041-0/+1
| | | | | | 16-bits encoding of CMN instructions. llvm-svn: 156195
* Adds Intel Atom scheduling latencies to X86InstrSystem.td.Preston Gurd2012-05-043-139/+272
| | | | llvm-svn: 156194
* Pacify GCC's -Wreturn-typeMatt Beaumont-Gay2012-05-041-0/+1
| | | | llvm-svn: 156189
* Factor the computation of input and output sets into a public interfaceChandler Carruth2012-05-041-35/+34
| | | | | | | | | | | | | | | | of the CodeExtractor utility. This allows speculatively computing input and output sets to measure the likely size impact of the code extraction. These sets cannot be reused sadly -- we mutate the function prior to forming the final sets used by the actual extraction. The interface has been revamped slightly to make it easier to use correctly by making the interface const and sinking the computation of the number of exit blocks into the full extraction function and away from the rest of this logic which just computed two output parameters. llvm-svn: 156168
* Rather than trying to gracefully handle input sequences with repeatedChandler Carruth2012-05-041-1/+1
| | | | | | | blocks, assert that this doesn't happen. We don't want to bother trying to support this call pattern as it isn't necessary. llvm-svn: 156167
OpenPOWER on IntegriCloud