summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* remove the partial specialization pass. It is unmaintained and has bugs.Chris Lattner2011-01-163-230/+0
| | | | llvm-svn: 123554
* Archive: Replace all internal uses of PathV1 with PathV2. The external API ↵Michael J. Spencer2011-01-151-36/+36
| | | | | | still uses PathV1. llvm-svn: 123551
* Add an assert so we don't silently miscompile ctpop for bit widths > 128.Benjamin Kramer2011-01-151-0/+4
| | | | llvm-svn: 123549
* Support/PathV2: Add identify_magic.Michael J. Spencer2011-01-154-40/+40
| | | | llvm-svn: 123548
* Reimplement CTPOP legalization with the "best" algorithm fromBenjamin Kramer2011-01-151-18/+45
| | | | | | | | | | | | | http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter, especially when counting 64 bit population on a 32 bit target. I hope this is fast enough to replace Kernighan-style counting loops even when the input is rather sparse. llvm-svn: 123547
* Support/PathV2: Implement has_magic in terms of get_magic.Michael J. Spencer2011-01-151-26/+8
| | | | llvm-svn: 123545
* Support/PathV2: Implement get_magic.Michael J. Spencer2011-01-153-0/+74
| | | | llvm-svn: 123544
* Add missing whitespace.Nick Lewycky2011-01-151-2/+2
| | | | llvm-svn: 123543
* Make constmerge a two-pass algorithm so that it won't miss mergingNick Lewycky2011-01-151-4/+34
| | | | | | opporuntities. Fixes PR8978. llvm-svn: 123541
* Try to unbreak selfhost.Benjamin Kramer2011-01-151-0/+1
| | | | llvm-svn: 123537
* Add a cache that protects mergefunc's internals from more surprises in DenseSet.Nick Lewycky2011-01-151-5/+27
| | | | | | Also, replace tabs with spaces. Yes, it's 2011. llvm-svn: 123535
* Teach LazyValueInfo that allocas aren't NULL. Over all of llvm-test, this savesNick Lewycky2011-01-151-5/+27
| | | | | | | | | | | half a million non-local queries, each of which would otherwise have triggered a linear scan over a basic block. Also fix a fixme for memory intrinsics which dereference pointers. With this, we prove that a pointer is non-null because it was dereferenced by an intrinsic 112 times in llvm-test. llvm-svn: 123533
* Allow unnamed_addr on declarations.Rafael Espindola2011-01-153-12/+7
| | | | llvm-svn: 123529
* temporarily revert r123526. While working on a follow-on patch IChris Lattner2011-01-151-3/+0
| | | | | | realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527
* fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient codeChris Lattner2011-01-151-0/+3
| | | | | | | | | The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526
* simplify code, no functionality change.Chris Lattner2011-01-151-30/+37
| | | | llvm-svn: 123525
* Now that instruction optzns can update the iterator as they go, we can Chris Lattner2011-01-151-10/+16
| | | | | | | | have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524
* make the current instruction iterator an ivar, allowing xforms thatChris Lattner2011-01-151-35/+38
| | | | | | | potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523
* implement an instcombine xform that canonicalizes casts outside of ↵Chris Lattner2011-01-151-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and-with-constant operations. This fixes rdar://8808586 which observed that we used to compile: union xy { struct x { _Bool b[15]; } x; __attribute__((packed)) struct y { __attribute__((packed)) unsigned long b0to7; __attribute__((packed)) unsigned int b8to11; __attribute__((packed)) unsigned short b12to13; __attribute__((packed)) unsigned char b14; } y; }; struct x foo(union xy *xy) { return xy->x; } into: _foo: ## @foo movq (%rdi), %rax movabsq $1095216660480, %rcx ## imm = 0xFF00000000 andq %rax, %rcx movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000 andq %rax, %rdx movzbl %al, %esi orq %rdx, %rsi movq %rax, %rdx andq $65280, %rdx ## imm = 0xFF00 orq %rsi, %rdx movq %rax, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rdx, %rsi movl %eax, %edx andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rdx orq %rcx, %rdx movabsq $280375465082880, %rcx ## imm = 0xFF0000000000 movq %rax, %rsi andq %rcx, %rsi orq %rdx, %rsi movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000 andq %r8, %rax orq %rsi, %rax movzwl 12(%rdi), %edx movzbl 14(%rdi), %esi shlq $16, %rsi orl %edx, %esi movq %rsi, %r9 shlq $32, %r9 movl 8(%rdi), %edx orq %r9, %rdx andq %rdx, %rcx movzbl %sil, %esi shlq $32, %rsi orq %rcx, %rsi movl %edx, %ecx andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rcx movq %rdx, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rcx, %rsi movq %rdx, %rcx andq $65280, %rcx ## imm = 0xFF00 orq %rsi, %rcx movzbl %dl, %esi orq %rcx, %rsi andq %r8, %rdx orq %rsi, %rdx ret We now compile this into: _foo: ## @foo ## BB#0: ## %entry movzwl 12(%rdi), %eax movzbl 14(%rdi), %ecx shlq $16, %rcx orl %eax, %ecx shlq $32, %rcx movl 8(%rdi), %edx orq %rcx, %rdx movq (%rdi), %rax ret A small improvement :-) llvm-svn: 123520
* one more instcombine variant that is needed to work with future changes,Chris Lattner2011-01-151-0/+9
| | | | | | no functionality change currently. llvm-svn: 123517
* fix typoChris Lattner2011-01-151-1/+1
| | | | llvm-svn: 123516
* Catch ~x < cst just like ~x < ~y, we currently handle this throughChris Lattner2011-01-151-4/+8
| | | | | | means that are about to disappear. llvm-svn: 123515
* reduce indentationChris Lattner2011-01-151-29/+29
| | | | llvm-svn: 123514
* 80-col.Eric Christopher2011-01-151-2/+4
| | | | llvm-svn: 123505
* Generalize LoadAndStorePromoter a bit and switch LICMChris Lattner2011-01-153-190/+111
| | | | | | to use it. llvm-svn: 123501
* Fix a comment.Bob Wilson2011-01-151-2/+2
| | | | llvm-svn: 123497
* Fix 80-cols.Eric Christopher2011-01-141-7/+14
| | | | llvm-svn: 123494
* Update CMake build.Ted Kremenek2011-01-141-0/+2
| | | | llvm-svn: 123491
* 'HiReg' is written but never read. Nuke itsTed Kremenek2011-01-141-5/+5
| | | | | | | | declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486
* Fix a false-positive warning.Owen Anderson2011-01-141-1/+3
| | | | llvm-svn: 123480
* Delete an assignment to ThisBB which isn't needed, and tidy up someDan Gohman2011-01-141-4/+6
| | | | | | comments. llvm-svn: 123479
* Enhance GlobalOpt to be able evaluate initializers that involve stores throughOwen Anderson2011-01-141-2/+49
| | | | | | bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp llvm-svn: 123477
* Add a possibility to switch between CFI directives- and table-based frame ↵Anton Korobeynikov2011-01-147-17/+23
| | | | | | description emission. Currently all the backends use table-based stuff. llvm-svn: 123476
* CleanupAnton Korobeynikov2011-01-141-6/+1
| | | | llvm-svn: 123475
* Add CFI directives-based frame information emission. Not hooked yet.Anton Korobeynikov2011-01-143-0/+209
| | | | llvm-svn: 123474
* Split stuff as a preparation for CFI directives-based frame information emissionAnton Korobeynikov2011-01-144-356/+440
| | | | llvm-svn: 123473
* Use common style for .cfi directivesAnton Korobeynikov2011-01-141-7/+7
| | | | llvm-svn: 123472
* Support for precise scheduling of the instruction selection DAG,Andrew Trick2011-01-141-537/+663
| | | | | | | | | | | | | | | | | | | | | | | | | disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles. Scheduling the isel DAG is inherently imprecise, but we give it a best effort: - Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need. - Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards. - Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure). - Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root. - Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue. Related Cleanup: most of the register reduction routines do not need to be templates. llvm-svn: 123468
* switch SRoA to use LoadAndStorePromoter instead of its own copy of the code.Chris Lattner2011-01-141-136/+26
| | | | llvm-svn: 123457
* Add a new LoadAndStorePromoter class, which implements the generalChris Lattner2011-01-141-0/+154
| | | | | | | "promote a bunch of load and stores" logic, allowing the code to be shared and reused. llvm-svn: 123456
* Turn X-(X-Y) into Y. According to my auto-simplifier this is the most commonDuncan Sands2011-01-141-1/+15
| | | | | | | | | simplification present in fully optimized code (I think instcombine fails to transform some of these when "X-Y" has more than one use). Fires here and there all over the test-suite, for example it eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc. llvm-svn: 123442
* Factorize common code out of the InstructionSimplify shift logic. Add inDuncan Sands2011-01-141-62/+38
| | | | | | | | | | | threading of shifts over selects and phis while there. This fires here and there in the testsuite, to not much effect. For example when compiling spirit it fires 5 times, during early-cse, resulting in 6 more cse simplifications, and 3 more terminators being folded by jump threading, but the final bitcode doesn't change in any interesting way: other optimizations would have caught the opportunity anyway, only later. llvm-svn: 123441
* split SROA into two passes: one that uses DomFrontiers (-scalarrepl) Chris Lattner2011-01-142-27/+57
| | | | | | and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436
* Remove casts between Value** and Constant**, which won't work if aJay Foad2011-01-144-31/+67
| | | | | | | static_cast from Constant* to Value* has to adjust the "this" pointer. This is groundwork for PR889. llvm-svn: 123435
* Implement full support for promoting allocas to registers using SSAUpdaterChris Lattner2011-01-141-5/+162
| | | | | | | | | | | instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434
* Try for the third time to teach getFirstTerminator() about debug values.Jakob Stoklund Olesen2011-01-142-4/+11
| | | | | | This time let's rephrase to trick gcc-4.3 into not miscompiling. llvm-svn: 123432
* revert my fastisel patch again which apparently still gives theChris Lattner2011-01-141-1/+1
| | | | | | llvm-gcc-i386-linux-selfhost buildbot heartburn... llvm-svn: 123431
* reapply r123414 now that the botz are calmed down and the fix is already in.Chris Lattner2011-01-141-1/+1
| | | | llvm-svn: 123427
* indentationChris Lattner2011-01-141-1/+1
| | | | llvm-svn: 123426
* Completed :lower16: / :upper16: support for movw / movt pairs on Darwin.Evan Cheng2011-01-145-37/+212
| | | | | | | | - Fixed :upper16: fix up routine. It should be shifting down the top 16 bits first. - Added support for Thumb2 :lower16: and :upper16: fix up. - Added :upper16: and :lower16: relocation support to mach-o object writer. llvm-svn: 123424
OpenPOWER on IntegriCloud