bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Several fixes and enhancements to the PPC32 backend.	Nate Begeman	2004-10-07	3	-147/+154
\| \| \| \|	1. Fix an illegal argument to getClassB when deciding whether or not to
	   sign extend a byte load.
	2. Initial addition of isLoad and isStore flags to the instruction .td file
	   for eventual use in a scheduler.
	3. Rewrite of how constants are handled in emitSimpleBinaryOperation so that
	   we can emit the PowerPC shifted immediate instructions far more often.

	This allows us to emit the following code:

	    int foo(int x) { return x | 0x00F0000; }

	    _foo:
	    .LBB_foo_0:    ; entry
	        ; IMPLICIT_DEF
	        oris r3, r3, 15
	        blr

	llvm-svn: 16826
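Item 3 above depends on recognizing when a 32-bit constant fits one of the 16-bit immediate forms. The following is a minimal C sketch of that classification, not the backend's actual code; the names `imm_kind`, `classify_logical_imm`, and `shifted_field` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical classification of a 32-bit constant used as the right-hand
   side of a logical OR. ori and oris each carry a 16-bit unsigned field. */
enum imm_kind {
    IMM_LOW16,      /* upper halfword is zero: a single ori suffices  */
    IMM_SHIFTED16,  /* lower halfword is zero: a single oris suffices */
    IMM_NEEDS_TWO   /* both halfwords are non-zero: oris plus ori     */
};

enum imm_kind classify_logical_imm(uint32_t c)
{
    if ((c >> 16) == 0)
        return IMM_LOW16;
    if ((c & 0xFFFFu) == 0)
        return IMM_SHIFTED16;
    return IMM_NEEDS_TWO;
}

/* The 16-bit field that the shifted form (oris) would encode. */
uint16_t shifted_field(uint32_t c)
{
    return (uint16_t)(c >> 16);
}
```

For the constant in the commit message, `classify_logical_imm(0x00F0000)` selects the shifted form and `shifted_field` yields 15, which matches the `oris r3, r3, 15` shown above.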
*	Add ori reg, reg, 0 as a move instruction. This can be generated from	Nate Begeman	2004-10-07	1	-0/+11
\| \| \| \|	loading a 32bit constant into a register whose low halfword is all zeroes.
	We now omit the ori after the lis for the following C code:

	    int bar(int y) { return y * 0x00F0000; }

	    _bar:
	    .LBB_bar_0:    ; entry
	        ; IMPLICIT_DEF
	        lis r2, 15
	        mullw r3, r3, r2
	        blr

	llvm-svn: 16825
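The decision to drop the trailing `ori` can be sketched as below. This is only an illustration under assumed names (`const_load`, `plan_const_load` are hypothetical), and it ignores the separate case where a small constant fits a single `li`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical plan for materializing a 32-bit constant with lis + ori. */
struct const_load {
    uint16_t hi;       /* immediate for lis (upper halfword)         */
    uint16_t lo;       /* immediate for ori (lower halfword)         */
    bool     need_ori; /* false when the lower halfword is all zero  */
};

struct const_load plan_const_load(uint32_t c)
{
    struct const_load p;
    p.hi = (uint16_t)(c >> 16);
    p.lo = (uint16_t)(c & 0xFFFFu);
    /* "ori rD, rD, 0" would only copy the register, so it can be omitted. */
    p.need_ori = (p.lo != 0);
    return p;
}
```

With `0x00F0000` the plan has `hi == 15` and `need_ori == false`, which corresponds to the lone `lis r2, 15` in the output above.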
*	Remove unnecessary header include	Nate Begeman	2004-10-07	1	-1/+0
\| \| \| \|	llvm-svn: 16824
*	Improve comments, no functionality changes	Chris Lattner	2004-10-07	1	-18/+53
\| \| \| \|	llvm-svn: 16814
*	Fix a nasty dangling pointer problem, due to a free'd pointer being left in	Chris Lattner	2004-10-07	1	-0/+3
\| \| \| \| \| \| \|	a map. This caused problems if a later object happened to be allocated at the free'd object's address. llvm-svn: 16813
*	Unfortunately the fix for the previous bug introduced the previous	Chris Lattner	2004-10-07	1	-41/+67
\| \| \| \| \| \| \| \| \| \|	exponential behavior (bork!). This patch processes stuff with an explicit SCC finder, allowing the algorithm to be more clear, efficient, and also (as a bonus) correct! This gets us back to taking 0.6s to disassemble my horrible .bc file that previously took something > 30 mins. llvm-svn: 16811
*	Fix a bug in my previous change. Unfortunately this reverts most of the	Chris Lattner	2004-10-07	1	-3/+4
\| \| \| \| \| \|	speedup, but has the advantage of not breaking a bunch of programs! llvm-svn: 16806
*	Fix a bug in the safety analysis routine	Chris Lattner	2004-10-07	1	-3/+3
\| \| \| \|	llvm-svn: 16804
*	Comment cleanups	Chris Lattner	2004-10-07	1	-4/+1
\| \| \| \|	llvm-svn: 16803
*	* Rename pass to globalopt, since we do more than just constify	Chris Lattner	2004-10-07	2	-147/+250
\| \| \| \|	* Instead of handling dead functions specially, just nuke them.
	* Be more aggressive about cleaning up after constification, in particular,
	  handle getelementptr instructions and constantexprs.
	* Be a little bit more structured about how we process globals.

	*** Delete globals that are only stored to, and never read. These are
	    clearly not useful, so they should go. This implements deadglobal.llx

	This last one triggers quite a few times. In particular, 2208 in the
	external tests, 1865 of which are in 252.eon. This shrinks eon from
	1995094 to 1732341 bytes of bytecode.

	llvm-svn: 16802
*	Implement GlobalConstifier/trivialstore.llx, and also do some	Chris Lattner	2004-10-06	1	-3/+52
\| \| \| \|	simplifications of the resultant program to avoid making later passes do it
	all. This allows us to constify globals that just have the same constant that
	they are initialized with stored into them.

	Surprisingly this comes up ALL of the freaking time, dozens of times in SPEC,
	30 times in vortex alone.

	For example, on 256.bzip2, it allows us to constify these two globals:

	    %smallMode = internal global ubyte 0    ; <ubyte> [#uses=8]
	    %verbosity = internal global int 0      ; <int> [#uses=49]

	Which (with later optimizations) results in the bytecode file shrinking from
	82286 to 69686 bytes! Let's hear it for IPO :)

	For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }"
	code.

	llvm-svn: 16793
*	Don't let null nodes sneak past cast instructions	Chris Lattner	2004-10-06	1	-1/+4
\| \| \| \|	llvm-svn: 16779
*	Change Type::isAbstract to have better comments, a more correct name	Chris Lattner	2004-10-06	1	-12/+19
\| \| \| \| \| \| \| \| \| \| \| \|	(PromoteAbstractToConcrete), and to use a set to avoid recomputation. In particular, this set eliminates the potentially exponential cases from this little recursive algorithm. On a particularly nasty testcase, llvm-dis on the .bc file went from 34 minutes (which is when I killed it, it still hadn't finished) to 0.57s. Remember kids, exponential algorithms are bad. llvm-svn: 16772
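Why a set removes the exponential case can be illustrated with a small stand-alone C sketch. It is not the code from this commit: `type_node`, `walk_naive`, `walk_once`, and `build_diamond_chain` are hypothetical names, and a per-node flag stands in for set membership.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node in a type graph with up to two contained types. */
struct type_node {
    struct type_node *elem[2];
    bool visited;  /* stands in for membership in an "already processed" set */
};

/* Naive recursion: shared subgraphs are walked again for every path that
   reaches them, so a chain of n "diamonds" costs on the order of 2^n. */
unsigned long walk_naive(const struct type_node *t)
{
    unsigned long visits = 1;
    for (int i = 0; i < 2; i++)
        if (t->elem[i])
            visits += walk_naive(t->elem[i]);
    return visits;
}

/* Recursion with a visited set: every node is processed exactly once. */
unsigned long walk_once(struct type_node *t)
{
    if (t->visited)
        return 0;
    t->visited = true;
    unsigned long visits = 1;
    for (int i = 0; i < 2; i++)
        if (t->elem[i])
            visits += walk_once(t->elem[i]);
    return visits;
}

/* Build a chain in which both slots of node i refer to node i + 1. */
void build_diamond_chain(struct type_node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct type_node *next = (i + 1 < n) ? &nodes[i + 1] : NULL;
        nodes[i].elem[0] = next;
        nodes[i].elem[1] = next;
        nodes[i].visited = false;
    }
}
```

On a chain of 21 nodes the naive walk performs 2^21 - 1 visits while the version with the visited flag performs 21, which is the same kind of gap as the one reported above.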
*	Correct some typos	Chris Lattner	2004-10-06	1	-3/+3
\| \| \| \|	llvm-svn: 16770
*	Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16	Chris Lattner	2004-10-06	1	-0/+8
\| \| \| \|	llvm-svn: 16769
*	Remove debugging code, fix encoding problem. This fixes the problems	Chris Lattner	2004-10-06	2	-3/+2
\| \| \| \| \| \|	the JIT had last night. llvm-svn: 16766
*	Turning on fsel code gen now that we can do so would be good.	Nate Begeman	2004-10-06	1	-11/+10
\| \| \| \|	llvm-svn: 16765
*	Implement floating point select for lt, gt, le, ge using the powerpc fsel	Nate Begeman	2004-10-06	1	-25/+113
\| \| \| \|	instruction. Now, rather than emitting the following loop out of bisect:

	    .LBB_main_19:    ; no_exit.0.i
	        rlwinm r3, r2, 3, 0, 28
	        lfdx f1, r3, r27
	        addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	        lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	        fsub f2, f2, f1
	        addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	        lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	        fcmpu cr0, f1, f4
	        bge .LBB_main_64    ; no_exit.0.i
	    .LBB_main_63:    ; no_exit.0.i
	        b .LBB_main_65    ; no_exit.0.i
	    .LBB_main_64:    ; no_exit.0.i
	        fmr f2, f1
	    .LBB_main_65:    ; no_exit.0.i
	        addi r3, r2, 1
	        rlwinm r3, r3, 3, 0, 28
	        lfdx f1, r3, r27
	        addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	        lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	        fsub f4, f4, f1
	        addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	        lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3)
	        fcmpu cr0, f1, f5
	        bge .LBB_main_67    ; no_exit.0.i
	    .LBB_main_66:    ; no_exit.0.i
	        b .LBB_main_68    ; no_exit.0.i
	    .LBB_main_67:    ; no_exit.0.i
	        fmr f4, f1
	    .LBB_main_68:    ; no_exit.0.i
	        fadd f1, f2, f4
	        addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	        lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	        fmul f1, f1, f2
	        rlwinm r3, r2, 3, 0, 28
	        lfdx f2, r3, r28
	        fadd f4, f2, f1
	        fcmpu cr0, f4, f0
	        bgt .LBB_main_70    ; no_exit.0.i
	    .LBB_main_69:    ; no_exit.0.i
	        b .LBB_main_71    ; no_exit.0.i
	    .LBB_main_70:    ; no_exit.0.i
	        fmr f0, f4
	    .LBB_main_71:    ; no_exit.0.i
	        fsub f1, f2, f1
	        addi r2, r2, -1
	        fcmpu cr0, f1, f3
	        blt .LBB_main_73    ; no_exit.0.i
	    .LBB_main_72:    ; no_exit.0.i
	        b .LBB_main_74    ; no_exit.0.i
	    .LBB_main_73:    ; no_exit.0.i
	        fmr f3, f1
	    .LBB_main_74:    ; no_exit.0.i
	        cmpwi cr0, r2, -1
	        fmr f16, f0
	        fmr f17, f3
	        bgt .LBB_main_19    ; no_exit.0.i

	We emit this instead:

	    .LBB_main_19:    ; no_exit.0.i
	        rlwinm r3, r2, 3, 0, 28
	        lfdx f1, r3, r27
	        addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	        lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	        fsub f2, f2, f1
	        fsel f1, f1, f1, f2
	        addi r3, r2, 1
	        rlwinm r3, r3, 3, 0, 28
	        lfdx f2, r3, r27
	        addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	        lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	        fsub f4, f4, f2
	        fsel f2, f2, f2, f4
	        fadd f1, f1, f2
	        addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	        lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	        fmul f1, f1, f2
	        rlwinm r3, r2, 3, 0, 28
	        lfdx f2, r3, r28
	        fadd f4, f2, f1
	        fsub f5, f0, f4
	        fsel f0, f5, f0, f4
	        fsub f1, f2, f1
	        addi r2, r2, -1
	        fsub f2, f1, f3
	        fsel f3, f2, f3, f1
	        cmpwi cr0, r2, -1
	        fmr f16, f0
	        fmr f17, f3
	        bgt .LBB_main_19    ; no_exit.0.i

	llvm-svn: 16764
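The way a compare-and-branch turns into a subtract plus `fsel` can be modelled in C. This is a sketch of the idea rather than the backend's lowering code: `fsel` below models the instruction (result is the second operand when the first is greater than or equal to zero, otherwise the third), and the `select_*` helpers are hypothetical names. The rewrite is only equivalent when no operand is a NaN and the subtraction does not produce one.

```c
/* Model of PowerPC fsel fD, fA, fC, fB: fD = (fA >= 0.0) ? fC : fB.
   A NaN in fA fails the comparison and therefore selects fB. */
double fsel(double a, double c, double b)
{
    return (a >= 0.0) ? c : b;
}

/* (x >= y) ? t : f, expressed as a subtraction feeding fsel. */
double select_ge(double x, double y, double t, double f)
{
    return fsel(x - y, t, f);
}

/* (x <= y) ? t : f is the same test with the operands swapped. */
double select_le(double x, double y, double t, double f)
{
    return fsel(y - x, t, f);
}

/* (x > y) ? t : f is the negation of (y >= x), so the results swap. */
double select_gt(double x, double y, double t, double f)
{
    return fsel(y - x, f, t);
}

/* (x < y) ? t : f is the negation of (x >= y), so the results swap. */
double select_lt(double x, double y, double t, double f)
{
    return fsel(x - y, f, t);
}
```

As a check against the listing, `fsub f5, f0, f4` followed by `fsel f0, f5, f0, f4` corresponds to `select_gt(f4, f0, f4, f0)`, the "if f4 > f0 then f0 = f4" of the original branchy loop.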
*	Codegen signed mod by 2 or -2 more efficiently. Instead of generating:	Chris Lattner	2004-10-06	1	-3/+39
\| \| \| \|	    t:
	        mov %EDX, DWORD PTR [%ESP + 4]
	        mov %ECX, 2
	        mov %EAX, %EDX
	        sar %EDX, 31
	        idiv %ECX
	        mov %EAX, %EDX
	        ret

	Generate:

	    t:
	        mov %ECX, DWORD PTR [%ESP + 4]
	    *   mov %EAX, %ECX
	        cdq
	        and %ECX, 1
	        xor %ECX, %EDX
	        sub %ECX, %EDX
	    *   mov %EAX, %ECX
	        ret

	Note that the two marked moves are redundant, and should be eliminated by the
	register allocator, but aren't.

	Compare this to GCC, which generates:

	    t:
	        mov %eax, DWORD PTR [%esp+4]
	        mov %edx, %eax
	        shr %edx, 31
	        lea %ecx, [%edx+%eax]
	        and %ecx, -2
	        sub %eax, %ecx
	        ret

	or ICC 8.0, which generates:

	    t:
	        movl 4(%esp), %ecx         #3.5
	        movl $-2147483647, %eax    #3.25
	        imull %ecx                 #3.25
	        movl %ecx, %eax            #3.25
	        sarl $31, %eax             #3.25
	        addl %ecx, %edx            #3.25
	        subl %edx, %eax            #3.25
	        addl %eax, %eax            #3.25
	        negl %eax                  #3.25
	        subl %eax, %ecx            #3.25
	        movl %ecx, %eax            #3.25
	        ret                        #3.25

	We would be in great shape if not for the moves.

	llvm-svn: 16763
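The `cdq` / `and` / `xor` / `sub` sequence can be read as the following C function. This is a sketch with a hypothetical name (`srem2`), written to mirror the generated instructions one for one.

```c
#include <stdint.h>

/* Signed remainder by 2 (or -2, since the result takes the dividend's sign),
   mirroring the cdq / and / xor / sub sequence. */
int32_t srem2(int32_t x)
{
    /* cdq: EDX becomes 0 for non-negative x and -1 for negative x. */
    int32_t sign = (x < 0) ? -1 : 0;
    /* and: keep the low bit; xor and sub: negate it when x is negative. */
    return ((x & 1) ^ sign) - sign;
}
```

XOR with -1 followed by subtracting -1 is a two's complement negation, so an odd negative input yields -1 and every even input yields 0, matching C's truncating `%`.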
*	Really fix FreeBSD, which apparently doesn't tolerate the extern.	Chris Lattner	2004-10-06	1	-2/+3
\| \| \| \| \| \|	Thanks to Jeff Cohen for pointing out my goof. llvm-svn: 16762
*	Fix a scary bug with signed division by a power of two. We used to generate:	Chris Lattner	2004-10-06	1	-6/+3
\| \| \| \|	    s:    ;; X / 4
	        mov %EAX, DWORD PTR [%ESP + 4]
	        mov %ECX, %EAX
	        sar %ECX, 1
	        shr %ECX, 30
	        mov %EDX, %EAX
	        add %EDX, %ECX
	        sar %EAX, 2
	        ret

	When we really meant:

	    s:
	        mov %EAX, DWORD PTR [%ESP + 4]
	        mov %ECX, %EAX
	        sar %ECX, 1
	        shr %ECX, 30
	        add %EAX, %ECX
	        sar %EAX, 2
	        ret

	Hey, this also reduces register pressure too :)

	llvm-svn: 16761
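The difference between the two sequences is whether the bias is added to the value that is finally shifted. Below is a C sketch of both behaviours, with hypothetical names (`sdiv_pow2`, `sdiv_pow2_buggy`). It assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is implementation-defined in ISO C but holds for GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Signed division by 2^k, rounding toward zero.
   Assumption: >> on a negative int32_t is an arithmetic shift. */
int32_t sdiv_pow2(int32_t x, unsigned k)
{
    assert(k >= 1 && k <= 30);
    /* sar by k-1, then shr by 32-k: 2^k - 1 for negative x, 0 otherwise. */
    uint32_t bias = (uint32_t)(x >> (k - 1)) >> (32 - k);
    /* Add the bias to the dividend itself, then sar by k. */
    return (x + (int32_t)bias) >> k;
}

/* The pre-fix behaviour: the bias went into a scratch register and the
   unbiased dividend was shifted, which rounds toward negative infinity. */
int32_t sdiv_pow2_buggy(int32_t x, unsigned k)
{
    assert(k >= 1 && k <= 30);
    return x >> k;
}
```

For example, `-7 / 4` is -1 in C; the fixed sequence gives -1, while the buggy one gives -2.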
*	Codegen signed divides by 2 and -2 more efficiently. In particular	Chris Lattner	2004-10-06	1	-0/+22
\| \| \| \|	instead of:

	    s:    ;; X / 2
	        movl 4(%esp), %eax
	        movl %eax, %ecx
	        shrl $31, %ecx
	        movl %eax, %edx
	        addl %ecx, %edx
	        sarl $1, %eax
	        ret

	    t:    ;; X / -2
	        movl 4(%esp), %eax
	        movl %eax, %ecx
	        shrl $31, %ecx
	        movl %eax, %edx
	        addl %ecx, %edx
	        sarl $1, %eax
	        negl %eax
	        ret

	Emit:

	    s:
	        movl 4(%esp), %eax
	        cmpl $-2147483648, %eax
	        sbbl $-1, %eax
	        sarl $1, %eax
	        ret

	    t:
	        movl 4(%esp), %eax
	        cmpl $-2147483648, %eax
	        sbbl $-1, %eax
	        sarl $1, %eax
	        negl %eax
	        ret

	llvm-svn: 16760
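The `cmpl` / `sbbl` pair adds 1 to negative inputs only, using the carry flag in place of an explicit sign extraction. Here is a C model of the emitted sequences, with hypothetical names (`sdiv2`, `sdiv_neg2`). It assumes the usual two's complement conversion from `uint32_t` to `int32_t` and an arithmetic `>>` on negative values, both of which hold for GCC and Clang.

```c
#include <stdint.h>

/* Signed division by 2, rounding toward zero. */
int32_t sdiv2(int32_t x)
{
    /* cmpl $-2147483648, %eax: the carry flag is set when x, viewed as
       unsigned, is below 0x80000000, that is, when x is non-negative. */
    uint32_t carry = ((uint32_t)x < 0x80000000u) ? 1u : 0u;
    /* sbbl $-1, %eax: x - (-1) - carry, so x + 1 if negative, else x. */
    uint32_t adjusted = (uint32_t)x + 1u - carry;
    /* sarl $1, %eax */
    return (int32_t)adjusted >> 1;
}

/* Signed division by -2: the same sequence followed by negl. */
int32_t sdiv_neg2(int32_t x)
{
    return -sdiv2(x);
}
```

The adjustment is done in unsigned arithmetic so that `INT32_MAX` does not overflow in the intermediate `x + 1`.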
*	Add some new instructions. Fix the asm string for sbb32rr	Chris Lattner	2004-10-06	3	-3/+24
\| \| \| \|	llvm-svn: 16759
*	Reduce code growth implied by the tail duplication pass by not duplicating	Chris Lattner	2004-10-06	1	-0/+75
\| \| \| \| \| \| \|	an instruction if it can be hoisted to a common dominator of the block. This implements: test/Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 16758
*	FreeBSD uses GCC. Patch contributed by Jeff Cohen!	Chris Lattner	2004-10-06	1	-3/+2
\| \| \| \|	llvm-svn: 16756
*	Must include sys/stat.h before declaring a 'struct stat'	Brian Gaeke	2004-10-05	1	-0/+1
\| \| \| \|	llvm-svn: 16728
*	Make sure the const bit gets inherited correctly when linking declarations	Chris Lattner	2004-10-05	1	-1/+15
\| \| \| \| \| \| \|	of disagreeing constness. This fixes test/Regression/Linker/ConstantGlobals[123].ll llvm-svn: 16692
*	Adjust sys/stat.h inclusion so it's only for SunOS.	Reid Spencer	2004-10-05	2	-1/+1
\| \| \| \|	llvm-svn: 16686
*	Added a couple of includes to get this to compile on Sparc.	Tanya Lattner	2004-10-05	2	-1/+2
\| \| \| \|	llvm-svn: 16685
*	Solaris doesn't have MAP_FILE.	Chris Lattner	2004-10-05	1	-1/+4
\| \| \| \|	llvm-svn: 16682
*	Excise the ill-advised RLCOMP compression algorithm and simply leave the	Reid Spencer	2004-10-04	1	-159/+20
\| \| \| \| \| \| \| \|	previously temporary NULLCOMP implementation that merely copies the data verbatim without compression. Also, don't warn if there's no compression library as that is taken care of during configuration time. llvm-svn: 16654
*	Add a context for the callback so different compression scenarios can be	Reid Spencer	2004-10-04	1	-18/+18
\| \| \| \| \| \|	distinguished. Tidy up documentation. Thanks, Chris. llvm-svn: 16652
*	Fix build if not HAVE_BZIP2	Chris Lattner	2004-10-04	1	-1/+1
\| \| \| \|	llvm-svn: 16650
*	First version of the MappedFile abstraction for operating system independent	Reid Spencer	2004-10-04	10	-0/+330
\| \| \| \| \| \| \| \|	mapping of files. This first version uses mmap where it's available. The class needs to implement an alternate mechanism based on malloc'd memory and file reading/writing for platforms without virtual memory. llvm-svn: 16649
*	First version of a support utility to provide generalized compression in	Reid Spencer	2004-10-04	1	-0/+526
\| \| \| \| \| \|	LLVM that handles availability and unavailability of bzip2 and zlib. llvm-svn: 16648
*	* Prune #includes	Chris Lattner	2004-10-04	1	-101/+27
\| \| \| \|	* Update comments
	* Rearrange code a bit
	* Finally ELIMINATE the GAS workaround emitter for Intel mode. woot!

	llvm-svn: 16647
*	Add support for emitting AT&T style .s files, and make it the default. Users	Chris Lattner	2004-10-04	1	-128/+307
\| \| \| \| \| \|	may now choose their output format with the -x86-asm-syntax={intel\|att} flag. llvm-svn: 16646
*	Convert some missed patterns to support AT&T style	Chris Lattner	2004-10-04	1	-8/+8
\| \| \| \|	llvm-svn: 16645
*	Apparently the GNU assembler has a HUGE hack to be compatible with really	Chris Lattner	2004-10-04	1	-9/+12
\| \| \| \| \| \| \| \| \|	old and broken AT&T syntax assemblers. The problem with this hack is that SOME forms of the fdiv and fsub instructions have the 'r' bit inverted. This was a real pain to figure out, but is trivially easy to support: thus we are now bug compatible with gas and gcc. llvm-svn: 16644
*	Fix incorrect suffix	Chris Lattner	2004-10-04	1	-1/+1
\| \| \| \|	llvm-svn: 16642
*	Fix some more missed suffixes and swapped operands	Chris Lattner	2004-10-04	1	-34/+40
\| \| \| \|	llvm-svn: 16641
*	Add missing suffixes to FP instructions for AT&T mode	Chris Lattner	2004-10-04	1	-38/+33
\| \| \| \|	llvm-svn: 16640
*	Add support for the -x86-asm-syntax flag, which can be used to choose between	Chris Lattner	2004-10-03	3	-14/+48
\| \| \| \| \| \| \| \|	Intel and AT&T style assembly language. The ultimate goal of this is to eliminate the GasBugWorkaroundEmitter class, but for now AT&T style emission is not fully operational. llvm-svn: 16639
*	Add support to the instruction patterns for AT&T style output, which will	Chris Lattner	2004-10-03	1	-569/+963
\| \| \| \| \| \| \| \| \|	hopefully lead to the death of the 'GasBugWorkaroundEmitter'. This also includes changes to wrap the whole file to 80 columns! Woot! :) Note that the AT&T style output has not been tested at all. llvm-svn: 16638
*	Add initial support for variants	Chris Lattner	2004-10-03	1	-2/+10
\| \| \| \|	llvm-svn: 16635
*	Do not repeat the map lookup	Chris Lattner	2004-10-01	1	-1/+1
\| \| \| \|	llvm-svn: 16633
*	When a virtual register is folded into an instruction, keep track of whether	Chris Lattner	2004-10-01	3	-27/+52
\| \| \| \| \| \| \| \|	it was a use, def, or both. This allows us to be less pessimistic in our analysis of them. In practice, this doesn't make a big difference, but it doesn't hurt either. llvm-svn: 16632
*	Add a simple little improvement to the local spiller to keep track of stores	Chris Lattner	2004-10-01	1	-0/+26
\| \| \| \| \| \| \| \| \| \|	and delete them if they turn out to be dead. This is a useful little hack that even speeds up some programs. For example, it speeds up Ptrdist/ks from 17.53s to 15.59s, and 188.ammp from 149s to 146s. This also speeds up llc :) llvm-svn: 16630
*	Substantially revamp the local spiller, causing it to actually improve the	Chris Lattner	2004-10-01	1	-164/+301
\| \| \| \|	generated code over the simple spiller. The new local spiller generates
	substantially better code than the simple one in some cases, by reusing
	values that are loaded out of stack slots and kept available in registers.
	This primarily helps programs that are spilling a lot, and there is still
	stuff that can be done to improve it. This patch makes the local spiller
	the default, as it's only a tiny bit slower than the simple spiller (it
	increases the runtime of llc by < 1%).

	Here are some numbers with speedups.

	    Program       #reuse  old(s)    new(s)  Speedup
	    Povray:        3452,   16.87 ->  15.93  (5.5%)
	    177.mesa:      2176,    2.77 ->   2.76  (0%)
	    179.art:         35,   28.43 ->  28.01  (1.5%)
	    183.equake:      55,   61.44 ->  61.41  (0%)
	    188.ammp:       869,  174    -> 149     (15%)
	    164.gzip:        43,   40.73 ->  40.71  (0%)
	    175.vpr:        351,   18.54 ->  17.34  (6.5%)
	    176.gcc:       2471,    5.01 ->   4.92  (1.8%)
	    181.mcf          42,   79.30 ->  75.20  (5.2%)
	    186.crafty:     484,   29.73 ->  30.04  (-1%)
	    197.parser:     251,   10.47 ->  10.67  (-1%)
	    252.eon:       1501,    1.98 ->   1.75  (12%)
	    253.perlbm:    1183,   14.83 ->  14.42  (2.8%)
	    254.gap:        825,    7.46 ->   7.29  (2.3%)
	    255.vortex:     285,   10.51 ->  10.27  (2.3%)
	    256.bzip2:       63,   55.70 ->  55.20  (0.9%)
	    300.twolf:      830,   21.63 ->  22.00  (-1%)
	    PtrDist/ks       14,   32.75 ->  17.53  (46.5%)
	    Olden/tsp        46,    8.71 ->   8.24  (5.4%)
	    Free/distray     70,    1.09 ->   0.99  (9.2%)

	llvm-svn: 16629
*	Pretty print a bit nicer :)	Chris Lattner	2004-10-01	1	-2/+1
\| \| \| \|	llvm-svn: 16628