Commit message | Author | Age | Files | Lines
* Perform partial SROA on the helper hashing structure. I really wish the | Chandler Carruth | 2012-04-07 | 1 | -42/+48
  optimizers could do this for us, but expecting partial SROA of classes with template methods through cloning is probably expecting too much heroics.
  With this change, the begin/end pointer pairs which indicate the status of each loop iteration are actually passed directly into each layer of the combine_data calls, and the inliner has a chance to see when most of the combine_data function could be deleted by inlining. Similarly for 'length'. We have to be careful to limit the places where in/out reference parameters are used, as those will also prevent the inliner / optimizers from properly propagating constants.
  With this change, LLVM is able to fully inline and unroll the hash computation of small sets of values, such as two or three pointers. These now decompose into essentially straight-line code with no loops or function calls.
  There is still one code quality problem to be solved with the hashing -- LLVM is failing to nuke the alloca. It removes all loads from the alloca, leaving only lifetime intrinsics and dead(!!) stores to the alloca. =/ Very unfortunate.
  llvm-svn: 154264
* Fix ValueTracking to conclude that debug intrinsics are safe to | Chandler Carruth | 2012-04-07 | 2 | -4/+52
  speculate. Without this, loop rotate (among many other places) would suddenly stop working in the presence of debug info. I found this looking at loop rotate, and have augmented its tests with a reduction out of a very hot loop in yacr2 where failing to do this rotation costs sometimes more than 10% in runtime performance, perturbing numerous downstream optimizations.
  This should have no impact on performance without debug info, but the change in performance when debug info is enabled can be extreme. As a consequence (and this is how I got to this yak) any profiling of performance problems should be treated with deep suspicion -- they may have been wildly inaccurate if debug info was enabled for profiling. =/ Just a heads up.
  llvm-svn: 154263
* SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW. | Benjamin Kramer | 2012-04-07 | 3 | -6/+6
  Found by inspection.
  llvm-svn: 154262
* Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543> | Bob Wilson | 2012-04-07 | 1 | -2/+2
  The tLDRr instruction with the last register operand set to the zero register prints in assembly as if no register was specified, and the assembler encodes it as a tLDRi instruction with a zero immediate. With the integrated assembler, that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which is broken. Emit the instruction as tLDRi with a zero immediate.
  I don't know if there's a good way to write a testcase for this. Suggestions welcome.
  Opportunities for follow-up work:
  1) The asm printer should complain if a non-optional register operand is set to the zero register, instead of silently dropping it.
  2) The integrated assembler should complain in the same situation, instead of silently emitting the operand as "r0".
  llvm-svn: 154261
* Refactor: Use positive field names in VectorizeConfig. | Hongbin Zheng | 2012-04-07 | 2 | -25/+27
  llvm-svn: 154249
* Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming. | NAKAMURA Takumi | 2012-04-07 | 2 | -0/+4
  Cygwin-1.7 supports dw2. Some recent mingw distros support it, too.
  I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.
  llvm-svn: 154247
* Make the test for r154235 more platform-independent with a shorter | Alexis Hunt | 2012-04-07 | 1 | -1/+1
  string.
  llvm-svn: 154243
* Output UTF-8-encoded characters as identifier characters into assembly | Alexis Hunt | 2012-04-07 | 4 | -4/+19
  by default.
  This is a behaviour configurable in the MCAsmInfo. I've decided to turn it on by default in (possibly optimistic) hopes that most assemblers are reasonably sane. If this proves a problem, switching the default seems reasonable.
  I'm not sure if this is the opportune place to test, but it seemed good to make sure it was tested somewhere.
  llvm-svn: 154235
* Tidy up. 80 columns. | Jim Grosbach | 2012-04-06 | 5 | -5/+9
  llvm-svn: 154226
* ARMPat is equivalent to Requires<[IsARM]>. | Jakob Stoklund Olesen | 2012-04-06 | 1 | -3/+2
  llvm-svn: 154210
* Eliminate iOS-specific tail call instructions. | Jakob Stoklund Olesen | 2012-04-06 | 3 | -75/+27
  After register masks were introduced to represent the call clobbers, it is no longer necessary to have duplicate instructions for iOS.
  llvm-svn: 154209
* Add lines in global-address.ll to test N32 and N64 code generation. | Akira Hatanaka | 2012-04-06 | 1 | -0/+4
  llvm-svn: 154202
* There is no portable std::abs overload for int64_t, use the llvm::abs64 | Chandler Carruth | 2012-04-06 | 1 | -2/+2
  which exists for this purpose.
  llvm-svn: 154199
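The portability issue can be sketched in isolation. The helper below is a hypothetical stand-in named `abs64_sketch`, not LLVM's implementation; it only illustrates why a dedicated 64-bit helper sidesteps the question of which `std::abs` overloads a given platform provides.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a dedicated 64-bit absolute-value helper.
// Precondition: x != INT64_MIN, whose magnitude is not representable.
inline int64_t abs64_sketch(int64_t x) {
  assert(x != INT64_MIN && "magnitude of INT64_MIN is not representable");
  return x < 0 ? -x : x;
}
```

Because the parameter type is fixed, no overload resolution is involved at the call site.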
* Fixed two leaks in the MC disassembler. The MC | Sean Callanan | 2012-04-06 | 2 | -1/+13
  disassembler requires a MCSubtargetInfo and a MCInstrInfo to exist in order to initialize the instruction printer and disassembler; however, although the printer and disassembler keep references to these objects they do not own them. Previously, the MCSubtargetInfo and MCInstrInfo objects were just leaked.
  I have extended LLVMDisasmContext to own these objects and delete them when it is destroyed.
  llvm-svn: 154192
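The ownership arrangement described here can be sketched with modern C++ smart pointers. All type names below are hypothetical stand-ins, and `std::unique_ptr` is a swapped-in technique for illustration; the commit itself only says the context owns the objects and deletes them on destruction.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an info object; counts live instances so that
// leaks are observable.
struct InfoSketch {
  explicit InfoSketch(int *liveCount) : liveCount(liveCount) { ++*liveCount; }
  ~InfoSketch() { --*liveCount; }
  int *liveCount;
};

// Hypothetical printer that only borrows the info object.
struct PrinterSketch {
  const InfoSketch &info;
};

// Hypothetical context that owns both info objects and releases them when
// it is destroyed.
class DisasmContextSketch {
public:
  explicit DisasmContextSketch(int *liveCount)
      : subtargetInfo(std::make_unique<InfoSketch>(liveCount)),
        instrInfo(std::make_unique<InfoSketch>(liveCount)),
        printer{*instrInfo} {}

  const PrinterSketch &getPrinter() const { return printer; }

private:
  std::unique_ptr<InfoSketch> subtargetInfo;
  std::unique_ptr<InfoSketch> instrInfo;
  PrinterSketch printer; // declared last, so it is destroyed first
};
```

Declaring the borrowing member after the owning members means the borrower is destroyed before the objects it refers to.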
* Allow negative immediates in ARM and Thumb2 compares. | Jakob Stoklund Olesen | 2012-04-06 | 2 | -2/+37
  ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't.
  llvm-svn: 154183
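A minimal sketch of the selection rule, assuming a hypothetical helper `selectCompare` that is not part of LLVM. It does not model which immediates are encodable, and it ignores flag-setting corner cases; it only shows that a negative immediate is handled by comparing against its negation with `cmn`, and that Thumb1 has no such immediate form.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical result type: mnemonic plus the non-negative immediate.
struct CompareForm {
  std::string mnemonic;
  int64_t imm;
};

// Returns the immediate compare form, or std::nullopt when the mode has no
// immediate form for a negative value (Thumb1 in this sketch).
inline std::optional<CompareForm> selectCompare(int64_t imm, bool isThumb1) {
  if (imm >= 0)
    return CompareForm{"cmp", imm};
  if (isThumb1)
    return std::nullopt; // the constant has to be materialized another way
  return CompareForm{"cmn", -imm}; // cmp r, #-N  becomes  cmn r, #N
}
```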
* Reintroduce InlineCostAnalyzer::getInlineCost() variant with explicit callee | David Chisnall | 2012-04-06 | 2 | -1/+13
  parameter until we have a more sensible API for doing the same thing.
  Reviewed by Chandler.
  llvm-svn: 154180
* Sink the collection of return instructions until after *all* | Chandler Carruth | 2012-04-06 | 2 | -7/+46
  simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar.
  llvm-svn: 154179
* Tweak this test to ensure the inliner did indeed fire. Thanks to Richard | Chandler Carruth | 2012-04-06 | 1 | -0/+1
  Smith for pointing this out in review.
  llvm-svn: 154178
* Make GVN's propagateEquality non-recursive. No intended functionality change. | Duncan Sands | 2012-04-06 | 1 | -98/+105
  The modifications are a lot more trivial than they appear to be in the diff!
  llvm-svn: 154174
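The general technique of replacing recursion with an explicit worklist can be sketched as follows. This is not GVN's code: `Expr`, `propagate`, and the toy rule (equal compound nodes imply pairwise-equal operands) are assumptions chosen only to show the shape of the transformation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical expression node: leaves have null operands.
struct Expr {
  int id;
  const Expr *lhs;
  const Expr *rhs;
};

// Records the (id, id) pairs implied equal, using a worklist instead of
// recursive calls on the operands.
inline std::vector<std::pair<int, int>> propagate(const Expr *a,
                                                  const Expr *b) {
  std::vector<std::pair<int, int>> facts;
  std::vector<std::pair<const Expr *, const Expr *>> worklist;
  worklist.push_back({a, b});
  while (!worklist.empty()) {
    std::pair<const Expr *, const Expr *> item = worklist.back();
    worklist.pop_back();
    if (!item.first || !item.second)
      continue; // nothing to learn from a missing operand
    facts.push_back({item.first->id, item.second->id});
    // Where a recursive version would call itself, queue the work instead.
    worklist.push_back({item.first->rhs, item.second->rhs});
    worklist.push_back({item.first->lhs, item.second->lhs});
  }
  return facts;
}
```

The stack depth no longer depends on how deeply the expressions nest, which is the usual motivation for this kind of rewrite.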
* Test case for PR12413 | Craig Topper | 2012-04-06 | 1 | -0/+15
  llvm-svn: 154172
* Fix narrowing conversion. | Benjamin Kramer | 2012-04-06 | 1 | -1/+1
  llvm-svn: 154171
* DenseMap: Perform the pod-like object optimization when the value type is POD-like, not the DenseMapInfo for it. | Benjamin Kramer | 2012-04-06 | 4 | -30/+26
  Purge now unused template arguments. This has been broken since r91421.
  Patch by Lubos Lunak!
  llvm-svn: 154170
* Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. | Craig Topper | 2012-04-06 | 2 | -73/+57
  This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
  llvm-svn: 154166
* Add the tests that were supposed to go with r153935 that I forgot to svn add | Craig Topper | 2012-04-06 | 2 | -0/+73
  llvm-svn: 154165
* Actually finish this sentence in the comment the way I intended. Thanks | Chandler Carruth | 2012-04-06 | 1 | -1/+1
  Matt for pointing this out.
  llvm-svn: 154158
* Sink the return instruction collection until after we're done deleting | Chandler Carruth | 2012-04-06 | 2 | -7/+46
  dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus pointer to a return instruction that blows up much further down the road.
  It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list.
  Thanks to John Regehr's numerous test cases for teasing this out.
  llvm-svn: 154157
* Deduplicate ARM call-related instructions. | Jakob Stoklund Olesen | 2012-04-06 | 6 | -145/+24
  We had special instructions for iOS because r9 is call-clobbered, but that is represented dynamically by the register mask operands now, so there is no need for the pseudo-instructions.
  llvm-svn: 154144
* ARM: Don't form a t2LDRi8 or t2STRi8 with an offset of zero. | Jim Grosbach | 2012-04-05 | 1 | -0/+8
  The load/store optimizer splits LDRD/STRD into two instructions when the register pairing doesn't work out. For negative offsets in Thumb2, it uses t2STRi8 to do that. That's fine, except for the case when the offset is in the range [-4,-1]. In that case, we'll also form a second t2STRi8 with the original offset plus 4, resulting in a t2STRi8 with a non-negative offset, which ends up as if it were an STRT, which is completely bogus. Similarly for loads.
  No testcase, unfortunately, as any I've been able to construct is both large and extremely fragile.
  rdar://11193937
  llvm-svn: 154141
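The offset check can be sketched as below. The helper names are hypothetical, and the choice of `t2STRi12` for the non-negative half is an assumption made for illustration; the commit message itself only states that a `t2STRi8` must not be formed with a non-negative offset.

```cpp
#include <cassert>
#include <string>

// Hypothetical opcode chooser for one half of a split Thumb2 STRD.
// Assumption: the i8 form is used only for strictly negative offsets and
// the i12 form for offsets >= 0.
inline std::string storeOpcodeFor(int offset) {
  return offset < 0 ? "t2STRi8" : "t2STRi12";
}

struct SplitStore {
  std::string lowOpc;
  int lowOff;
  std::string highOpc;
  int highOff;
};

// Split a doubleword store at `offset` into two word stores, choosing the
// opcode for each half from that half's own offset.
inline SplitStore splitStrd(int offset) {
  return {storeOpcodeFor(offset), offset, storeOpcodeFor(offset + 4),
          offset + 4};
}
```

The point of the sketch is that the second half's opcode must be derived from `offset + 4`, not copied from the first half.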
* Fix the build breakage introduced by r154131. | Kaelyn Uhrain | 2012-04-05 | 1 | -19/+3
  The empty 1-argument operator delete is for the benefit of the destructor. A couple of spot checks of running yaml-bench under valgrind against a few of the files under test/YAMLParser did not reveal any leaks introduced by this change.
  llvm-svn: 154137
* Really fix -Wnon-virtual-dtor warnings; gcc needs the dtors to be | Kaelyn Uhrain | 2012-04-05 | 1 | -7/+7
  explicitly marked as virtual.
  llvm-svn: 154131
* The internalize pass can be dangerous for LTO. | Bill Wendling | 2012-04-05 | 1 | -2/+13
  Consider the following program:
      $ cat main.c
      void foo(void) { }
      int main(int argc, char *argv[]) {
        foo();
        return 0;
      }
      $ cat bundle.c
      extern void foo(void);
      void bar(void) {
        foo();
      }
      $ clang -o main main.c
      $ clang -o bundle.so bundle.c -bundle -bundle_loader ./main
      $ nm -m bundle.so
      0000000000000f40 (__TEXT,__text) external _bar
                       (undefined) external _foo (from executable)
                       (undefined) external dyld_stub_binder (from libSystem)
      $ clang -o main main.c -O4
      $ clang -o bundle.so bundle.c -bundle -bundle_loader ./main
      Undefined symbols for architecture x86_64:
        "_foo", referenced from:
            _bar in bundle-elQN6d.o
      ld: symbol(s) not found for architecture x86_64
      clang: error: linker command failed with exit code 1 (use -v to see invocation)
  The linker was told that the 'foo' in 'main' was 'internal' and had no uses, so it was dead stripped.
  Another situation is something like:
      define void @foo() {
        ret void
      }
      define void @bar() {
        call asm volatile "call _foo" ...
        ret void
      }
  The only use of 'foo' is inside of an inline ASM call. Since we don't look inside those for uses of functions, we don't specify this as a "use."
  Get around this by not invoking the 'internalize' pass by default. This is an admitted hack for LTO correctness.
  <rdar://problem/11185386>
  llvm-svn: 154124
* ARM assembly aliases for add negative immediates using sub. | Jim Grosbach | 2012-04-05 | 4 | -5/+76
  'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
  Thumb1 aliases for adding a negative immediate to the stack pointer, also.
  rdar://11192734
  llvm-svn: 154123
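The aliasing rule amounts to flipping the mnemonic and negating the immediate. A minimal sketch follows, with a hypothetical `canonicalizeAddImm` helper that does not model encodability of the resulting immediate.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical result type: mnemonic plus a non-negative immediate.
struct AluForm {
  std::string mnemonic;
  int64_t imm;
};

// Rewrite "add rd, #imm" with a negative immediate as "sub rd, #-imm".
// Precondition: imm != INT64_MIN.
inline AluForm canonicalizeAddImm(int64_t imm) {
  assert(imm != INT64_MIN && "cannot negate INT64_MIN");
  if (imm < 0)
    return {"sub", -imm};
  return {"add", imm};
}
```

With this rule, the example from the message, `add r2, #-1024`, maps to `sub` with immediate 1024.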
* Reapply test case in 154038, this time with triple to prevent the backend | Akira Hatanaka | 2012-04-05 | 1 | -0/+42
  from emitting gp_rel relocation.
  llvm-svn: 154122
* Patch to set is_stmt a little better for prologue lines in a function. | Eric Christopher | 2012-04-05 | 2 | -4/+9
  This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function.
  rdar://9852092
  llvm-svn: 154120
* Don't break the IV update in TLI::SimplifySetCC(). | Jakob Stoklund Olesen | 2012-04-05 | 3 | -23/+72
  LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register.
  When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming:
      (icmp (add iv, stride), stride) --> (cmp iv, 0)
  This forced us to use lea for the IV update, preventing the simpler incl+cmp.
  <rdar://problem/7643606>
  <rdar://problem/11184260>
  llvm-svn: 154119
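For an equality compare, the two forms agree under wrap-around arithmetic, which is why the fold was legal even though it hurt code quality. A minimal sketch with hypothetical function names, modelling the induction variable as a 32-bit unsigned value:

```cpp
#include <cassert>
#include <cstdint>

// Latch test on the incremented induction variable: only iv + stride needs
// to stay live across the compare.
inline bool latchOnIncremented(uint32_t iv, uint32_t stride) {
  return static_cast<uint32_t>(iv + stride) == stride;
}

// The folded form: same truth value, but it reads the pre-increment iv, so
// both iv and iv + stride are live at the compare.
inline bool latchOnOriginal(uint32_t iv) { return iv == 0; }
```

The sketch only demonstrates equivalence of the predicates; the register-pressure consequence is as described in the message.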
* Fix accidentally inverted logic from r152803, and make the | Dan Gohman | 2012-04-05 | 2 | -1/+7
  testcase slightly less trivial. This fixes rdar://11171718.
  llvm-svn: 154118
* Fix a problem in the target detection for Debian GNU/HURD | Sylvestre Ledru | 2012-04-05 | 4 | -0/+18
  llvm-svn: 154117
* Fix a problem in the target detection for Debian GNU/kFreeBSD | Sylvestre Ledru | 2012-04-05 | 4 | -6/+6
  llvm-svn: 154114
* Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection. | Owen Anderson | 2012-04-05 | 1 | -1/+2
  llvm-svn: 154113
* Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions. | Silviu Baranga | 2012-04-05 | 2 | -4/+21
  llvm-svn: 154101
* Added support for handling unpredictable arithmetic instructions on ARM. | Silviu Baranga | 2012-04-05 | 3 | -12/+9
  llvm-svn: 154100
* BBVectorize: Add the const modifier to the VectorizeConfig because we won't | Hongbin Zheng | 2012-04-05 | 1 | -1/+1
  modify it.
  llvm-svn: 154098
* Introduce the VectorizeConfig class, with which we can control the behavior | Hongbin Zheng | 2012-04-05 | 2 | -34/+126
  of the BBVectorizePass without using command-line options. As pointed out by Hal, we can ask the TargetLoweringInfo for the architecture-specific VectorizeConfig to perform vectorizing with architecture-specific information.
  llvm-svn: 154096
* An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases. | James Molloy | 2012-04-05 | 2 | -0/+76
  Noticed while investigating PR12274!
  llvm-svn: 154090
* Add the function "vectorizeBasicBlock" which allows users to vectorize a | Hongbin Zheng | 2012-04-05 | 2 | -6/+32
  BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the loop unroll pass right after the loop is unrolled.
  llvm-svn: 154089
* ARM assembly aliases for two-operand V[R]SHR instructions. | Jim Grosbach | 2012-04-05 | 2 | -5/+106
  rdar://11189467
  llvm-svn: 154087
* In MemoryBuffer::getOpenFile() make sure that the buffer is null-terminated if | Argyrios Kyrtzidis2 | 2012-04-05 | 1 | -0/+11
  the caller requested a null-terminated one.
  When mapping the file there could be a racing issue that resulted in the file being larger than the FileSize passed by the caller. We already have an assertion for this in MemoryBuffer::init() but have a runtime guarantee that the buffer will be null-terminated, so do a copy that adds a null-terminator.
  Protects against crash of rdar://11161822.
  llvm-svn: 154082
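The "copy that adds a null-terminator" idea can be sketched independently of `MemoryBuffer`. The helper below is hypothetical and uses `std::vector<char>` purely for illustration; it copies exactly the number of bytes the caller asked for and appends the terminator itself, so the guarantee does not depend on what follows those bytes in the source storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy `size` bytes from `data` and append '\0'.
inline std::vector<char> copyWithNullTerminator(const char *data,
                                                std::size_t size) {
  std::vector<char> buf(data, data + size);
  buf.push_back('\0');
  return buf;
}
```

Even if the underlying storage holds more than `size` bytes, the returned buffer ends at `size` with a terminator.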
* ARM assembly parsing for 'msr' plain 'cpsr' operand. | Jim Grosbach | 2012-04-05 | 2 | -1/+4
  Plain 'cpsr' is an alias for 'cpsr_fc'.
  rdar://11153753
  llvm-svn: 154080
* Pass the right sign to TLI->isLegalICmpImmediate. | Jakob Stoklund Olesen | 2012-04-05 | 2 | -2/+15
  LSR can fold three addressing modes into its ICmpZero node:
      ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset
      ICmpZero -1*ScaleReg + Offset => ICmp ScaleReg, Offset
      ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg
  The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset.
  Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric.
  <rdar://problem/11184260>
  llvm-svn: 154079
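The sign that ends up in the compare for the first two folds can be sketched as follows. `FoldKind` and `compareImmediate` are hypothetical names introduced for illustration; the sketch only encodes the algebra of the two rewrites listed in the message, and the resulting value is what a legality query would need to be given.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tags for the two folds that produce an immediate compare.
enum class FoldKind { BasePlusOffset, NegScalePlusOffset };

// Immediate operand of the resulting ICmp. Precondition: offset != INT64_MIN.
inline int64_t compareImmediate(FoldKind kind, int64_t offset) {
  assert(offset != INT64_MIN && "cannot negate INT64_MIN");
  switch (kind) {
  case FoldKind::BasePlusOffset:
    return -offset; // BaseReg + Offset == 0  <=>  BaseReg == -Offset
  case FoldKind::NegScalePlusOffset:
    return offset; // -ScaleReg + Offset == 0  <=>  ScaleReg == Offset
  }
  return 0; // unreachable for valid enumerators
}
```

On a target whose legality check is not symmetric around zero, passing the wrong sign gives the wrong answer, which is the situation the message describes for ARM.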
* Do not include multiple -arch options in CPPFLAGS. | Bob Wilson | 2012-04-05 | 1 | -3/+2
  llvm-svn: 154070