Commit message | Author | Age | Files | Lines
* Remove few if-then-else when both branches are the | Fariborz Jahanian | 2012-03-27 | 2 | -16/+8
    same. pr12357.
    llvm-svn: 153515
* Log the allocator messages at a higher verbosity level. | Alexander Potapenko | 2012-03-27 | 1 | -1/+1
    llvm-svn: 153514
* fix what looks like a real logic bug, found by PVS-Studio (part of PR12357) | Chris Lattner | 2012-03-27 | 1 | -2/+2
    llvm-svn: 153513
* Commit patch reverted in r153454 with the modified test | Fariborz Jahanian | 2012-03-27 | 2 | -7/+12
    case that I forgot to check in.
    llvm-svn: 153512
* Add an MRI::tracksLiveness() flag. | Jakob Stoklund Olesen | 2012-03-27 | 3 | -1/+28
    Late optimization passes like branch folding and tail duplication can
    transform the machine code in a way that makes it expensive to keep the
    register liveness information up to date. There is a fuzzy line between
    register allocation and late scheduling where the liveness information
    degrades.

    The MRI::tracksLiveness() flag makes the line clear: While true, liveness
    information is accurate, and can be used for register scavenging. Once the
    flag is false, liveness information is not accurate, and can only be used
    as a hint.

    Late passes generally don't need the liveness information, but they will
    sometimes use the register scavenger to help update it. The scavenger
    enforces strict correctness, and we have to spend a lot of code to update
    register liveness that may never be used.
    llvm-svn: 153511
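The flag semantics described in r153511 can be sketched as a one-way switch. This is a simplified model, not the LLVM MachineRegisterInfo class: RegInfoModel, LivenessUse, and howToUseLiveness are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified model of the flag; not the real LLVM class.
class RegInfoModel {
  bool TracksLiveness = true; // liveness starts out accurate

public:
  bool tracksLiveness() const { return TracksLiveness; }

  // Called by a late pass that stops keeping liveness up to date.
  // The transition is one-way: the model offers no way to set it back.
  void invalidateLiveness() { TracksLiveness = false; }
};

enum class LivenessUse { Exact, HintOnly };

// A consumer (for example a register scavenger) may rely on liveness
// only while the flag is true; afterwards it is merely a hint.
LivenessUse howToUseLiveness(const RegInfoModel &MRI) {
  return MRI.tracksLiveness() ? LivenessUse::Exact : LivenessUse::HintOnly;
}
```

A late pass in this model would call invalidateLiveness() once, before its first transformation that does not update liveness.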
* llvm/docs/*.html: Fix markups. | NAKAMURA Takumi | 2012-03-27 | 13 | -22/+52
    llvm-svn: 153508
* Make a seemingly tiny change to the inliner and fix the generated code | Chandler Carruth | 2012-03-27 | 1 | -1/+1
    size bloat. Unfortunately, I expect this to disable the majority of the
    benefit from r152737. I'm hopeful at least that it will fix PR12345. To
    explain this requires... quite a bit of backstory I'm afraid.

    TL;DR: The change in r152737 actually did The Wrong Thing for
    linkonce-odr functions. This change makes it do the right thing. The
    benefits we saw were simple luck, not any actual strategy. Benchmark
    numbers after a mini-blog-post so that I've written down my thoughts on
    why all of this works and doesn't work...

    To understand what's going on here, you have to understand how the
    "bottom-up" inliner actually works. There are two fundamental modes to
    the inliner:

    1) Standard fixed-cost bottom-up inlining. This is the mode we usually
       think about. It walks from the bottom of the CFG up to the top,
       looking at callsites, taking information about the callsite and the
       called function and computing the expected cost of inlining into that
       callsite. If the cost is under a fixed threshold, it inlines. It's
       a touch more complicated than that due to all the bonuses, weights,
       etc. Inlining the last callsite to an internal function gets higher
       weight, etc. But essentially, this is the mode of operation.

    2) Deferred bottom-up inlining (a term I just made up). This is the
       interesting mode for this patch and r152737. Initially, this works
       just like mode #1, but once we have the cost of inlining into the
       callsite, we don't just compare it with a fixed threshold. First, we
       check something else. Let's give some names to the entities at this
       point, or we'll end up hopelessly confused. We're considering
       inlining a function 'A' into its callsite within a function 'B'. We
       want to check whether 'B' has any callers, and whether it might be
       inlined into those callers. If so, we also check whether inlining 'A'
       into 'B' would block any of the opportunities for inlining 'B' into
       its callers. We take the sum of the costs of inlining 'B' into its
       callers where that inlining would be blocked by inlining 'A' into
       'B', and if that cost is less than the cost of inlining 'A' into 'B',
       then we skip inlining 'A' into 'B'.

    Now, in order for #2 to make sense, we have to have some confidence that
    we will actually have the opportunity to inline 'B' into its callers
    when cheaper, *and* that we'll be able to revisit the decision and
    inline 'A' into 'B' if that ever becomes the correct tradeoff. This
    often isn't true for external functions -- we can see very few of their
    callers, and we won't be able to re-consider inlining 'A' into 'B' if
    'B' is external when we finally see more callers of 'B'. There are two
    cases where we believe this to be true for C/C++ code: functions local
    to a translation unit, and functions with an inline definition in every
    translation unit which uses them. These are represented as internal
    linkage and linkonce-odr (resp.) in LLVM. I enabled this logic for
    linkonce-odr in r152737.

    Unfortunately, when I did that, I also introduced a subtle bug. There
    was an implicit assumption that the last caller of the function within
    the TU was the last caller of the function in the program. We want to
    bonus the last caller of the function in the program by a huge amount
    for inlining because inlining that callsite has very little cost.
    Unfortunately, the last caller in the TU of a linkonce-odr function is
    *not* the last caller in the program, and so we don't want to apply this
    bonus. If we do, we can apply it to one callsite *per-TU*. Because of
    the way deferred inlining works, when it sees this bonus applied to one
    callsite in the TU for 'B', it decides that inlining 'B' is of the
    *utmost* importance just so we can get that final bonus. It then
    proceeds to essentially force deferred inlining regardless of the actual
    cost tradeoff.

    The result? PR12345: code bloat, code bloat, code bloat. Another result
    is getting *damn* lucky on a few benchmarks, and the over-inlining
    exposing critically important optimizations. I would very much like
    a list of benchmarks that regress after this change goes in, with
    bitcode before and after. This will help me greatly understand what
    opportunities the current cost analysis is missing.

    Initial benchmark numbers look very good. WebKit files that exhibited
    the worst of PR12345 went from growing to shrinking compared to Clang
    with r152737 reverted.

    - Bootstrapped Clang is 3% smaller with this change.
    - Bootstrapped Clang -O0 over a single-source-file of lib/Lex is 4%
      faster with this change.

    Please let me know about any other performance impact you see. Thanks to
    Nico for reporting and urging me to actually fix, Richard Smith, Duncan
    Sands, Manuel Klimek, and Benjamin Kramer for talking through the issues
    today.
    llvm-svn: 153506
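The deferral rule from mode #2 and the linkage restriction on the last-caller bonus can be sketched roughly as follows. This is not the LLVM inliner: every name here (Linkage, CallerOfB, mayDefer, lastCallerBonusApplies, shouldDeferInliningAIntoB) is hypothetical, costs are plain integers, and thresholds and the other bonuses are left out. Requiring at least one blocked caller before deferring is an assumption of the sketch, not something the message states.

```cpp
#include <cassert>
#include <vector>

enum class Linkage { Internal, LinkOnceODR, External };

// Deferral only makes sense when the decision can be revisited later:
// internal and linkonce-odr functions, per the commit message.
bool mayDefer(Linkage L) {
  return L == Linkage::Internal || L == Linkage::LinkOnceODR;
}

// The last caller in the TU is the last caller in the program only for
// internal linkage, so only then should the large bonus apply.
bool lastCallerBonusApplies(Linkage L, unsigned CallersInTU) {
  return L == Linkage::Internal && CallersInTU == 1;
}

// One caller of 'B': the cost of inlining 'B' into it, and whether
// inlining 'A' into 'B' would block that opportunity.
struct CallerOfB {
  int CostOfInliningB;
  bool BlockedByInliningA;
};

// Skip inlining 'A' into 'B' when the blocked opportunities are, in
// total, cheaper than inlining 'A' into 'B'.
bool shouldDeferInliningAIntoB(int CostAIntoB, Linkage LinkageOfB,
                               const std::vector<CallerOfB> &Callers) {
  if (!mayDefer(LinkageOfB))
    return false;
  int BlockedCost = 0;
  bool AnyBlocked = false; // assumption: nothing blocked means no deferral
  for (const CallerOfB &C : Callers) {
    if (C.BlockedByInliningA) {
      BlockedCost += C.CostOfInliningB;
      AnyBlocked = true;
    }
  }
  return AnyBlocked && BlockedCost < CostAIntoB;
}
```

In this model the bug described above corresponds to lastCallerBonusApplies returning true for LinkOnceODR, which would hand out the bonus once per translation unit.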
* Out of tree build support: Set TARGET_TRIPLE from the result of "llvm-config --host-target" | Hongbin Zheng | 2012-03-27 | 1 | -2/+22
    instead of loading the "LLVMConfig.cmake", which is only installed when
    llvm is configured by cmake.
    llvm-svn: 153503
* Prune some includes | Craig Topper | 2012-03-27 | 30 | -33/+6
    llvm-svn: 153502
* Update the ARC specification for several changes made in the | John McCall | 2012-03-27 | 1 | -39/+285
    last N months. This required a brief soliloquy about change in an
    uncertainly-versioned world. I believe I've gotten the right target
    versions on all these changes.
    llvm-svn: 153501
* Remove unnecessary llvm:: qualifications | Craig Topper | 2012-03-27 | 18 | -261/+261
    llvm-svn: 153500
* Pass the llvm IR pointer value and offset to the constructor of | Akira Hatanaka | 2012-03-27 | 1 | -9/+13
    MachinePointerInfo when getStore is called to create a node that stores
    an argument passed in register to the stack. Without this change, the
    post RA scheduler will fail to discover the dependencies between the
    store instructions and the instructions that load from a structure
    passed by value.

    The link to the related discussion is here:
    http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-March/048055.html
    llvm-svn: 153499
* Fix bug in LowerConstantPool. | Akira Hatanaka | 2012-03-27 | 1 | -1/+1
    llvm-svn: 153498
* Add T9 to the list of live-in registers of the entry basic block. | Akira Hatanaka | 2012-03-27 | 1 | -0/+2
    llvm-svn: 153497
* Fixed a few things in the ELF object file: | Greg Clayton | 2012-03-27 | 3 | -19/+21
    1 - sections only get a valid VM size if they have SHF_ALLOC in the
        section flags
    2 - symbol names are marked as mangled if they start with "_Z"

    Also fixed the DWARF parser to correctly use the section file size when
    extracting the DWARF.
    llvm-svn: 153496
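The two ELF rules listed in r153496 amount to simple predicates. A minimal sketch, assuming the standard ELF value SHF_ALLOC = 0x2; the helper names sectionVMSize and looksMangled are hypothetical and do not come from the LLDB sources.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// SHF_ALLOC as defined by the ELF specification: the section occupies
// memory during process execution.
constexpr std::uint64_t kShfAlloc = 0x2;

// Rule 1: a section only gets a VM size when SHF_ALLOC is set;
// otherwise it takes up no address space and the VM size is zero.
std::uint64_t sectionVMSize(std::uint64_t Flags, std::uint64_t Size) {
  return (Flags & kShfAlloc) ? Size : 0;
}

// Rule 2: a symbol name is treated as mangled when it starts with "_Z".
bool looksMangled(const std::string &Name) {
  return Name.compare(0, 2, "_Z") == 0;
}
```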
* Synthetic values are now automatically enabled and active by default. | Enrico Granata | 2012-03-27 | 21 | -113/+346
    SBValue is set up to always wrap a synthetic value when one is
    available. A new setting enable-synthetic-value is provided on the
    target to disable this behavior. There also is a new
    GetNonSyntheticValue() API call on SBValue to go back from synthetic to
    non-synthetic. There is no call to go from non-synthetic to synthetic.
    The test suite has been changed accordingly.

    Fallout from changes to type searching: a hack has to be played to make
    it possible to use maps that contain std::string due to the special name
    replacement operated by clang.

    Fixing a test case that was using libstdcpp instead of libc++ - caught
    as a consequence of said changes to type searching.
    llvm-svn: 153495
* Retrieve and add the offset of a symbol in applyFixup rather than retrieve and | Akira Hatanaka | 2012-03-27 | 2 | -67/+67
    set it in MipsMCCodeEmitter::getMachineOpValue. Assert in
    getMachineOpValue if MachineOperand MO is of an unexpected type.
    llvm-svn: 153494
* Define function MipsGetSymAndOffset which returns a fixup's symbol and the | Akira Hatanaka | 2012-03-27 | 1 | -0/+30
    offset applied to it.
    llvm-svn: 153493
* Post-ra LICM should take care not to hoist an instruction that would clobber a | Evan Cheng | 2012-03-27 | 2 | -4/+87
    register that's read by the preheader terminator.
    rdar://11095580
    llvm-svn: 153492
* Rewrite computation of Value in adjustFixupValue so that the upper 48-bits are | Akira Hatanaka | 2012-03-27 | 1 | -1/+1
    cleared. No functionality change.
    llvm-svn: 153491
* Add cross-referencing comments to ParseDirectDeclarator to note that | Richard Smith | 2012-03-27 | 1 | -2/+6
    isConstructorDeclaration also needs updating for any extension to the
    grammar of a direct-declarator.
    llvm-svn: 153490
* Change RetainCountChecker to eagerly "escape" retained objects when they are | Ted Kremenek | 2012-03-27 | 3 | -1/+35
    assigned to a struct. This is fallout from inlining results, which
    expose far more patterns where people stuff CF objects into structs and
    pass them around (and we can reason about it). The problem is that we
    don't have a general way to detect when values have escaped, so as an
    intermediate step we need to eagerly prune out such tracking.

    Fixes <rdar://problem/11104566>.
    llvm-svn: 153489
* When we see 'Class(X' or 'Class::Class(X' and we suspect that it names a | Richard Smith | 2012-03-27 | 3 | -8/+62
    constructor, but X is not a known typename, check whether the tokens
    could possibly match the syntax of a declarator before concluding that
    it isn't a constructor. If it's definitely ill-formed, assume it is
    a constructor.

    Empirical evidence suggests that this pattern is much more often
    a constructor with a typoed (or not-yet-declared) type name than any of
    the other possibilities, so the extra cost of the check is not expected
    to be problematic.
    llvm-svn: 153488
* During MachineCopyPropagation a register may be the source operand of multiple | Lang Hames | 2012-03-27 | 1 | -17/+26
    copies being considered for removal. Make sure to track all of the
    copies, rather than just the most recent encountered, by holding
    a DenseSet instead of an unsigned in SrcMap.

    No test case - couldn't reduce something with a sane size.
    llvm-svn: 153487
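The data-structure change in r153487 can be illustrated with standard containers standing in for LLVM's DenseMap and DenseSet. The names Reg, SourceMap, recordCopy, and copiesReading are hypothetical, and the real pass tracks more state than this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

using Reg = unsigned;

// Source register -> every copy destination that reads it. With a single
// unsigned as the mapped value, a second copy from the same source would
// overwrite the first; a set keeps all of them.
using SourceMap = std::map<Reg, std::set<Reg>>;

// Record the copy "Dst = COPY Src".
void recordCopy(SourceMap &SrcMap, Reg Dst, Reg Src) {
  SrcMap[Src].insert(Dst);
}

// Number of tracked copies that read Src.
std::size_t copiesReading(const SourceMap &SrcMap, Reg Src) {
  auto It = SrcMap.find(Src);
  return It == SrcMap.end() ? 0 : It->second.size();
}
```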
* Reserve hardware registers. | Akira Hatanaka | 2012-03-27 | 1 | -0/+4
    llvm-svn: 153486
* [driver] Put -cpp-precomp and -no-cpp-precomp under the clang_ignored_f_group. | Chad Rosier | 2012-03-26 | 2 | -2/+7
    We don't currently support these options.
    rdar://11120518
    llvm-svn: 153485
* ARM has a peephole optimization which looks for a def / use pair. The def | Evan Cheng | 2012-03-26 | 2 | -0/+52
    produces a 32-bit immediate which is consumed by the use. It tries to
    fold the immediate by breaking it into two parts and fold them into the
    immediate fields of two uses. e.g.
          movw r2, #40885
          movt r3, #46540
          add r0, r0, r3
    =>
          add.w r0, r0, #3019898880
          add.w r0, r0, #30146560

    However, this transformation is incorrect if the user produces a flag.
    e.g.
          movw r2, #40885
          movt r3, #46540
          adds r0, r0, r3
    =>
          add.w r0, r0, #3019898880
          adds.w r0, r0, #30146560
    Note the adds.w may not set the carry flag even if the original sequence
    would.

    rdar://11116189
    llvm-svn: 153484
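The carry-flag problem from r153484 can be reproduced with plain 32-bit arithmetic. The sketch below models only the carry-out of an unsigned add, not the other ARM flags; AddResult, addWithCarry, singleAdds, and splitAdds are hypothetical names. The two immediates are the ones from the message (3019898880 = 0xB4000000 and 30146560 = 0x01CC0000), and the starting value of r0 is chosen here for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct AddResult {
  std::uint32_t Value; // 32-bit result
  bool Carry;          // carry-out of the 32-bit unsigned add
};

// 32-bit add that also reports the carry-out, as a flag-setting add would.
AddResult addWithCarry(std::uint32_t A, std::uint32_t B) {
  std::uint64_t Wide = static_cast<std::uint64_t>(A) + B;
  return {static_cast<std::uint32_t>(Wide), (Wide >> 32) != 0};
}

// Original sequence: one flag-setting add of the full immediate.
AddResult singleAdds(std::uint32_t R0, std::uint32_t Imm) {
  return addWithCarry(R0, Imm);
}

// Transformed sequence: add.w of the first part (its carry is thrown
// away), then adds.w of the second part (only this carry is visible).
AddResult splitAdds(std::uint32_t R0, std::uint32_t Part1,
                    std::uint32_t Part2) {
  AddResult First = addWithCarry(R0, Part1);
  return addWithCarry(First.Value, Part2);
}
```

With r0 = 0x50000000 the two sequences produce the same value, 0x05CC0000, but only the single add reports a carry: in the split form the overflow happens in the first add, whose flags are never written.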
* Add a debug option to dump PBQP graphs during register allocation. | Lang Hames | 2012-03-26 | 2 | -0/+66
    llvm-svn: 153483
* <rdar://problem/11113279> | Greg Clayton | 2012-03-26 | 44 | -259/+581
    Fixed type lookups to "do the right thing". Prior to this fix, looking
    up a type using "foo::bar" would result in a type list that contains all
    types that had "bar" as a basename unless the symbol file was able to
    match fully qualified names (which our DWARF parser does not).

    This fix will allow type matches to be made based on the basename and
    then have the types that don't match filtered out. Types by name can be
    fully qualified, or partially qualified with the new "bool exact_match"
    parameter to the Module::FindTypes() method.

    This fixes some issues that we discovered with dynamic type resolution
    as well as improves the overall type lookups in LLDB.
    llvm-svn: 153482
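The two-step lookup described in r153482 (match on basename, then filter by qualification) might look roughly like this. It is a sketch over plain strings, not the Module::FindTypes() implementation. typeBaseName, endsWith, and findTypes are hypothetical helpers, and the reading of exact_match used here (true requires the whole qualified name to equal the query, false also accepts names that end in "::" plus the query) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// "a::foo::bar" -> "bar"; a name without "::" is its own basename.
std::string typeBaseName(const std::string &Name) {
  std::size_t Pos = Name.rfind("::");
  return Pos == std::string::npos ? Name : Name.substr(Pos + 2);
}

bool endsWith(const std::string &S, const std::string &Suffix) {
  return S.size() >= Suffix.size() &&
         S.compare(S.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}

std::vector<std::string> findTypes(const std::vector<std::string> &Known,
                                   const std::string &Query,
                                   bool ExactMatch) {
  std::vector<std::string> Matches;
  const std::string QueryBase = typeBaseName(Query);
  for (const std::string &Name : Known) {
    // Step 1: cheap match on the basename only.
    if (typeBaseName(Name) != QueryBase)
      continue;
    // Step 2: filter out candidates whose qualification does not match.
    bool Keep = ExactMatch
                    ? Name == Query
                    : (Name == Query || endsWith(Name, "::" + Query));
    if (Keep)
      Matches.push_back(Name);
  }
  return Matches;
}
```

Under these assumptions a query for "foo::bar" no longer returns "baz::bar", which is the behavior the message says was wrong before the fix.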
* [driver] Testcase for r153469, r153470, and r153478. | Chad Rosier | 2012-03-26 | 1 | -0/+4
    llvm-svn: 153481
* SCEV fix: Handle loop invariant loads. | Andrew Trick | 2012-03-26 | 2 | -1/+52
    Fixes PR11882: NULL dereference in ComputeLoadConstantCompareExitLimit.
    llvm-svn: 153480
* Add 'undef's to make SWIG happier. Patch by Baozeng Ding. | Bill Wendling | 2012-03-26 | 1 | -0/+3
    llvm-svn: 153479
* [driver] Fix unused argument warnings. | Chad Rosier | 2012-03-26 | 1 | -11/+19
    1. Don't short-circuit conditional statements that are checking flags.
       Otherwise, the driver emits warnings about unused arguments.
    2. -mkernel and -fapple-kext imply no exceptions, so claim exception
       related arguments now to avoid warnings about unused arguments.
    rdar://11120518
    llvm-svn: 153478
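Point 1 of r153478 can be shown with a toy argument list in which querying a flag also marks it as used. ArgListModel and both helper functions are hypothetical. The sketch assumes only that a lookup claims the argument it finds, which is what makes short-circuiting matter, and it does not reproduce the driver's ArgList class.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Toy argument list: hasArg() claims the flag as a side effect.
class ArgListModel {
  std::set<std::string> Present;
  mutable std::set<std::string> Claimed;

public:
  explicit ArgListModel(std::set<std::string> Args)
      : Present(std::move(Args)) {}

  bool hasArg(const std::string &Name) const {
    if (Present.count(Name) == 0)
      return false;
    Claimed.insert(Name);
    return true;
  }

  // Flags that were passed but never queried; these would be reported
  // as unused.
  std::set<std::string> unclaimed() const {
    std::set<std::string> Result;
    for (const std::string &Arg : Present)
      if (Claimed.count(Arg) == 0)
        Result.insert(Arg);
    return Result;
  }
};

// Short-circuits: the second flag is never queried when the first is set.
bool kernelModeShortCircuit(const ArgListModel &Args) {
  return Args.hasArg("-mkernel") || Args.hasArg("-fapple-kext");
}

// Queries both flags unconditionally, then combines the results.
bool kernelModeClaimBoth(const ArgListModel &Args) {
  bool Kernel = Args.hasArg("-mkernel");
  bool Kext = Args.hasArg("-fapple-kext");
  return Kernel || Kext;
}
```

When both flags are passed, the short-circuit version leaves -fapple-kext unclaimed, while the second version claims both and leaves nothing to warn about.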
* If creation of watchpoint failed on the device, make sure the list | Johnny Chen | 2012-03-26 | 1 | -1/+5
    maintained by the target reflects that by cleaning it up.
    llvm-svn: 153477
* Add InitializeNativeTargetDisassembler function. | Eric Christopher | 2012-03-26 | 10 | -2/+65
    Patch by Ojab.
    llvm-svn: 153476
* Unit test for PR11950: LSR crash. | Andrew Trick | 2012-03-26 | 1 | -0/+49
    llvm-svn: 153472
* Use the file in the inlined die rather than the compile unit for | Eric Christopher | 2012-03-26 | 1 | -1/+2
    backtrace locations. Testcase forthcoming, but I wanted to get some
    testing here.

    Should fix:
    PR12323
    PR12314
    rdar://11091100
    llvm-svn: 153471
* [driver] -mkernel implies -fno-common, so claim the arg to avoid an unused | Chad Rosier | 2012-03-26 | 1 | -0/+1
    argument warning.
    Part of rdar://11120518
    llvm-svn: 153470
* [driver] -mkernel implies -fno-builtin, so claim the arg to avoid an unused | Chad Rosier | 2012-03-26 | 1 | -0/+1
    argument warning.
    Part of rdar://11120518
    llvm-svn: 153469
* 153465 was incorrect. In this code we wanted to check that the pointer | Nadav Rotem | 2012-03-26 | 1 | -4/+3
    operand is of pointer type (and not vector type).
    llvm-svn: 153468
* <rdar://problem/11022964> | Sean Callanan | 2012-03-26 | 1 | -0/+73
    Patched LLVM to handle generic i386 relocations. This avoids some sudden
    termination problems on i386 where the JIT would exit() out reporting
    "Invalid CPU type!"
    llvm-svn: 153467
* Made RuntimeDyldMachO support vanilla i386 | Sean Callanan | 2012-03-26 | 2 | -0/+44
    relocations. The algorithm is the same as that for x86_64. Scattered
    relocations, a feature present in i386 but not on x86_64, are not yet
    supported.
    llvm-svn: 153466
* PR12357: The pointer was used before it was checked. | Nadav Rotem | 2012-03-26 | 1 | -1/+3
    llvm-svn: 153465
* Forward-declared enumerations are now complete, except for an interaction | Richard Smith | 2012-03-26 | 2 | -1/+2
    between unscoped enumerations and class template member specializations,
    whose behavior is currently under discussion in CWG (and for which there
    is a preference to not implement the currently-standardized wording).
    llvm-svn: 153464
* LSR ivchain bug fix: corner case with ConstantExpr. | Andrew Trick | 2012-03-26 | 1 | -2/+3
    Fixes PR11950.
    llvm-svn: 153463
* comment typo | Andrew Trick | 2012-03-26 | 1 | -1/+1
    llvm-svn: 153462
* Add a special-case diagnostic for one of the more obnoxious special cases of | Richard Smith | 2012-03-26 | 3 | -0/+58
    unscoped enumeration members: an enumerator name which is visible in the
    out-of-class definition of a member of a templated class might not
    actually exist in the instantiation of that class, if the enumeration is
    also lexically defined outside the class definition and is explicitly
    specialized.

    Depending on the result of a CWG discussion, we may have a different
    resolution for a class of problems in this area, but this fixes the
    immediate issue of a crash-on-invalid / accepts-invalid (depending on
    +Asserts). Thanks to Johannes Schaub for digging into the standard
    wording to find how this case is currently specified to behave.
    llvm-svn: 153461
* [tests] Fix test failure in release mode. | Daniel Dunbar | 2012-03-26 | 1 | -1/+1
    llvm-svn: 153460
* Simplify code, no functionality change. | Benjamin Kramer | 2012-03-26 | 1 | -6/+4
    llvm-svn: 153459
* eliminate an unneeded branch, part of PR12357 | Chris Lattner | 2012-03-26 | 1 | -7/+2
    llvm-svn: 153458