path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* [asan] don't instrument functions with available_externally linkage. | Kostya Serebryany | 2013-03-18 | 1 | -0/+1
    This saves a bit of compile time and reduces the number of redundant global strings generated by asan (https://code.google.com/p/address-sanitizer/issues/detail?id=167)
    llvm-svn: 177250
* LoopVectorize: Invert case when we use a vector cmp value to query select cost | Arnold Schwaighofer | 2013-03-14 | 1 | -1/+1
    We generate a select with a vectorized condition argument when the condition is NOT loop invariant. Not the other way around.
    llvm-svn: 177098
* Perform factorization as a last resort of unsafe fadd/fsub simplification. | Shuxin Yang | 2013-03-14 | 1 | -5/+91
    Rules include:
      1) x*y +/- x*z => x*(y +/- z) (the order of operands doesn't matter)
      2) y/x +/- z/x => (y +/- z)/x
    The transformation is disabled if the new add/sub expr "y +/- z" is a denormal/NaN/infinity.
    rdar://12911472
    llvm-svn: 177088
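The following is a minimal sketch, not the InstCombine code, of why rule 1 counts as an "unsafe" simplification: the two forms can round or overflow differently. The function names and the guard `sumAllowsFactoring` are hypothetical; the guard only mirrors the restriction stated in the commit message (no denormal, NaN, or infinite "y +/- z").

```cpp
#include <cmath>

// Original form: two multiplies followed by one add.
double unfactored(double x, double y, double z) { return x * y + x * z; }

// Factored form: one add followed by one multiply.
double factored(double x, double y, double z) { return x * (y + z); }

// Hypothetical guard: reject the factored form when the new sum is
// infinite, NaN, or denormal (subnormal).
bool sumAllowsFactoring(double y, double z) {
  double s = y + z;
  return std::isfinite(s) && std::fpclassify(s) != FP_SUBNORMAL;
}
```

With x = 0.5 and y = z = 1e308, the unfactored form stays finite while the factored form overflows in the intermediate sum, which is the kind of case the guard rejects.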
* [ASan] emit instrumentation for initialization order checking by default | Alexey Samsonov | 2013-03-14 | 1 | -2/+2
    llvm-svn: 177063
* PR14972: SROA vs. GVN exposed a really bad bug in SROA. | Chandler Carruth | 2013-03-14 | 1 | -117/+124
    The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire. This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken.
    Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;] There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a *use* of a partition is split across multiple partitions.
    Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities. My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing *many* common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here.
    So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously.
    All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work.
    llvm-svn: 177055
* Remove accidentally committed debug line. | Nick Lewycky | 2013-03-14 | 1 | -1/+0
    llvm-svn: 177005
* Refactor GCOV's six constructor arguments into a struct with a getter that | Nick Lewycky | 2013-03-14 | 1 | -42/+59
    constructs default arguments. It can now take default arguments from cl::opt'ions. Add a new -default-gcov-version=... option, and actually test it! Sink the reverse-order of the version into GCOVProfiling, hiding it from our users.
    llvm-svn: 177002
* No functionality change. Rename emitGCNO() to the more sensible | Nick Lewycky | 2013-03-13 | 1 | -7/+7
    emitProfileNotes(), similar to emitProfileArcs(). Also update its comment. Also add a comment on Version[4] (there will be another comment in clang later), and compress lines that exceeded 80 columns.
    llvm-svn: 176994
* Fix a performance regression when combining to smaller types in icmp (shl %v, C1), C2: Only combine when the shl is only used by the icmp | Arnaud A. de Grandmaison | 2013-03-13 | 1 | -3/+4
    llvm-svn: 176950
* Change the order of the operands in patchAndReplaceAllUsesWith so | Dan Gohman | 2013-03-12 | 1 | -5/+5
    that they're more consistent with Value::replaceAllUsesWith.
    llvm-svn: 176872
* LibCallSimplifier: optimize speed for short-lived instances | Meador Inge | 2013-03-12 | 1 | -177/+225
    Nadav reported a performance regression due to the work I did to merge the library call simplifier into instcombine [1]. The issue is that a new LibCallSimplifier object is being created whenever InstCombiner::runOnFunction is called. Every time a LibCallSimplifier object is used to optimize a call it creates a hash table to map from a function name to an object that optimizes functions of that name. For short-lived LibCallSimplifier instances this is quite inefficient, especially for cases where no calls are actually simplified.
    This patch fixes the issue by dropping the hash table and implementing an explicit lookup function to correlate the function name to the object that optimizes functions of that name. This avoids the cost of always building and destroying the hash table in cases where the LibCallSimplifier object is short-lived and avoids the cost of building the table when no simplifications are actually performed.
    On a benchmark containing 100,000 calls where none of them are simplified I noticed a 30% speedup. On a benchmark containing 100,000 calls where all of them are simplified I noticed an 8% speedup.
    [1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130304/167639.html
    llvm-svn: 176840
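A rough sketch of the idea described above, assuming nothing about the real LibCallSimplifier internals: an explicit lookup function replaces a per-instance hash table, so a short-lived instance pays nothing for construction or teardown. The `Optimizer` enum, the function names listed, and `lookupOptimization` are all hypothetical stand-ins.

```cpp
#include <string>

// Hypothetical stand-in for "an object that optimizes functions of
// that name"; this is not an LLVM type.
enum class Optimizer { None, Strlen, Memcpy, Pow };

// Explicit lookup: no table is built or destroyed. A caller that never
// finds a match only pays for a few string comparisons.
Optimizer lookupOptimization(const std::string &name) {
  if (name == "strlen") return Optimizer::Strlen;
  if (name == "memcpy") return Optimizer::Memcpy;
  if (name == "pow") return Optimizer::Pow;
  return Optimizer::None;
}
```

The trade-off is that each lookup is a chain of comparisons rather than a hash probe, which is acceptable when instances are created far more often than they find a match.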
* Don't remove a landing pad if the invoke requires a table entry. | Bill Wendling | 2013-03-11 | 1 | -3/+17
    An invoke may require a table entry. For instance, when the function it calls is expected to throw.
    <rdar://problem/13360379>
    llvm-svn: 176827
* Use LLVMBool instead of 'bool' in the C API. Based on a patch by Peter Zotov! | Nick Lewycky | 2013-03-10 | 1 | -3/+3
    llvm-svn: 176793
* BBVectorize: Fixup debugging statements | Hal Finkel | 2013-03-10 | 1 | -2/+2
    After the recent data-structure improvements, a couple of debugging statements were broken (printing pointer values).
    llvm-svn: 176791
* Remove a source of nondeterminism from the LoopVectorizer. | Benjamin Kramer | 2013-03-09 | 1 | -1/+1
    This made us emit runtime checks in a random order. Hopefully bootstrap miscompares will go away now.
    llvm-svn: 176775
* LoopVectorizer: Ignore all dbg intrinsics | Arnold Schwaighofer | 2013-03-09 | 1 | -6/+6
    Ignore all DbgInfoIntrinsic instructions instead of just DbgValueInst.
    llvm-svn: 176769
* LoopVectorizer: Ignore dbg.value instructions | Arnold Schwaighofer | 2013-03-09 | 1 | -2/+11
    We want vectorization to happen at -g. Ignore calls to the dbg.value intrinsic and don't transfer them to the vectorized code.
    radar://13378964
    llvm-svn: 176768
* Simplify code. No functionality change. | Jakub Staszak | 2013-03-09 | 1 | -2/+2
    llvm-svn: 176765
* Use the correct index variable. This is the meat of what was supposed to be in | Nick Lewycky | 2013-03-09 | 1 | -1/+1
    r176751. Also, learn a lesson about applying patches by hand/eyeball.
    llvm-svn: 176764
* Fix bug introduced in r176616 when making function identifier numbers stable. | Nick Lewycky | 2013-03-09 | 1 | -5/+3
    Count the subprograms, not the compile units.
    llvm-svn: 176751
* Don't emit the extra checksum into the .gcda file if the user hasn't asked for | Nick Lewycky | 2013-03-09 | 1 | -4/+6
    it. Fortunately, versions of gcov that predate the extra checksum also ignore any extra data, so this isn't a problem. There will be a matching commit in compiler-rt.
    llvm-svn: 176745
* Insert the reduction start value into the first bypass block to preserve domination. | Benjamin Kramer | 2013-03-08 | 1 | -1/+1
    Fixes PR15344.
    llvm-svn: 176701
* Keep coding standard. | Jakub Staszak | 2013-03-07 | 1 | -4/+3
    llvm-svn: 176661
* Don't create IRBuilder if we can return from the method earlier. | Jakub Staszak | 2013-03-07 | 1 | -2/+2
    llvm-svn: 176660
* Fixed a crash when cloning a function into a function with | Pekka Jaaskelainen | 2013-03-07 | 1 | -3/+6
    different size argument list and without attributes in the arguments.
    llvm-svn: 176632
* Switch from a version 4.2/4.4 switch to a four-byte version string to be put | Nick Lewycky | 2013-03-07 | 1 | -23/+25
    into the actual gcov file. Instead of using the bottom 4 bytes as the function identifier, use a counter. This makes the identifier numbers stable across multiple runs.
    llvm-svn: 176616
* SimplifyCFG fix for volatile load/store. | Andrew Trick | 2013-03-07 | 1 | -2/+4
    Fixes rdar:13349374. Volatile loads and stores need to be preserved even if the language standard says they are undefined. "volatile" in this context means "get out of the way compiler, let my platform handle it". Additionally, this is the only way I know of with llvm to write to the first page (when hardware allows) without dropping to assembly.
    llvm-svn: 176599
* Generalize my previous fix for -print-options. | Andrew Trick | 2013-03-06 | 1 | -1/+1
    Always print options that differ from their implicit default. At least for simple option types.
    llvm-svn: 176572
* Give -loop-vectorize an explicit default. | Andrew Trick | 2013-03-06 | 1 | -1/+1
    This way, clang -mllvm -print-options shows that the driver is overriding it.
    llvm-svn: 176569
* InstCombine: Don't shrink allocas when combining with a bitcast. | Jim Grosbach | 2013-03-06 | 1 | -0/+6
    When considering folding a bitcast of an alloca into the alloca itself, make sure we don't shrink the amount of memory being allocated, or things rapidly go sideways.
    rdar://13324424
    llvm-svn: 176547
* Check isDiscardableIfUnused, rather than hasLocalLinkage, when bumping | Lang Hames | 2013-03-04 | 1 | -3/+3
    GlobalValue linkage up to ExternalLinkage in the ExtractGV pass. This prevents linkonce and linkonce_odr symbols from being DCE'd.
    llvm-svn: 176459
* Bypass Slow Divides | Preston Gurd | 2013-03-04 | 2 | -3/+3
    * Only apply divide bypass optimization when not optimizing for size.
    * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead.
    * For Atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough.
    * Added lit tests for 64-bit divide bypass.
    Patch by Tyler Nowicki!
    llvm-svn: 176442
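The divide bypass mentioned in the list above can be illustrated with a source-level sketch. This is an assumption-laden illustration of the technique, not the pass itself (which rewrites IR): `bypassDivide` is a hypothetical name, and the 16-bit threshold follows the Atom x86-64 case described in the commit message.

```cpp
#include <cstdint>

// Returns a / b, taking the cheap path when both operands fit in 16 bits.
// Precondition: b != 0.
uint64_t bypassDivide(uint64_t a, uint64_t b) {
  // OR-ing the operands lets a single test cover both: any bit set above
  // bit 15 in either operand makes the shifted value non-zero.
  if (((a | b) >> 16) == 0) {
    // In C++ these operands are promoted to int before the divide; the
    // casts only express that a 16-bit divide would be sufficient here.
    return static_cast<uint16_t>(a) / static_cast<uint16_t>(b);
  }
  // Slow path: full 64-bit divide.
  return a / b;
}
```

The extra compare and branch only pay off when small operands are common and the wide divide is much slower than the narrow one, which is why the commit restricts the optimization to builds that are not optimizing for size.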
* PR14448 - prevent the loop vectorizer from vectorizing the same loop twice. | Nadav Rotem | 2013-03-02 | 1 | -0/+18
    The LoopVectorizer often runs multiple times on the same function due to inlining. When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches. With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time.
    PR14448.
    llvm-svn: 176399
* Modify {Call,Invoke}Inst::addAttribute to take an AttrKind. | Peter Collingbourne | 2013-03-02 | 1 | -2/+1
    llvm-svn: 176397
* LoopVectorize: Don't hang forever if a PHI only has skipped PHI uses. | Benjamin Kramer | 2013-03-01 | 1 | -1/+8
    Fixes PR15384.
    llvm-svn: 176366
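As a toy model of the marking scheme described above, assuming a loop is just a bag of string tags: the `Loop` struct, the tag name, and `vectorizeOnce` are hypothetical and do not correspond to LLVM's metadata API.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a loop that can carry metadata tags.
struct Loop {
  std::set<std::string> metadata;
};

// Tag name invented for this sketch.
const char *const kAlreadyVectorized = "already_vectorized";

// Returns true if the loop was processed, false if it was skipped because
// an earlier run had already marked it.
bool vectorizeOnce(Loop &loop) {
  if (loop.metadata.count(kAlreadyVectorized))
    return false;
  // ... the actual vectorization would happen here ...
  loop.metadata.insert(kAlreadyVectorized);
  return true;
}
```

Storing the marker on the loop itself, rather than in the pass, is what lets the information survive across separate runs of the pass.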
* Fix a bug in instcombine for fmul in fast math mode. | Quentin Colombet | 2013-02-28 | 1 | -3/+3
    The instcombine recognized pattern looks like:
      a = b * c
      d = a +/- Cst
    or
      a = b * c
      d = Cst +/- a
    When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1.
    llvm-svn: 176300
* [msan] Implement sanitize_memory attribute. | Evgeniy Stepanov | 2013-02-28 | 1 | -14/+38
    Shadow checks are disabled and memory loads always produce fully initialized values in functions that don't have a sanitize_memory attribute. Value and argument shadow is propagated as usual. This change also updates blacklist behaviour to match the above.
    llvm-svn: 176247
* Remove unused leftover declarations. | Evgeniy Stepanov | 2013-02-28 | 1 | -5/+0
    llvm-svn: 176240
* LoopVectorize: Vectorize math builtin calls. | Benjamin Kramer | 2013-02-27 | 1 | -50/+137
    This properly asks TargetLibraryInfo if a call is available and if it is, it can be translated into the corresponding LLVM builtin. We don't vectorize sqrt() yet because I'm not sure about the semantics for negative numbers. The other intrinsics should be exact equivalents to the libm functions.
    Differential Revision: http://llvm-reviews.chandlerc.com/D465
    llvm-svn: 176188
* In GCC 4.7, function names are now forbidden from .gcda files. Support this by | Nick Lewycky | 2013-02-27 | 1 | -8/+14
    passing a null pointer to the function name in to GCDAProfiling, and add another switch onto GCOVProfiling.
    llvm-svn: 176173
* Doh, fix behaviour change introduced in r176168 which is tested in clang, | Nick Lewycky | 2013-02-27 | 1 | -1/+3
    not llvm.
    llvm-svn: 176172
* For each function that we optimize we initialize a new list of lib functions. | Nadav Rotem | 2013-02-27 | 1 | -1/+2
    For each function name we malloc memory. This patch changes the Libcall map to use BumpPtrAllocator. Now we malloc only once. This speeds up instcombine by a few % on a large C++ program.
    llvm-svn: 176170
* IRBuilder has grown all sorts of useful utility functions. Make use of them to | Nick Lewycky | 2013-02-27 | 1 | -25/+16
    clean up this code a tiny bit. No functionality change.
    llvm-svn: 176168
* Enhance integer division emulation support to handle types smaller than 32 bits, | Pedro Artigas | 2013-02-26 | 1 | -0/+104
    enhancement done the trivial way: by extending inputs and truncating outputs, which is adequate for targets with little or no support for integer arithmetic on integer types less than 32 bits.
    llvm-svn: 176139
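The "extend inputs, truncate outputs" approach can be sketched at source level as follows. This is an illustration under assumptions, not the emulation code from the commit: `divide32` is a hypothetical stand-in for whatever 32-bit division routine the target already has, and `divide8` is an invented name.

```cpp
#include <cstdint>

// Hypothetical stand-in for an existing 32-bit signed division routine.
// Precondition: b != 0.
int32_t divide32(int32_t a, int32_t b) { return a / b; }

// 8-bit signed division built on the 32-bit routine.
// Precondition: b != 0.
int8_t divide8(int8_t a, int8_t b) {
  int32_t wideA = a;  // sign-extend the dividend to 32 bits
  int32_t wideB = b;  // sign-extend the divisor to 32 bits
  // Truncate the 32-bit quotient back to the original width.
  return static_cast<int8_t>(divide32(wideA, wideB));
}
```

For unsigned types the same shape applies with zero extension instead of sign extension.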
* Unify clang/llvm attributes for asan/tsan/msan (LLVM part) | Kostya Serebryany | 2013-02-26 | 1 | -2/+2
    These are two related changes (one in llvm, one in clang).
    LLVM:
      - rename address_safety => sanitize_address (the enum value is the same, so we preserve binary compatibility with old bitcode)
      - rename thread_safety => sanitize_thread
      - rename no_uninitialized_checks -> sanitize_memory
    CLANG:
      - add __attribute__((no_sanitize_address)) as a synonym for __attribute__((no_address_safety_analysis))
      - add __attribute__((no_sanitize_thread))
      - add __attribute__((no_sanitize_memory))
    for S in address thread memory
      If -fsanitize=S is present and __attribute__((no_sanitize_S)) is not set
        llvm attribute sanitize_S
    llvm-svn: 176075
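The "for S in address thread memory" rule at the end of this message can be written out as a small sketch. The function name and the use of string sets are assumptions made for illustration; this is not how clang represents flags or attributes internally.

```cpp
#include <initializer_list>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: given the sanitizers enabled via -fsanitize=S and
// the set of S for which the function carries no_sanitize_S, return the
// LLVM function attributes that would be attached.
std::vector<std::string> sanitizeAttributes(
    const std::set<std::string> &enabledSanitizers,
    const std::set<std::string> &noSanitizeAttrs) {
  std::vector<std::string> result;
  for (const char *s : {"address", "thread", "memory"}) {
    // Attach sanitize_S only if S is enabled and not opted out.
    if (enabledSanitizers.count(s) && !noSanitizeAttrs.count(s))
      result.push_back(std::string("sanitize_") + s);
  }
  return result;
}
```

For example, with address and memory enabled and a function marked no_sanitize_memory, only sanitize_address is returned.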
* CVP: If we have a PHI with an incoming select, try to skip the select. | Benjamin Kramer | 2013-02-24 | 1 | -5/+24
    This is a common pattern with dyn_cast and similar constructs, when the PHI no longer depends on the select it can often be turned into a simpler construct or even get hoisted out of the loop.
    PR15340.
    llvm-svn: 175995
* Fixed a careless mistake. | Michael Gottesman | 2013-02-23 | 1 | -1/+1
    rdar://13273675.
    llvm-svn: 175939
* Implement the NoBuiltin attribute. | Bill Wendling | 2013-02-22 | 2 | -1/+2
    The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code.
    llvm-svn: 175835
* Allow GlobalValues to vectorize with AliasAnalysis | Renato Golin | 2013-02-21 | 1 | -35/+154
    Store the load/store instructions with the values and inspect them using Alias Analysis to make sure they don't alias, since the GEP pointer operand doesn't take the offset into account.
    Trying hard to not add any extra cost to loads and stores that don't overlap on global values, AA is *only* calculated if all of the previous attempts failed.
    Use the biggest vector register size as the stride for the vectorization access, as we're being conservative and the cost model (which calculates the real vectorization factor) is only run after the legalization phase. We might re-think this relationship in the future, but for now, I'd rather be safe than sorry.
    llvm-svn: 175818
* Remove dead code and whitespace. | Chad Rosier | 2013-02-21 | 1 | -10/+0
    llvm-svn: 175804