Commit message (Author, Age, Files, Lines)
* [OPENMP] Propagate alignment from original variables to the private copies. (Alexey Bataev, 2015-09-10, 6 files, -80/+100)
| | | | | | Currently private copies of captured variables have default alignment. This patch gives private copies the same alignment as the original variables. llvm-svn: 247260
* [ADT] Force inline several super boring and unusually hot methods on (Chandler Carruth, 2015-09-10, 1 file, -0/+7)
| | | | | | | | | | | | | | SmallVector to further help debug builds not waste their time calling one line functions. To give you an idea of why this is worthwhile, this change alone gets another >10% reduction in the runtime of TripleTest.Normalization! It's now under 9 seconds for me. Sadly, this is the end of the easy wins for that test. Anything further will require some different architecture of the test itself. Still, I'm pretty happy. 'check-llvm' now is under 35s for me. llvm-svn: 247259
* Add a deprecation notice to the clang-modernize documentation. (Alexander Kornienko, 2015-09-10, 1 file, -28/+9)
| | | | | | | | | | | | | | Summary: Add a deprecation notice to the clang-modernize documentation. Remove the reference to the external JIRA tracker. Reviewers: revane, klimek Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D12732 llvm-svn: 247258
* [ADT] Micro-optimize and force inlining for string switches. (Chandler Carruth, 2015-09-10, 1 file, -5/+45)
| | | | | | | | | | | | | | These are now quite heavily used in unit tests and the host tools, making it worth having them be reasonably fast even in an unoptimized build. This change reduces the total runtime of TripleTest.Normalization by yet another 10% to 15%. It is now under 10 seconds on my machine, and the total check-llvm time has dropped from 38s to around 36s. I experimented with a number of different options, and the code pattern here consistently seemed to lower the cleanest, likely due to the significantly simpler CFG and far fewer redundant tests of 'Result'. llvm-svn: 247257
* Fix an AttributeError in dotest.py if --executable points to a wrong place (Ilia K, 2015-09-10, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following case:
```
$ ./dotest.py --executable=~/p/llvm/build_ninja/bin/lldb tools/lldb-mi/
'~/p/llvm/build_ninja/bin/lldb' is not a path to a valid executable
Traceback (most recent call last):
  File "./dotest.py", line 1306, in <module>
    setupSysPath()
  File "./dotest.py", line 1004, in setupSysPath
    if not lldbtest_config.lldbExec:
AttributeError: 'module' object has no attribute 'lldbExec'
```
And with this fix:
```
$ ./dotest.py --executable=~/p/llvm/build_ninja/bin/lldb tools/lldb-mi/
'~/p/llvm/build_ninja/bin/lldb' is not a path to a valid executable
The 'lldb' executable cannot be located. Some of the tests may not be run as a result.
```
llvm-svn: 247256
* [OPENMP] Fix test incompatibility with 32-bit platforms (Alexey Bataev, 2015-09-10, 1 file, -2/+2)
| | | | llvm-svn: 247255
* [ARM] Do not use vtrn for vectorshuffle if the order is reversed (James Molloy, 2015-09-10, 4 files, -4/+63)
| | | | | | | | The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254
* [ADT] Apply a large hammer to StringRef functions: attribute always_inline. (Chandler Carruth, 2015-09-10, 1 file, -0/+20)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logic of this follows something Howard does in libc++ and something I discussed with Chris eons ago -- for a lot of functions, there is really no benefit to preserving "debug information" by leaving them out-of-line even in debug builds. This is especially true as we now do a very good job of preserving most debug information even in the face of inlining. There are a bunch of methods in StringRef that we are paying a completely unacceptable amount for with every debug build of every LLVM developer. Some day, we should fix Clang/LLVM so that developers can reasonably use a default of something other than '-O0' and not waste their lives waiting on *completely* unoptimized code to execute. We should have a default that doesn't impede debugging while providing at least plausible performance. But today is not that day. So today, I'm applying always_inline to the functions that are really hurting the critical path for stuff like 'check-llvm'. I'm being very cautious here, but there are a few other APIs that we really should do this for as a matter of pragmatism. Hopefully we can rip this out some day. With this change, TripleTest.Normalization runtime decreases by over 10%, and the total 'check-llvm' time on my 48-core box goes from 38s to just under 37s. llvm-svn: 247253
* [Support] Fix the always_inline attribute macro to not include the (Chandler Carruth, 2015-09-10, 1 file, -1/+1)
| | | | | | | | 'inline' specifier. That specifier may or may not be valid for a given function, or it may be required for correct linkage even when the compiler doesn't support the always_inline attribute. llvm-svn: 247252
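The separation described in r247252 can be sketched as follows, assuming a GCC- or Clang-compatible compiler. The macro name `SKETCH_ALWAYS_INLINE` and the functions below are hypothetical and are not the names used in LLVM's Support headers. The macro carries only the attribute; `inline` is written at the declaration where it is needed, so a header-defined function keeps valid linkage even when the macro expands to nothing.

```cpp
#include <cassert>

// Hypothetical macro: it expands to the attribute only, never to 'inline'.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE
#endif

// A namespace-scope function defined in a header still needs 'inline' for
// correct linkage, so the keyword is spelled out next to the macro.
SKETCH_ALWAYS_INLINE inline int addOne(int X) { return X + 1; }

// A member function defined inside its class is implicitly inline, so only
// the attribute macro is applied here.
struct CounterSketch {
  int Value = 0;
  SKETCH_ALWAYS_INLINE int next() { return Value = addOne(Value); }
};

// Small usage example.
inline void alwaysInlineUsage() {
  CounterSketch C;
  assert(C.next() == 1);
  assert(addOne(41) == 42);
}
```

If the macro also expanded to `inline`, the keyword would disappear together with the attribute on compilers that lack it, which is the linkage problem the commit message mentions.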
* [OPENMP] Outlined function for parallel and other regions with list of captured variables. (Alexey Bataev, 2015-09-10, 31 files, -616/+542)
| | | | | | | | | Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. This may result in poor performance because the optimization passes later have to perform overly complex analysis. The patch emits outlined functions for parallel-based regions with a list of captured variables. This reduces the generated code by at least 2*n GEPs, stores and loads. Codegen for task-based regions remains unchanged because the runtime requires that all captured variables are passed in a captured record. llvm-svn: 247251
* [ADT] Micro-optimize the Triple constructor by doing a single split and (Chandler Carruth, 2015-09-10, 1 file, -8/+21)
| | | | | | | | | | | | re-using the resulting components rather than repeatedly splitting and re-splitting to compute each component as part of the initializer list. This is more work on PR23676. Sadly, it doesn't help much. It removes the constructor from my profile, but doesn't make a sufficient dent in the total time. But it should play together nicely with subsequent changes. llvm-svn: 247250
* [ADT] Fix a confusing interface spec and some annoying peculiarities (Chandler Carruth, 2015-09-10, 3 files, -33/+93)
| | | | | | | | | | | | | | | | | | | | | | | | | | | with the StringRef::split method when used with a MaxSplit argument other than '-1' (which nobody really does today, but which should actually work). The spec claimed both to split up to MaxSplit times, but also to append <= MaxSplit strings to the vector. One of these doesn't make sense. Given the name "MaxSplit", let's go with it being a max over how many *splits* occur, which means the max on how many strings get appended is MaxSplit+1. I'm not actually sure the implementation correctly provided this logic either, as it used a really opaque loop structure. The implementation was also playing weird games with nullptr in the data field to try to rely on a totally opaque hidden property of the split method that returns a pair. Nasty IMO. Replace all of this with what is (IMO) simpler code that doesn't use the pair returning split method, and instead just finds each separator and appends directly. I think this is a lot easier to read, and it most definitely matches the spec. Added some tests that exercise the corner cases around StringRef() and StringRef("") that all now pass. I'll start using this in code in the next commit. llvm-svn: 247249
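The "MaxSplit counts splits, so at most MaxSplit+1 strings are appended" rule can be illustrated with a small standalone sketch. It uses `std::string_view` instead of `StringRef`, and the function name `splitSketch` and its exact signature are assumptions made for illustration; this is not the code from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the behaviour described above.
// Splits S on Separator at most MaxSplit times (a negative value means
// "no limit"), so at most MaxSplit + 1 pieces are appended to Out.
inline void splitSketch(std::string_view S, std::vector<std::string_view> &Out,
                        std::string_view Separator, int MaxSplit = -1,
                        bool KeepEmpty = true) {
  assert(!Separator.empty() && "this sketch requires a non-empty separator");
  for (int Splits = 0; MaxSplit < 0 || Splits < MaxSplit; ++Splits) {
    std::size_t Idx = S.find(Separator);
    if (Idx == std::string_view::npos)
      break; // no separator left, stop splitting
    if (KeepEmpty || Idx > 0)
      Out.push_back(S.substr(0, Idx));
    // Continue after the separator that was just consumed.
    S.remove_prefix(Idx + Separator.size());
  }
  // Whatever remains is the final piece; it is never split further.
  if (KeepEmpty || !S.empty())
    Out.push_back(S);
}
```

With this sketch, splitting "a,b,c,d" on "," with MaxSplit set to 2 performs two splits and appends three pieces: "a", "b" and "c,d".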
* [MS ABI] Select a pointer to member representation more often (David Majnemer, 2015-09-10, 4 files, -1/+27)
| | | | | | | | | Given a reference to a pointer to member whose class's inheritance model is unspecified, make sure we come up with an inheritance model in plausible places. One place we were missing involved LValue to RValue conversion, another involved unary type traits. llvm-svn: 247248
* GlobalsAAResult(&&): Move every member. (NAKAMURA Takumi, 2015-09-10, 1 file, -1/+6)
| | | | | | Otherwise, one of the MSVC builders failed with unexpected behavior. llvm-svn: 247247
* Added isUndef() interface for SDNode (Elena Demikhovsky, 2015-09-10, 1 file, -0/+7)
| | | | | | Differential Revision: http://reviews.llvm.org/D12720 llvm-svn: 247246
* [ADT] Switch a bunch of places in LLVM that were doing single-character (Chandler Carruth, 2015-09-10, 9 files, -16/+16)
| | | | | | | splits to actually use the single character split routine which does less work, and in a debug build is *substantially* faster. llvm-svn: 247245
* [ADT] Add a single-character version of the small vector split routine (Chandler Carruth, 2015-09-10, 4 files, -1/+43)
| | | | | | | | | | | on StringRef. Finding and splitting on a single character is substantially faster than doing it on even a single character StringRef -- we immediately get to a *very* tuned memchr call this way. Even nicer, we get to this even in a debug build, shaving 18% off the runtime of TripleTest.Normalization, helping PR23676 some more. llvm-svn: 247244
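The idea of reaching `memchr` directly when splitting on a single character can be sketched as below. The sketch uses `std::string_view` and a hypothetical function name `splitOnCharSketch`; it only illustrates the technique and is not the routine added by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical sketch: split S on one character, locating each separator
// with std::memchr rather than a general substring search.
inline void splitOnCharSketch(std::string_view S,
                              std::vector<std::string_view> &Out,
                              char Separator) {
  // The emptiness check also avoids handing memchr a null data pointer.
  while (!S.empty()) {
    const void *Hit = std::memchr(S.data(), Separator, S.size());
    if (!Hit)
      break;
    std::size_t Idx =
        static_cast<std::size_t>(static_cast<const char *>(Hit) - S.data());
    assert(Idx < S.size());
    Out.push_back(S.substr(0, Idx));
    S.remove_prefix(Idx + 1); // skip the separator itself
  }
  Out.push_back(S); // trailing piece, possibly empty
}
```

Applied to a triple-like string such as "x86_64-unknown-linux-gnu" with '-' as the separator, the sketch appends four pieces.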
* Add a way to skip the Go bindings tests even when Go is configured in (Chandler Carruth, 2015-09-10, 4 files, -0/+6)
| | | | | | | | | | | | | CMake. The Go bindings tests in an unoptimized build take over 30 seconds for me, making it the slowest test in 'check-llvm' by a factor of two. I've only rigged this up fully to the CMake build. If someone is interested in rigging it up to the autoconf build, they're welcome to do so. llvm-svn: 247243
* [ScalarEvolution] Fix PR24757. (Sanjoy Das, 2015-09-10, 2 files, -2/+77)
| | | | | | | | | | | | | | | | | | | Summary: PR24757 was caused by some incorrect math in `ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X in 2^N * A = 2^N * X is not necessarily A. Reviewers: atrick, majnemer, meheff Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12721 llvm-svn: 247242
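A small worked example may help, under the assumption that the equation is evaluated in fixed-width arithmetic, i.e. modulo 2^W for W-bit values. With W = 8, N = 4 and A = 17, 2^4 * 17 = 272, which wraps to 16; X = 1 gives 2^4 * 1 = 16 as well, so the smallest unsigned solution is 1, not 17. In general the solutions are the X congruent to A modulo 2^(W-N). The brute-force helper below is hypothetical and only checks this for 8-bit values; it is unrelated to the actual fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest unsigned X such that
// (2^N * X) == (2^N * A) in 8-bit wrapping arithmetic, found by brute force.
inline unsigned smallestSolution8(unsigned N, std::uint8_t A) {
  assert(N < 8 && "shift amount must be smaller than the bit width");
  const std::uint8_t Target = static_cast<std::uint8_t>(A << N);
  for (unsigned X = 0; X < 256; ++X)
    if (static_cast<std::uint8_t>(X << N) == Target)
      return X;
  return A; // not reached: X == A always satisfies the equation
}
```

When N is 0 nothing is lost to wrapping and the helper simply returns A.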
* [LPM] Simplify this code and fix a compile error for compilers that (Chandler Carruth, 2015-09-10, 1 file, -3/+1)
| | | | | | | | don't correctly implement the scoping rules of C++11 range based for loops. This kind of aliasing isn't a good idea anyways (and wasn't really intended). llvm-svn: 247241
* [LPM] Use a map from analysis ID to immutable passes in the legacy pass (Chandler Carruth, 2015-09-10, 2 files, -22/+28)
| | | | | | | | | | | | manager to avoid a slow linear scan of every immutable pass on every attempt to find an analysis pass. This speeds up 'check-llvm' on an unoptimized build for me by 15%, YMMV. It should also help (a tiny bit) other folks that are really bottlenecked on repeated runs of tiny pass pipelines across small IR files. llvm-svn: 247240
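The difference between scanning every immutable pass and looking one up by ID can be sketched as follows. All type and member names here (`AnalysisIDSketch`, `ImmutablePassSketch`, `PassRegistrySketch`) are hypothetical, and the choice of `std::unordered_map` is an assumption; the legacy pass manager's real data structures are not shown in this log.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical types standing in for the pass manager's own.
using AnalysisIDSketch = const void *;

struct ImmutablePassSketch {
  AnalysisIDSketch ID;
  const char *Name;
};

class PassRegistrySketch {
  std::vector<ImmutablePassSketch *> Passes;
  std::unordered_map<AnalysisIDSketch, ImmutablePassSketch *> PassesByID;

public:
  void add(ImmutablePassSketch *P) {
    assert(P && "cannot register a null pass");
    Passes.push_back(P);
    PassesByID.emplace(P->ID, P); // the first registration for an ID wins
  }

  // Old style: walk every registered pass on every query.
  ImmutablePassSketch *findLinear(AnalysisIDSketch ID) const {
    for (ImmutablePassSketch *P : Passes)
      if (P->ID == ID)
        return P;
    return nullptr;
  }

  // New style: one hash lookup per query.
  ImmutablePassSketch *findMapped(AnalysisIDSketch ID) const {
    auto It = PassesByID.find(ID);
    return It == PassesByID.end() ? nullptr : It->second;
  }
};
```

Both lookups return the same pass; the mapped version avoids touching every registered pass when the query is repeated many times.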
* CFI: Add diagnostic handler and tests for indirect call checker. (Peter Collingbourne, 2015-09-10, 5 files, -1/+91)
| | | | | | Differential Revision: http://reviews.llvm.org/D11858 llvm-svn: 247239
* CFI: Introduce -fsanitize=cfi-icall flag. (Peter Collingbourne, 2015-09-10, 19 files, -126/+230)
| | | | | | | | | | This flag causes the compiler to emit bit set entries for functions as well as runtime bitset checks at indirect call sites. Depends on the new function bitset mechanism. Differential Revision: http://reviews.llvm.org/D11857 llvm-svn: 247238
* Enable the shrink wrapping optimization for PPC64. (Kit Barton, 2015-09-10, 4 files, -77/+645)
| | | | | | | | | | | | | | The changes in this patch are as follows:
1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function
2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function
3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line
A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64. Phabricator review: http://reviews.llvm.org/D11817 llvm-svn: 247237
* [AArch64] Match FI+offset in STNP addressing mode. (Ahmed Bougacha, 2015-09-10, 3 files, -6/+28)
| | | | | | | | | | | | | | | First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236
* [MC] Convert all the remaining tests from macho-dump to llvm-readobj. (Davide Italiano, 2015-09-10, 37 files, -4561/+5023)
| | | | | | | | | This sort-of deprecates macho-dump. It may still take a little while to garbage collect it, but at least there's no real usage of it in the tree anymore. New tests should always rely on llvm-readobj or llvm-objdump. llvm-svn: 247235
* [AArch64] Match base+offset in STNP addressing mode. (Ahmed Bougacha, 2015-09-10, 2 files, -14/+181)
| | | | | | Followup to r247231. llvm-svn: 247234
* EmitRecord* API change: accepts ArrayRef instead of a SmallVector (NFC) (Mehdi Amini, 2015-09-10, 4 files, -236/+148)
| | | | | | | | | | | | | | | This reapplies a variant of commit r247179 after post-commit review from D.Blaikie. Hopefully I got it right this time: the lifetime of an initializer list ends as with any expression, which makes the following pattern invalid:

    ArrayRef<int> Arr = { 1, 2, 3, 4};

Just like StringRef, ArrayRef shouldn't be used to initialize a local variable but only as a function argument. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247233
* Make EmitRecord() accept ArrayRef and raw arrays (NFC) (Mehdi Amini, 2015-09-10, 1 file, -5/+5)
| | | | | | | | After r247186, a vector is no longer needed as the push_front for the code is removed. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247232
* [AArch64] Support selecting STNP. (Ahmed Bougacha, 2015-09-10, 4 files, -0/+270)
| | | | | | | | | | | | | | | | | | We could go through the load/store optimizer and match STNP where we would have matched a nontemporal-annotated STP, but that's not reliable enough, as an opportunistic optimization. Instead, we can guarantee emitting STNP, by matching them at ISel. Since there are no single-input nontemporal stores, we have to resort to some high-bits-extracting trickery to generate an STNP from a plain store. Also, we need to support another, LDP/STP-specific addressing mode, base + signed scaled 7-bit immediate offset. For now, only match the base. Let's make it smart separately. Part of PR24086. llvm-svn: 247231
* AMDGPU/SI: Fix more cases of losing exec operands (Matt Arsenault, 2015-09-10, 3 files, -16/+12)
| | | | llvm-svn: 247230
* AMDGPU/SI: Fix creating v_mov_b32s without exec uses (Matt Arsenault, 2015-09-10, 1 file, -2/+14)
| | | | | | | This will be caught by existing tests with a verifier check to be added in a future commit. llvm-svn: 247229
* Don't crash when emitting a block under returns_nonnull. (John McCall, 2015-09-10, 2 files, -2/+15)
| | | | | | rdar://22071955 llvm-svn: 247228
* Removed debug prints that I accidentally left in. (Greg Clayton, 2015-09-10, 1 file, -6/+0)
| | | | llvm-svn: 247227
* Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" (Hans Wennborg, 2015-09-10, 9 files, -94/+95)
| | | | | | | This caused build breakages, e.g. http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926 llvm-svn: 247226
* [CodeGen] Make x86 nontemporal store patfrags generic. NFC. (Ahmed Bougacha, 2015-09-10, 2 files, -19/+18)
| | | | | | To be used by other targets. llvm-svn: 247225
* On MacOSX, revamp the way we link against the llvm/clang .a files by making a text file that contains all .a filenames and use that when linking in Xcode. (Greg Clayton, 2015-09-10, 1 file, -38/+39)
| | | | | | llvm-svn: 247224
* [RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC] (Philip Reames, 2015-09-10, 1 file, -8/+1)
| | | | llvm-svn: 247223
* Revert r247218: "Fix Clang-tidy misc-use-override warnings, other minor fixes" (Hans Wennborg, 2015-09-10, 6 files, -32/+30)
| | | | | | | | | | | Seems it broke the Polly build. From http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/11687/steps/compile/logs/stdio:

    In file included from /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/lib/TableGen/Record.cpp:14:0:
    /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:369:3: error: looser throw specifier for 'virtual llvm::TypedInit::~TypedInit()'
    /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:270:11: error: overriding 'virtual llvm::Init::~Init() noexcept (true)'

llvm-svn: 247222
* [RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC] (Philip Reames, 2015-09-10, 1 file, -3/+3)
| | | | | | The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point iteration. That's not true. The tighter form of the assert is useful documentation. llvm-svn: 247221
* [RewriteStatepointsForGC] One last bit of naming [NFCI] (Philip Reames, 2015-09-10, 1 file, -7/+7)
| | | | llvm-svn: 247220
* [WinEH] Add codegen support for cleanuppad and cleanupret (Reid Kleckner, 2015-09-10, 13 files, -70/+129)
| | | | | | | | | | | | | | | | | | | | | | | | | | | All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32:

    extern "C" int printf(const char *, ...) noexcept;
    struct Dtor {
      ~Dtor() { printf("~Dtor\n"); }
    };
    void has_cleanup() {
      Dtor o;
      throw 42;
    }
    int main() {
      try {
        has_cleanup();
      } catch (int) {
        printf("caught it\n");
      }
    }

Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219
* Fix Clang-tidy misc-use-override warnings, other minor fixes (Hans Wennborg, 2015-09-10, 6 files, -30/+32)
| | | | | | | | Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12741 llvm-svn: 247218
* [RewriteStatepointsForGC] Further style/naming fixup [NFCI] (Philip Reames, 2015-09-10, 1 file, -26/+26)
| | | | llvm-svn: 247217
* Fix Clang-tidy misc-use-override warnings, other minor fixes (Hans Wennborg, 2015-09-10, 9 files, -95/+94)
| | | | | | | | Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12740 llvm-svn: 247216
* Bitcode Writer: EmitRecordWith* takes an ArrayRef instead of a SmallVector (NFC) (Mehdi Amini, 2015-09-10, 1 file, -21/+22)
| | | | | | | | This reapplies commit r247178 after post-commit review from D.Blaikie in a way that makes it compatible with the existing API. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247215
* Add makeArrayRef() overload for ArrayRef input (no-op/identity) NFC (Mehdi Amini, 2015-09-10, 2 files, -0/+26)
| | | | | | | | | | | | | | | The purpose is to allow a templated wrapper to work with either ArrayRef or any convertible operation:

    template<typename Container>
    void wrapper(const Container &Arr) {
      impl(makeArrayRef(Arr));
    }

with Container being a std::vector, a SmallVector, or an ArrayRef. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247214
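Why the identity overload matters for the wrapper above can be shown with a self-contained sketch. `ArrayRefSketch`, `makeArrayRefSketch`, `sumImpl` and `wrapperSketch` are hypothetical stand-ins written for this illustration; they mimic the shape of the API described in the commit but are not LLVM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over a contiguous range.
template <typename T> class ArrayRefSketch {
  const T *Data;
  std::size_t Length;

public:
  ArrayRefSketch(const std::vector<T> &V) : Data(V.data()), Length(V.size()) {}
  std::size_t size() const { return Length; }
  const T &operator[](std::size_t I) const {
    assert(I < Length && "index out of range");
    return Data[I];
  }
};

// Overload for containers: builds a view over the vector's storage.
template <typename T>
ArrayRefSketch<T> makeArrayRefSketch(const std::vector<T> &V) {
  return V;
}

// Identity overload: lets generic code pass an existing view unchanged.
template <typename T>
ArrayRefSketch<T> makeArrayRefSketch(const ArrayRefSketch<T> &A) {
  return A;
}

inline int sumImpl(ArrayRefSketch<int> Arr) {
  int Sum = 0;
  for (std::size_t I = 0; I < Arr.size(); ++I)
    Sum += Arr[I];
  return Sum;
}

// Works for both std::vector<int> and ArrayRefSketch<int> arguments.
template <typename Container> int wrapperSketch(const Container &Arr) {
  return sumImpl(makeArrayRefSketch(Arr));
}
```

Without the identity overload, the call inside `wrapperSketch` would fail to compile when `Container` is already a view, because template argument deduction does not consider the conversion from the view to a vector.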
* [RewriteStatepointsForGC] More naming cleanup [NFCI] (Philip Reames, 2015-09-10, 1 file, -6/+6)
| | | | llvm-svn: 247213
* [RewriteStatepointsForGC] Code cleanup [NFC] (Philip Reames, 2015-09-09, 1 file, -25/+26)
| | | | | | Factor out common code related to naming values, fix a small style issue. More to follow in separate changes. llvm-svn: 247211
* [RewriteStatepointsForGC] Extend base pointer inference to handle insertelement (Philip Reames, 2015-09-09, 2 files, -59/+144)
| | | | | | | | | | | | This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done. Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time. Unlike previous extensions of the algorithm to handle more cases, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered *all* insert element instructions, so if we had a value directly based on an insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well. Differential Revision: http://reviews.llvm.org/D12583 llvm-svn: 247210