Commit message | Author | Age | Files | Lines
...
* Revert "Increase stack size for stack-use-after-return test" | Francis Ricci | 2017-03-02 | 1 | -1/+1
| | | | | | | | Reverting due to failures on aarch64 This reverts commit f8ff7e585134196e8482e4dd8752cd4c22cf027a. llvm-svn: 296719
* Revert r296708; causing test failures on ARM hosts. | Eli Friedman | 2017-03-02 | 2 | -37/+25
| | | | | | | | | | | | | | | Original commit message: [ARM] Fix insert point for store rescheduling. In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would sink stores for no reason. llvm-svn: 296718
* Fix various warnings. NFC | Zachary Turner | 2017-03-02 | 12 | -379/+22
| | | | llvm-svn: 296717
* Fix python 3 syntax error in sym_diff | Eric Fiselier | 2017-03-02 | 1 | -1/+1
| | | | llvm-svn: 296716
* Cleanup new/delete definitions | Eric Fiselier | 2017-03-01 | 3 | -67/+71
| | | | | | | | | | | | | | | | | | | | This patch cleans up how libc++abi handles the definitions for new/delete. It is in preparation for upcoming changes to fix how both libc++ and libc++abi handle new/delete. The primary changes in this patch are: * Move the definitions for bad_array_length and bad_new_array_length into stdlib_exception.cpp. This way stdlib_new_delete.cpp only contains new/delete. * Rename cxa_new_delete.cpp -> stdlib_new_delete.cpp for consistency with other files. * Add a FIXME regarding when stdlib_new_delete.cpp is actually compiled as part of the dylib. llvm-svn: 296715
* [Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-03-01 | 6 | -142/+156
| | | | llvm-svn: 296714
* Remove spurious use of LLVM_FALLTHROUGH (NFC) | Paul Robinson | 2017-03-01 | 1 | -43/+17
| | | | llvm-svn: 296713
* Fix Apple-specific XFAIL directive in libc++ test | Mehdi Amini | 2017-03-01 | 1 | -1/+1
| | | | | | | | This test fails with Xcode 7.0 but passes with Xcode 7.3, which shipped an updated clang. This fixes Green Dragon, which runs this configuration. llvm-svn: 296712
* [DAGCombiner] mulhi + 1 never overflow. | Amaury Sechet | 2017-03-01 | 2 | -4/+15
| | | | | | | | | | | | | | | Summary: This can be used to optimize large multiplications after legalization. Depends on D29565 Reviewers: mkuper, spatel, RKSimon, zvi, bkramer, aaboud, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29587 llvm-svn: 296711
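The claim in the title can be checked by brute force at a small bit width. The following is only a sketch, assuming the unsigned high-half multiply is meant; `mulhu` and `max_mulhu` are hypothetical helper names, not LLVM APIs.

```python
def mulhu(a: int, b: int, bits: int) -> int:
    """High half of the unsigned bits x bits -> 2*bits product."""
    return (a * b) >> bits


def max_mulhu(bits: int) -> int:
    """Largest high half over all pairs of unsigned operands (brute force)."""
    limit = 1 << bits
    return max(mulhu(a, b, bits) for a in range(limit) for b in range(limit))


BITS = 8  # small width chosen so the exhaustive search stays cheap
largest = max_mulhu(BITS)
# The largest product is (2^n - 1)^2 = 2^(2n) - 2^(n+1) + 1, whose high half
# is 2^n - 2, so adding 1 still fits in n bits.
fits_after_increment = largest + 1 < (1 << BITS)
```

The same bound holds at any width, because the maximum product never reaches the all-ones high half.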
* [GlobalISel] Add a way for targets to enable GISel. | Ahmed Bougacha | 2017-03-01 | 5 | -4/+74
| | | | | | | | | | | | | | | | | | | | | | | Until now, we've had to use -global-isel to enable GISel. But using that on other targets that don't support it will result in an abort, as we can't build a full pipeline. Additionally, we want to experiment with enabling GISel by default for some targets: we can't just enable GISel by default, even among those targets that do have some support, because the level of support varies. This first step adds an override for the target to explicitly define its level of support. For AArch64, do that using a new command-line option (I know..): -aarch64-enable-global-isel-at-O=<N> Where N is the opt-level below which GISel should be used. Default that to -1, so that we still don't enable GISel anywhere. We're not there yet! While there, remove a couple LLVM_UNLIKELYs. Building the pipeline is such a cold path that in practice that shouldn't matter at all. llvm-svn: 296710
* Improve mulhi overflow test. NFC | Amaury Sechet | 2017-03-01 | 1 | -14/+25
| | | | llvm-svn: 296709
* [ARM] Fix insert point for store rescheduling. | Eli Friedman | 2017-03-01 | 2 | -25/+37
| | | | | | | | | | | | | In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would sink stores for no reason. Differential Revision: https://reviews.llvm.org/D30124 llvm-svn: 296708
* Use pthreads for thread-local lsan allocator cache on darwin | Francis Ricci | 2017-03-01 | 4 | -44/+56
| | | | | | | | | | | | | | | Summary: This patch allows us to move away from using __thread on darwin, which is required for building lsan for darwin on ios version 7 and on iossim i386. Reviewers: kubamracek, kcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29994 llvm-svn: 296707
* Increase stack size for stack-use-after-return test | Francis Ricci | 2017-03-01 | 1 | -1/+1
| | | | | | | | | | | | | | Summary: The current size is flaky, as revealed by checking the stack size attr after setting it. Reviewers: kubamracek, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30267 llvm-svn: 296706
* Add polly to svn:ignore. | Eli Friedman | 2017-03-01 | 0 | -0/+0
| | | | llvm-svn: 296705
* Fix Apple-specific XFAIL directive in libc++ test | Mehdi Amini | 2017-03-01 | 4 | -4/+4
| | | | | | | | | These tests are failing in Xcode 8.0, 8.1, and 8.2, but not in Xcode 8.3. Annoyingly, the version numbering for clang does not follow Xcode and is bumped to 8.1 only in Xcode 8.3. So XFAILing apple-clang-8.0 should catch all cases here. llvm-svn: 296704
* Add missing test dependency. | Peter Collingbourne | 2017-03-01 | 2 | -1/+3
| | | | llvm-svn: 296703
* ELF: Add ThinLTO caching support. | Peter Collingbourne | 2017-03-01 | 7 | -5/+54
| | | | | | | | | | This patch adds an option named --thinlto-cache-dir, which specifies the path to a directory in which to cache native object files for ThinLTO incremental builds. Differential Revision: https://reviews.llvm.org/D30509 llvm-svn: 296702
* [ARM] Check correct instructions for load/store rescheduling. | Eli Friedman | 2017-03-01 | 4 | -17/+143
| | | | | | | | | | | | | | | | | | | | This code starts from the high end of the sorted vector of offsets, and works backwards: it tries to find contiguous offsets, process them, then pops them from the end of the vector. Most of the code agrees with this order of processing, but one loop doesn't: it instead processes elements from the low end of the vector (which are nodes with unrelated offsets). Fix that loop to process the correct elements. This has a few implications. One, we don't incorrectly return early when processing multiple groups of offsets in the same block (which allows rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert point for loads, so they're correctly sorted (which affects the scheduling of vldm-liveness.ll). I think it might also impact some of the heuristics slightly. Differential Revision: https://reviews.llvm.org/D30368 llvm-svn: 296701
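The processing order described in this message (start at the high end of the sorted offsets, gather a contiguous run, pop it, repeat) can be sketched as follows. This is not the code in ARMPreAllocLoadStoreOpt; `group_contiguous_from_high_end` and its `step` parameter are hypothetical names used only to show the traversal order.

```python
def group_contiguous_from_high_end(offsets, step):
    """Split offsets into runs of contiguous values, consuming from the high end.

    Each run is collected by popping from the end of the sorted list, so every
    loop that handles a run must also look at the *end* of the list.
    """
    remaining = sorted(offsets)
    groups = []
    while remaining:
        group = [remaining.pop()]  # highest offset still unprocessed
        # Keep popping while the next-highest offset is exactly one step lower.
        while remaining and group[-1] - remaining[-1] == step:
            group.append(remaining.pop())
        groups.append(group)
    return groups


groups = group_contiguous_from_high_end([0, 4, 8, 32, 36], step=4)
```

Here the first group is taken from the offsets 36 and 32. A loop that instead inspected the low end would look at 0, 4 and 8, which are unrelated to the group being processed.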
* Split GdbIndexBuilder class into non-member functions. | Rui Ueyama | 2017-03-01 | 4 | -125/+85
| | | | | | | | | | | | That class had three member functions, and all of them are just reader methods that did not depend on class members, so they can be just non-member functions. Probably we should reorganize the functions themselves because their return types don't make much sense to me, but for now I just moved these functions out of the class. llvm-svn: 296700
* [DAGCombiner] fold binops with constant into select-of-constants | Sanjay Patel | 2017-03-01 | 7 | -335/+346
| | | | | | | | | | | | | | | | | | This is part of the ongoing attempt to improve select codegen for all targets and select canonicalization in IR (see D24480 for more background). The transform is a subset of what is done in InstCombine's FoldOpIntoSelect(). I first noticed a regression in the x86 avx512-insert-extract.ll tests with a patch that hopes to convert more selects to basic math ops. This appears to be a general missing DAG transform though, so I added tests for all standard binops in rL296621 (PowerPC was chosen semi-randomly; it has scripted FileCheck support, but so do ARM and x86). The poor output for "sel_constants_shl_constant" is tracked with: https://bugs.llvm.org/show_bug.cgi?id=32105 Differential Revision: https://reviews.llvm.org/D30502 llvm-svn: 296699
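The identity behind this fold is that `(select c, C1, C2) op C3` equals `select c, (C1 op C3), (C2 op C3)`, where both new arms are constants that can be computed at compile time. A small sketch with plain integers, using hypothetical helper names rather than anything from the DAGCombiner:

```python
import operator


def select(cond, true_val, false_val):
    """Model of a select node: pick one of two values based on a condition."""
    return true_val if cond else false_val


def fold_binop_into_select(op, true_const, false_const, rhs_const):
    """Pre-compute both arms so the binop disappears from the runtime path."""
    return op(true_const, rhs_const), op(false_const, rhs_const)


# (select c, 5, 9) + 3  ==>  select c, 8, 12
folded_true, folded_false = fold_binop_into_select(operator.add, 5, 9, 3)
```

After the fold, the only operation left at run time is the select itself.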
* [Constant Hoisting] Avoid inserting instructions before EH pads | Reid Kleckner | 2017-03-01 | 2 | -2/+72
| | | | | | | | | | | | | Now that terminators can be EH pads, this code needs to iterate over the immediate dominators of the EH pad to find a valid insertion point. Fix for PR32107 Patch by Robert Olliff! Differential Revision: https://reviews.llvm.org/D30511 llvm-svn: 296698
* [MC] Fix MachineLocation constructor broken in r294685 (NFC). | Eugene Zelenko | 2017-03-01 | 1 | -1/+1
| | | | | | Problem spotted by Frej Drejhammar. llvm-svn: 296697
* Add test case for mulhi's overflow. NFC | Amaury Sechet | 2017-03-01 | 1 | -0/+55
| | | | llvm-svn: 296696
* Remove useless variables and declarations. | Rui Ueyama | 2017-03-01 | 3 | -11/+3
| | | | llvm-svn: 296695
* Replace `auto` with its real type. | Rui Ueyama | 2017-03-01 | 1 | -1/+1
| | | | llvm-svn: 296694
* Make it clear what you should modify when you copy any of these sample | Jim Ingham | 2017-03-01 | 2 | -5/+12
| | | | | | test cases. llvm-svn: 296693
* Add a test to ensure that SBFrame::Disassemble produces some output. | Jim Ingham | 2017-03-01 | 2 | -1/+69
| | | | llvm-svn: 296692
* [DebugInfo] [DWARFv5] Unique abbrevs for DIEs with different implicit_const values | Victor Leschuk | 2017-03-01 | 2 | -2/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Take DW_FORM_implicit_const attribute value into account when profiling DIEAbbrevData. Currently if we have two similar types with implicit_const attributes and different values we end up with only one abbrev in .debug_abbrev section. For example consider two structures: S1 with implicit_const attribute ATTR and value VAL1 and S2 with implicit_const ATTR and value VAL2. The .debug_abbrev section will contain only 1 related record: [N] DW_TAG_structure_type DW_CHILDREN_yes DW_AT_ATTR DW_FORM_implicit_const VAL1 // .... This is incorrect as struct S2 (with VAL2) will use abbrev record with VAL1. With this patch we will have two different abbreviations here: [N] DW_TAG_structure_type DW_CHILDREN_yes DW_AT_ATTR DW_FORM_implicit_const VAL1 // .... [M] DW_TAG_structure_type DW_CHILDREN_yes DW_AT_ATTR DW_FORM_implicit_const VAL2 // .... llvm-svn: 296691
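The uniquing problem can be illustrated with a toy profile key. This is a sketch only: `abbrev_key` and `count_abbrevs` are hypothetical, `DW_AT_ATTR` is the placeholder attribute name from the message, and the values 1 and 2 stand in for VAL1 and VAL2.

```python
def abbrev_key(tag, has_children, attributes, profile_implicit_const=True):
    """Build a hashable key that decides whether two DIEs share an abbreviation."""
    key_parts = [tag, has_children]
    for name, form, value in attributes:
        key_parts.append((name, form))
        # An implicit_const value is stored in the abbreviation itself, so it
        # has to be part of the key; other forms keep their value in the DIE.
        if profile_implicit_const and form == "DW_FORM_implicit_const":
            key_parts.append(value)
    return tuple(key_parts)


def count_abbrevs(dies, profile_implicit_const):
    """Number of distinct abbreviations needed for a list of attribute lists."""
    return len({
        abbrev_key("DW_TAG_structure_type", True, attrs, profile_implicit_const)
        for attrs in dies
    })


S1 = [("DW_AT_ATTR", "DW_FORM_implicit_const", 1)]  # stands in for VAL1
S2 = [("DW_AT_ATTR", "DW_FORM_implicit_const", 2)]  # stands in for VAL2

before_patch = count_abbrevs([S1, S2], profile_implicit_const=False)
after_patch = count_abbrevs([S1, S2], profile_implicit_const=True)
```

With the value left out of the key, S1 and S2 collapse into one abbreviation, which is the bug the message describes. Including it yields two.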
* [DAGCombiner] Remove non-ascii character and reflow comment. | Benjamin Kramer | 2017-03-01 | 1 | -5/+4
| | | | llvm-svn: 296690
* Style fix. | Rui Ueyama | 2017-03-01 | 1 | -8/+5
| | | | llvm-svn: 296689
* Reduce nesting. NFC. | Rui Ueyama | 2017-03-01 | 1 | -7/+8
| | | | llvm-svn: 296688
* Do not inherit LoadedObjectInfo. | Rui Ueyama | 2017-03-01 | 2 | -27/+19
| | | | | | GdbIndexBuilder class inherited LoadedObjectInfo, but that's not necessary. llvm-svn: 296687
* Inline a function that is too short to be an independent function. | Rui Ueyama | 2017-03-01 | 2 | -12/+8
| | | | llvm-svn: 296686
* Generate the test configuration even when LIBCXX_INCLUDE_TESTS=OFF. | Eric Fiselier | 2017-03-01 | 2 | -9/+14
| | | | | | | | | | This patch changes the CMake configuration so that it always generates the test/lit.site.cfg file, even when testing is disabled. This allows users to test libc++ without requiring them to have a full LLVM checkout on their machine. llvm-svn: 296685
* LIU::Query: Query LiveRange instead of LiveInterval; NFC | Matthias Braun | 2017-03-01 | 4 | -42/+45
| | | | | | | | | - We only need the information from the base class, not the additional details in the LiveInterval class. - Spread more `const` - Some code cleanup llvm-svn: 296684
* Elide argument copies during instruction selection | Reid Kleckner | 2017-03-01 | 16 | -91/+689
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Avoids tons of prologue boilerplate when arguments are passed in memory and left in memory. This can happen in a debug build or in a release build when an argument alloca is escaped. This will dramatically affect the code size of x86 debug builds, because X86 fast isel doesn't handle arguments passed in memory at all. It only handles the x86_64 case of up to 6 basic register parameters. This is implemented by analyzing the entry block before ISel to identify copy elision candidates. A copy elision candidate is an argument that is used to fully initialize an alloca before any other possibly escaping uses of that alloca. If an argument is a copy elision candidate, we set a flag on the InputArg. If the target generates loads from a fixed stack object that matches the size and alignment requirements of the alloca, the SelectionDAG builder will delete the stack object created for the alloca and replace it with the fixed stack object. The load is left behind to satisfy any remaining uses of the argument value. The store is now dead and is therefore elided. The fixed stack object is also marked as mutable, as it may now be modified by the user, and it would be invalid to rematerialize the initial load from it. Supersedes D28388 Fixes PR26328 Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29668 llvm-svn: 296683
* New tool: opt-stats.py | Adam Nemet | 2017-03-01 | 3 | -188/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am planning to use this tool to find too noisy (missed) optimization remarks. Long term it may actually be better to just have another tool that exports the remarks into an sqlite database and perform queries like this in SQL. This splits out the YAML parsing from opt-viewer.py into a new Python module optrecord.py. This is the result of the script on the LLVM testsuite: Total number of remarks 714433 Top 10 remarks by pass: inline 52% gvn 24% licm 13% loop-vectorize 5% asm-printer 3% loop-unroll 1% regalloc 1% inline-cost 0% slp-vectorizer 0% loop-delete 0% Top 10 remarks: gvn/LoadClobbered 20% inline/Inlined 19% inline/CanBeInlined 18% inline/NoDefinition 9% licm/LoadWithLoopInvariantAddressInvalidated 6% licm/Hoisted 6% asm-printer/InstructionCount 3% inline/TooCostly 3% gvn/LoadElim 3% loop-vectorize/MissedDetails 2% Beside some refactoring, I also changed optrecords not to use context to access global data (max_hotness). Because of the separate module this would have required splitting context into two. However it's not possible to access the optrecord context from the SourceFileRenderer when calling back to Remark.RelativeHotness. llvm-svn: 296682
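The two rankings shown in this message (share of remarks per pass, and per pass/remark name) amount to a frequency count. The sketch below is not opt-stats.py itself; `top_percentages` is a hypothetical helper, the input list is made up for illustration, and rounding to the nearest percent is an assumption about presentation.

```python
from collections import Counter


def top_percentages(keys, limit=10):
    """Return the `limit` most common keys with their share of the total, in percent."""
    counts = Counter(keys)
    total = sum(counts.values())
    return [(key, round(100 * n / total)) for key, n in counts.most_common(limit)]


# Made-up remarks as (pass name, remark name) pairs, standing in for parsed YAML.
remarks = [
    ("inline", "Inlined"),
    ("inline", "Inlined"),
    ("gvn", "LoadClobbered"),
    ("licm", "Hoisted"),
]

by_pass = top_percentages(pass_name for pass_name, _ in remarks)
by_remark = top_percentages(f"{pass_name}/{name}" for pass_name, name in remarks)
```

Keeping the parsing separate from the counting mirrors the split into a reusable module that the message describes.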
* Re-enable BinaryStreamTest.StreamReaderObject. | Zachary Turner | 2017-03-01 | 1 | -10/+18
| | | | | | | This was failing because I was using memcmp to compare two objects that included padding bytes, which were uninitialized. llvm-svn: 296681
* Unbreak Windows bots. | Rui Ueyama | 2017-03-01 | 1 | -1/+1
| | | | llvm-svn: 296680
* [ScopInfo] Disable memory folding in case it results in multi-disjunct relations | Tobias Grosser | 2017-03-01 | 5 | -2/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multi-disjunct access maps can easily result in inbound assumptions which explode in case of many memory accesses and many parameters. This change reduces compilation time of some larger kernel from over 15 minutes to less than 16 seconds. Interesting is the test case test/ScopInfo/multidim_param_in_subscript.ll which has a memory access [n] -> { Stmt_for_body3[i0, i1] -> MemRef_A[i0, -1 + n - i1] } which requires folding, but where only a single disjunct remains. We can still model this test case even when only using limited memory folding. For people only reading commit messages, here the comment that explains what memory folding is: To recover memory accesses with array size parameters in the subscript expression we post-process the delinearization results. We would normally recover from an access A[exp0(i) * N + exp1(i)] into an array A[][N] the 2D access A[exp0(i)][exp1(i)]. However, another valid delinearization is A[exp0(i) - 1][exp1(i) + N] which - depending on the range of exp1(i) - may be preferable. Specifically, for cases where we know exp1(i) is negative, we want to choose the latter expression. As we commonly do not have any information about the range of exp1(i), we do not choose one of the two options, but instead create a piecewise access function that adds the (-1, N) offsets as soon as exp1(i) becomes negative. For a 2D array such an access function is created by applying the piecewise map: [i,j] -> [i, j] : j >= 0 [i,j] -> [i-1, j+N] : j < 0 After this patch we generate only the first case, except for situations where we can prove the first case to be invalid and can consequently select the second without introducing disjuncts. llvm-svn: 296679
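The piecewise map quoted above can be checked numerically: both cases address the same linear element of an array with inner dimension N. A minimal sketch with hypothetical function names (Polly expresses this with isl maps, not Python):

```python
def fold_access(i, j, n):
    """Apply the piecewise map [i,j] -> [i,j] : j >= 0 ; [i,j] -> [i-1, j+N] : j < 0."""
    if j >= 0:
        return (i, j)
    return (i - 1, j + n)


def linearize(i, j, n):
    """Linear offset of the 2D access A[i][j] into an array A[][N]."""
    return i * n + j


N = 5
# For every subscript pair with -N <= j < N, folding keeps the linear offset
# and yields a non-negative inner subscript.
folding_preserves_offset = all(
    linearize(*fold_access(i, j, N), N) == linearize(i, j, N)
    and fold_access(i, j, N)[1] >= 0
    for i in range(1, 4)
    for j in range(-N, N)
)
```

The second case is what introduces the extra disjunct. Generating only the first case, as the patch does by default, keeps the access relation to a single piece.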
* Don't implement the gdb hash table as a generic in-memory hash table. | Rui Ueyama | 2017-03-01 | 3 | -45/+27
| | | | | | | | | | | | | | | | | | Looks like .gdb.index and its support classes do things that they don't have to or shouldn't do. This patch addresses one of these issues. GdbHashTab class is a hash table class. Just like other in-memory hash tables, that incrementally updates its internal data and resizes buckets as new elements are added so that key lookup is always fast. But that is completely not necessary. Unlike debuggers, we only produce hash tables for .gdb.index and never read them. So incrementally updating a hash table in memory is just a waste of resource and complicates the code. What we should do is to accumulate symbols and then create the final hash table at once. llvm-svn: 296678
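The "accumulate symbols, then build the table once" approach can be sketched as below. This is an illustration only: the CRC32 hash, linear probing, and the 3/4 load factor are assumptions chosen for a self-contained example, not the hash function or layout that the gdb index format specifies, and `build_table` and `lookup` are hypothetical names.

```python
import zlib


def build_table(symbols):
    """Build an open-addressing table in one pass; the size is known up front."""
    unique = list(dict.fromkeys(symbols))  # drop duplicates, keep first-seen order
    size = 1
    while size * 3 < len(unique) * 4:  # smallest power of two with load <= 3/4
        size *= 2
    table = [None] * size
    for name in unique:
        slot = zlib.crc32(name.encode()) & (size - 1)
        while table[slot] is not None:  # linear probing; a free slot always exists
            slot = (slot + 1) & (size - 1)
        table[slot] = name
    return table


def lookup(table, name):
    """Return the slot holding `name`, or None if it is absent."""
    mask = len(table) - 1
    slot = zlib.crc32(name.encode()) & mask
    while table[slot] is not None:
        if table[slot] == name:
            return slot
        slot = (slot + 1) & mask
    return None


table = build_table(["main", "foo", "bar", "foo"])
```

Because the final element count is known before any slot is filled, no resizing or rehashing logic is needed.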
* [APInt] Optimize APInt creation from uint64_t | Craig Topper | 2017-03-01 | 2 | -3/+5
| | | | | | | | | | | | | | | | | Summary: This patch moves the clearUnusedBits calls into the two different initialization paths for APInt from a uint64_t. This allows the compiler to better optimize the clearing of the unused bits for the single word case. And it puts the clearing for the multi word case into the initSlowCase function to save code. In the common case of initializing with 0 this allows the clearing to be completely optimized out for the single word case. On my local x86 build this is showing a ~45kb reduction in the size of the opt binary. Reviewers: RKSimon, hans, majnemer, davide, MatzeB Reviewed By: hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30486 llvm-svn: 296677
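The idea of clearing unused bits separately on the single-word and multi-word paths can be modelled as below. This is a Python model under stated assumptions, not APInt's implementation: `from_u64` is a hypothetical name, words are 64 bits, and the value is zero-extended.

```python
WORD_BITS = 64


def clear_unused_bits(top_word, bit_width):
    """Mask off the bits of the most significant word that lie above bit_width."""
    used = bit_width % WORD_BITS or WORD_BITS  # a full top word uses all 64 bits
    return top_word & ((1 << used) - 1)


def from_u64(value, bit_width):
    """Model of creating an integer of bit_width bits from a 64-bit value."""
    value &= (1 << WORD_BITS) - 1
    if bit_width <= WORD_BITS:
        # Single-word path: one mask, which is a no-op when value is 0.
        return [clear_unused_bits(value, bit_width)]
    # Multi-word path: low word holds the value, the rest are zero.
    num_words = -(-bit_width // WORD_BITS)  # ceiling division
    words = [value] + [0] * (num_words - 1)
    words[-1] = clear_unused_bits(words[-1], bit_width)
    return words


narrow = from_u64(0xFF, 4)
zero = from_u64(0, 32)
wide = from_u64(2**64 - 1, 128)
```

In the common case of a zero initializer on the single-word path, the mask has nothing to clear, which is the situation the message says a compiler can optimize away.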
* LIU::Query: Remove unused getter; NFC | Matthias Braun | 2017-03-01 | 1 | -5/+0
| | | | llvm-svn: 296676
* LIU::Query: Remove always false member+getter; NFC | Matthias Braun | 2017-03-01 | 2 | -7/+0
| | | | llvm-svn: 296675
* LiveIntervalUnion: Remove unused functions; NFC | Matthias Braun | 2017-03-01 | 1 | -6/+0
| | | | | | | Remove two unused functions that are in fact bad API and should not be called anyway. llvm-svn: 296674
* [InstCombine] use -instnamer and auto-generate complete checks; NFC | Sanjay Patel | 2017-03-01 | 1 | -117/+206
| | | | llvm-svn: 296673
* Disable BinaryStreamTest.StreamReaderObject. | Zachary Turner | 2017-03-01 | 1 | -1/+1
| | | | llvm-svn: 296672
* [x86] add vector tests for more coverage of D30502; NFC | Sanjay Patel | 2017-03-01 | 1 | -0/+35
| | | | llvm-svn: 296671
* Improve scheduling with branch coalescing | Nemanja Ivanovic | 2017-03-01 | 7 | -0/+799
| | | | | | | | | | | This patch adds a MachineSSA pass that coalesces blocks that branch on the same condition. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D28249 llvm-svn: 296670