Commit message | Author | Age | Files | Lines
...
* [AVX512] Remove space before \t in AsmStrings. | Adam Nemet | 2014-10-01 | 1 | -6/+6
  llvm-svn: 218725
* [x86] Teach the new vector shuffle lowering about VBROADCAST and VPBROADCAST. | Chandler Carruth | 2014-10-01 | 10 | -263/+437
  This has the somewhat expected pervasive impact. I don't know why I forgot about this. Everything seems good with lots of significant improvements in the tests.
  llvm-svn: 218724
* llvm-cov/CoverageReport.cpp: Quick fix for msvcrt, since width specifier "z" is unavailable. | NAKAMURA Takumi | 2014-10-01 | 1 | -12/+12
  Note, mingw uses its own printf instead of msvcrt.
  llvm-svn: 218723
* llvm/test/DebugInfo/X86/gmlt.test: Get rid of %llc_dwarf. It should not be used with -mtriple. | NAKAMURA Takumi | 2014-10-01 | 1 | -2/+1
  Also, remove object-emission. test/DebugInfo/X86 doesn't require it.
  llvm-svn: 218722
* [InstCombine] Optimize icmp-select-icmp | Gerolf Hoflehner | 2014-10-01 | 5 | -9/+277
  In special cases select instructions can be eliminated by replacing them with a cheaper bitwise operation even when the select result is used outside its home block. The instances implemented are patterns like

      %x=icmp.eq
      %y=select %x,%r, null
      %z=icmp.eq|neq %y, null
      br %z,true, false
    ==>
      %x=icmp.ne
      %y=icmp.eq %r,null
      %z=or %x,%y
      br %z,true,false

  The optimization is integrated into the instruction combiner and performed only when all uses of the select result can be replaced by the select operand proper. For this, dominator information is used, and dominance is now a required analysis pass in the combiner.
  The optimization itself is iterative. The critical step is to replace the select result with the non-constant select operand. So the select becomes local and the combiner iteratively works out simpler code patterns and eventually eliminates the select.
  rdar://17853760
  llvm-svn: 218721
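As a rough illustration of why the rewrite in the icmp-select-icmp commit is sound, the sketch below models the "before" and "after" patterns as plain C++ functions over a boolean condition and a pointer. This is not the InstCombine implementation; the names `beforeRewrite` and `afterRewrite` are hypothetical, and only the `icmp eq %y, null` variant is modeled.

```cpp
// Before: %y = select %x, %r, null ; %z = icmp eq %y, null
// (hypothetical model of the original pattern)
bool beforeRewrite(bool x, const int *r) {
  const int *y = x ? r : nullptr;  // the select
  return y == nullptr;             // the second compare
}

// After: the condition is inverted, %r is compared against null directly,
// and the two results are combined with a bitwise or. No select remains.
bool afterRewrite(bool x, const int *r) {
  const bool notX = !x;
  const bool rIsNull = (r == nullptr);
  return notX | rIsNull;
}
```

Checking all four combinations of a true/false condition and a null/non-null pointer shows that both functions agree, which is the property the transformation relies on.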
* Update uninitialized tests to ensure that field initialization has the same coverage as the global checker. | Richard Trieu | 2014-09-30 | 1 | -29/+40
  llvm-svn: 218720
* Omit DW_AT_inline under -gmlt to save a little more space. | David Blaikie | 2014-09-30 | 2 | -2/+2
  llvm-svn: 218719
* [mach-o] Implement -demangle. | Nick Kledzik | 2014-09-30 | 7 | -2/+127
  The darwin linker has the -demangle option which directs it to demangle C++ (and soon Swift) mangled symbol names. Long term we need some Diagnostics object for formatting errors and warnings. But for now we have the Core linker just writing messages to llvm::errs(). So, to enable demangling, I changed the Resolver to call a LinkingContext method on the symbol name.
  To make this more interesting, the demangling code is done via __cxa_demangle() which is part of the C++ ABI, which is only supported on some platforms, so I had to conditionalize the code with the config generated HAVE_CXXABI_H.
  llvm-svn: 218718
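For readers unfamiliar with `__cxa_demangle()`, here is a minimal sketch of how it is typically called. It is not the code from this commit: the helper name `demangleOrOriginal` is hypothetical, and the sketch assumes a platform that provides `<cxxabi.h>` (the commit guards its own use with `HAVE_CXXABI_H` for exactly that reason).

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle; part of the C++ ABI, not available everywhere
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `symbol`, or the
// symbol unchanged when it cannot be demangled.
std::string demangleOrOriginal(const std::string &symbol) {
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with
  // malloc and the caller is responsible for freeing it.
  char *demangled =
      abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr)
    return symbol;  // not a valid mangled name (or allocation failed)
  std::string result(demangled);
  std::free(demangled);
  return result;
}
```

Falling back to the original string keeps diagnostics usable for C symbols and for anything else the demangler rejects.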
* Enable both C and C++ modules with -fmodules, by switching -fcxx-modules to being on by default. | Richard Smith | 2014-09-30 | 3 | -19/+16
  -fno-cxx-modules can still be used to enable C modules but not C++ modules, but C++ modules are not significantly less stable than C modules any more.
  Also remove some of the scare words from the modules documentation. We're certainly not going to remove modules support (though we might change the interface), and it works well enough to bootstrap and build lots of non-trivial code. Note that this does not represent a commitment to the current interface nor implementation, and we still intend to follow whatever direction the C and C++ committees take regarding modules support.
  llvm-svn: 218717
* [compiler-rt] Re-enable the use of -gmlt for ASan tests on Darwin | Kuba Brecka | 2014-09-30 | 1 | -6/+3
  The optimization for -gmlt/-gline-tables-only introduced in r218129 happened to break on Darwin and produce no line number information due to an incompatibility with dsymutil. ASan tests have been failing because of that and we disabled the use of -gmlt for the tests in r218545.
  This patch re-enables the use of -gmlt, because we have conditionally disabled the incompatible optimization in LLVM, so -gmlt now works on Darwin. Once Darwin's dsymutil is modified to allow this optimization, we can re-enable the optimization in LLVM.
  llvm-svn: 218716
* Update -Wuninitialized to be stricter on CK_NoOp casts. | Richard Trieu | 2014-09-30 | 2 | -5/+26
  llvm-svn: 218715
* [BasicAA] Make better use of zext and sign information | Hal Finkel | 2014-09-30 | 3 | -2/+95
  Two related things:
  1. Fixes a bug when calculating the offset in GetLinearExpression. The code previously used zext to extend the offset, so negative offsets were converted to large positive ones.
  2. Enhance aliasGEP to deduce that, if the difference between two GEP allocations is positive and all the variables that govern the offset are also positive (i.e. the offset is strictly after the higher base pointer), then locations that fit in the gap between the two base pointers are NoAlias.
  Patch by Nick White!
  llvm-svn: 218714
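The first point of the BasicAA commit (zero-extending a negative offset turns it into a large positive one) is easy to reproduce outside LLVM. The sketch below is only an illustration with fixed 32-bit and 64-bit widths; the function names are hypothetical and it does not use LLVM's APInt or GetLinearExpression.

```cpp
#include <cstdint>

// Zero-extension: the 32-bit pattern is treated as unsigned and widened,
// so the upper 32 bits of the result are always zero.
int64_t zeroExtend(int32_t offset) {
  return static_cast<int64_t>(static_cast<uint32_t>(offset));
}

// Sign-extension: the sign bit is replicated into the upper 32 bits,
// so negative offsets stay negative.
int64_t signExtend(int32_t offset) {
  return static_cast<int64_t>(offset);
}
```

For an offset of -4, `zeroExtend` yields 4294967292 (2^32 - 4) while `signExtend` yields -4; for non-negative offsets the two agree.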
* DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot. | David Blaikie | 2014-09-30 | 2 | -7/+5
  No functional change. Pre-emptive refactoring before I start pushing some of this subprogram creation down into DWARFCompileUnit so I can build different subprograms in the skeleton unit from the dwo unit for adding -gmlt-like data to the skeleton.
  llvm-svn: 218713
* MSBuild integration: fix the loop in install.bat | Hans Wennborg | 2014-09-30 | 2 | -8/+18
  It would previously not continue the platforms loop unless it could find the latest toolset directory.
  llvm-svn: 218712
* [SimplifyCFG] threshold for folding branches with common destination | Jingyue Wu | 2014-09-30 | 5 | -75/+127
  Summary:
  This patch adds a threshold that controls the number of bonus instructions allowed for folding branches with common destination. The original code allows at most one bonus instruction. With this patch, users can customize the threshold to allow multiple bonus instructions. The default threshold is still 1, so that the code behaves the same as before when users do not specify this threshold.

  The motivation of this change is that tuning this threshold significantly (up to 25%) improves the performance of some CUDA programs in our internal code base. In general, branch instructions are very expensive for GPU programs. Therefore, it is sometimes worth trading more arithmetic computation for straighter control flow. Here's a reduced example:

      __global__ void foo(int a, int b, int c, int d, int e, int n,
                          const int *input, int *output) {
        int sum = 0;
        for (int i = 0; i < n; ++i)
          sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
        *output = sum;
      }

  The select statement in the loop body translates to two branch instructions "if ((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination. With the default threshold, SimplifyCFG is unable to fold them, because computing the condition of the second branch "(i | c) ^ d > e" requires two bonus instructions. With the threshold increased, SimplifyCFG can fold the two branches so that the loop body contains only one branch, making the code conceptually look like:

      sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];

  Increasing the threshold significantly improves the performance of this particular example. In the configuration where both conditions are guaranteed to be true, increasing the threshold from 1 to 2 improves the performance by 18.24%. Even in the configuration where the first condition is false and the second condition is true, which favors shortcuts, increasing the threshold from 1 to 2 still improves the performance by 4.35%.

  We are still looking for a good threshold and maybe a better cost model than just counting the number of bonus instructions. However, according to the above numbers, we think it is at least worth adding a threshold to enable more experiments and tuning. Let me know what you think. Thanks!

  Test Plan: Added one test case to check the threshold is in effect
  Reviewers: nadav, eliben, meheff, resistor, hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, llvm-commits
  Differential Revision: http://reviews.llvm.org/D5529
  llvm-svn: 218711
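To make the two forms from the SimplifyCFG commit concrete, here is a host-side C++ sketch of the reduced example (not the CUDA kernel itself, and not anything SimplifyCFG emits). The function names and the use of `std::vector` are assumptions made for illustration. It shows that the short-circuit version and the "folded" version with a single combined condition compute the same sum; the difference the commit cares about is only the number of branches.

```cpp
#include <vector>

// Short-circuit form: the second comparison is only evaluated when the
// first one is true, which maps to two conditional branches.
int sumShortCircuit(int a, int b, int c, int d, int e,
                    const std::vector<int> &input) {
  int sum = 0;
  for (int i = 0; i < static_cast<int>(input.size()); ++i)
    sum += (((i ^ a) > b) && (((i | c) ^ d) > e)) ? 0 : input[i];
  return sum;
}

// Folded form: both comparisons are always evaluated and combined with a
// bitwise and, so a single branch (or select) decides the result.
int sumFolded(int a, int b, int c, int d, int e,
              const std::vector<int> &input) {
  int sum = 0;
  for (int i = 0; i < static_cast<int>(input.size()); ++i) {
    const bool first = (i ^ a) > b;
    const bool second = ((i | c) ^ d) > e;
    sum += (first & second) ? 0 : input[i];
  }
  return sum;
}
```

The folded form is valid here because evaluating the second comparison has no side effects; the extra arithmetic it costs corresponds to the "bonus instructions" that the new threshold counts.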
* [x86] Add AVX1 and AVX2 testing to all of the 128-bit shuffle test cases. | Chandler Carruth | 2014-09-30 | 4 | -375/+855
  While clearly we don't need the AVX vector width, these ISA extensions often cause us to select different instructions and we should cover them even with the narrow vector width.
  Also, while here, nuke the stress_test2 contents. There is no reason to try to FileCheck this entire body when it is mostly a test for successfully surviving the code generator.
  llvm-svn: 218710
* [x86] Update the exact FileCheck syntax of the 256-bit and 512-bit shuffle tests to match that used in the script I posted and now used consistently in 128-bit tests. | Chandler Carruth | 2014-09-30 | 5 | -1961/+1962
  Nothing interesting changing here, just using the label name as the FileCheck label and a slightly more general comment marker consumption strategy.
  llvm-svn: 218709
* Adjust test case addition in r218702 so as not to fail when the X86 target isn't built. | David Blaikie | 2014-09-30 | 3 | -2/+5
  llvm-svn: 218708
* [x86] Rework all of the 128-bit vector shuffle tests with my handy test updating script so that they are more thorough and consistent. | Chandler Carruth | 2014-09-30 | 4 | -1222/+2541
  Specific fixes here include:
  - Actually test VEX-encoded AVX mnemonics.
  - Actually use an SSE 4.1 run to test SSE 4.1 features!
  - Correctly check instruction sequences from the start of the function.
  - Elide the shuffle operands and comment designator in a consistent way.
  - Test all of the architectures instead of just the ones I was motivated to manually author.
  I've gone back through and fixed up any egregious issues I spotted. Let me know if I missed something you really dislike.
  One downside to this is that we're now not as diligently using FileCheck variables for registers. I would be much more concerned with this if we had larger register usage, but there just aren't that interesting of register choices here and most of the registers are constrained by the ABI. Ultimately, I don't think this is likely to be the maintenance burden for these tests and updating them again should be straightforward.
  llvm-svn: 218707
* [PECOFF] Fix /entry option. | Rui Ueyama | 2014-09-30 | 3 | -3/+22
  This is yet another edge case of ambiguous name resolution. When a symbol is specified with /entry:SYM, SYM may be resolved to the C++ mangled function name (?SYM@@YAXXZ).
  llvm-svn: 218706
* [PECOFF] Move helper function out of class | Rui Ueyama | 2014-09-30 | 3 | -33/+55
  No functionality change intended.
  llvm-svn: 218705
* [mach-o] add file comment to compact unwind pass | Tim Northover | 2014-09-30 | 1 | -1/+3
  llvm-svn: 218704
* [mach-o] create __unwind_info section on x86_64 | Tim Northover | 2014-09-30 | 17 | -13/+669
  This is a minimally useful pass to construct the __unwind_info section in a final object from the various __compact_unwind inputs. Currently it doesn't produce any compressed pages, only works for x86_64 and will fail if any function ends up without __compact_unwind.
  rdar://problem/18208653
  llvm-svn: 218703
* Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil. | David Blaikie | 2014-09-30 | 3 | -3/+24
  r218129 omits DW_TAG_subprograms which have no inlined subroutines when emitting -gmlt data. This makes -gmlt very low cost for -O0 builds.
  Darwin's dsymutil reasonably considers a CU empty if it has no subprograms (which occurs with the above optimization in -O0 programs without any force_inline function calls) and drops the line table, CU, and everything in this situation, making backtraces impossible.
  Until dsymutil is modified to account for this, disable this optimization on Darwin to preserve the desired functionality. (see r218545, which should be reverted after this patch, for other discussion/details)
  Footnote: In the long term, it doesn't look like this scheme (of simplified debug info to describe inlining to enable backtracing) is tenable, it is far too size inefficient for optimized code (the DW_TAG_inlined_subprograms, even once compressed, are nearly twice as large as the line table itself (also compressed)) and we'll be considering things like Cary's two level line table proposal to encode all this information directly in the line table.
  llvm-svn: 218702
* Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC. | Sanjay Patel | 2014-09-30 | 1 | -60/+62
  llvm-svn: 218700
* Not all processes have a Dynamic Loader. Be sure to check that it exists before using it. | Jim Ingham | 2014-09-30 | 1 | -1/+4
  <rdar://problem/18491391>
  llvm-svn: 218699
* Split the estimate() interface into separate functions for each type. NFC. | Sanjay Patel | 2014-09-30 | 4 | -34/+61
  It was hacky to use an opcode as a switch because it won't always match (rsqrte != sqrte), and it looks like we'll need to add more special casing per arch than I had hoped for. E.g., x86 will prefer a different NR estimate implementation. ARM will want to use its 'step' instructions. There also don't appear to be any new estimate instructions in any arch in a long, long time. Altivec vloge and vexpte may have been the first and last in that field...
  llvm-svn: 218698
* InstrProf: Remove an unused member (NFC) | Justin Bogner | 2014-09-30 | 1 | -6/+3
  llvm-svn: 218697
* [PECOFF] Allow /export:<symbol>,PRIVATE. | Rui Ueyama | 2014-09-30 | 2 | -4/+8
  PRIVATE option is also an undocumented feature.
  llvm-svn: 218696
* [PECOFF] Fix /export option. | Rui Ueyama | 2014-09-30 | 2 | -3/+24
  MSDN doesn't mention the /export:foo=bar style option, but it turned out MSVC link.exe actually accepts it. So we need that too.
  It also means that the export directive in the module definition file and the /export command line option are functionally equivalent.
  llvm-svn: 218695
* Avoid a crash after loading an #undef'd macro in code completion | Ben Langmuir | 2014-09-30 | 5 | -2/+18
  In code-completion, don't assume there is a MacroInfo for everything, since we aren't serializing the def corresponding to a later #undef in the same module. Also setup the HadMacro bit correctly for undefs to avoid an assertion failure.
  rdar://18416901
  llvm-svn: 218694
* Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ. | Juergen Ributzka | 2014-09-30 | 2 | -54/+241
  Note: This version fixed an issue with the TBZ/TBNZ instructions that were generated in FastISel. The issue was that the 64bit version of TBZ (TBZX) automagically sets the upper bit of the immediate field that is used to specify the bit we want to test. To test for any of the lower 32bits we have to first extract the subregister and use the 32bit version of the TBZ instruction (TBZW).
  Original commit message:
  Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ).
  llvm-svn: 218693
* R600/SI: Fix printing of clamp and omod | Matt Arsenault | 2014-09-30 | 9 | -32/+70
  No tests for omod since nothing uses it yet, but this should get rid of the remaining annoying trailing zeros after some instructions.
  llvm-svn: 218692
* R600/SI: Update VOP3b to not include obsolete operands | Matt Arsenault | 2014-09-30 | 3 | -15/+16
  abs / neg are now part of the srcN_modifiers operands
  llvm-svn: 218691
* [PECOFF] Fix __imp_ prefix on x64. | Rui Ueyama | 2014-09-30 | 2 | -12/+20
  "__imp_" prefix always starts with double underscores. When I was writing the original code I misunderstood that it's "_imp_" on x64.
  llvm-svn: 218690
* clang-format: [JS] Support AllowShortFunctionsOnASingleLine. | Daniel Jasper | 2014-09-30 | 3 | -3/+73
  Specifically, this also counts for stuff like (with style "inline"):

      var x = function() { return 1; };

  llvm-svn: 218689
* CUDA: mark the target of implicit intrinsics properly | Eli Bendersky | 2014-09-30 | 2 | -0/+16
  r218624 implemented target inference for implicit special members. However, other entities can be implicit - for example intrinsics. These can not have inference running on them, so they should be marked host device as before. This is the safest and most flexible setting, since by construction these functions don't invoke anything, and we'd like them to be invokable from both host and device code. LLVM's intrinsics definitions (where these intrinsics come from in the case of CUDA/NVPTX) have no notion of target, so both host and device intrinsics can be supported this way.
  llvm-svn: 218688
* Add SBThreadPlan to this CMakeLists.txt as well. | Jim Ingham | 2014-09-30 | 1 | -0/+1
  llvm-svn: 218687
* thread state coordinator: add additional assert missing from previous test check-in. | Todd Fiala | 2014-09-30 | 1 | -0/+1
  llvm-svn: 218686
* Fix FreeBSD build. | Zachary Turner | 2014-09-30 | 1 | -6/+6
  llvm-svn: 218685
* Fixup some minor issues with HostProcess. | Zachary Turner | 2014-09-30 | 3 | -0/+12
  llvm-svn: 218684
* thread state coordinator: add test to be explicit about resume behavior in presence of deferred stop notification still pending. | Todd Fiala | 2014-09-30 | 3 | -9/+137
  There is a state transition that seems potentially buggy that I am capturing and logging here, and including an explicit test to demonstrate expected behavior. See new test for detailed description. Added logging around this area since, if we hit it, we may have a usage bug, or a new state transition we really need to investigate.
  This is around this scenario:
  Thread C deferred stop notification awaiting thread A and thread B to stop.
  Thread A stops.
  Thread A requests resume.
  Thread B stops.
  Here we will explicitly signal the deferred stop notification after thread B stops even though thread A is now resumed. Copious logging happens here.
  llvm-svn: 218683
* Extend C disassembler API to allow specifying target features | Bradley Smith | 2014-09-30 | 4 | -26/+66
  llvm-svn: 218682
* Add numeric extend, truncate to mips fast-isel | Reed Kotler | 2014-09-30 | 3 | -5/+368
  Summary:
  Add numeric extend, truncate to mips fast-isel
  Reactivates D4827
  Test Plan:
  fpext.ll
  loadstoreconv.ll
  Reviewers: dsanders
  Subscribers: mcrosier
  Differential Revision: http://reviews.llvm.org/D5251
  llvm-svn: 218681
* [AArch64] Remove unnecessary whitespace. (Test commit) | Tom Coxon | 2014-09-30 | 1 | -2/+2
  llvm-svn: 218680
* Fix cmake build for new thread plan files. | Todd Fiala | 2014-09-30 | 2 | -0/+2
  llvm-svn: 218679
* [DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle. | Andrea Di Biagio | 2014-09-30 | 1 | -4/+4
  Currently, the DAG Combiner only tries to convert type-legal build_vector nodes into shuffles. This patch simply moves the logic that checks if a build_vector has a legal value type up before we even start analyzing the operands. This allows an early exit from method 'visitBUILD_VECTOR' if the node type is known to be illegal.
  No functional change intended.
  llvm-svn: 218677
* Revert r218673 'llvm-cov: add test for report's function & file association.' | Alex Lorenz | 2014-09-30 | 4 | -32/+0
  Test causes buildbot failures.
  llvm-svn: 218676
* [UBsan] Disable summary.cpp on Darwin. The test requires ubsan-asan, which does not work yet. | Alexander Potapenko | 2014-09-30 | 1 | -0/+2
  llvm-svn: 218675
* [asan] XFAIL one test on Android. | Evgeniy Stepanov | 2014-09-30 | 1 | -0/+3
  And add a missing return in main, just in case.
  llvm-svn: 218674