path: root/llvm
Commit message | Author | Age | Files | Lines
...
* [gvn] Fix more indenting and formatting in regions of code that willChandler Carruth2016-03-101-64/+62
| | | | | | | | | | | need to be changed for porting to the new pass manager. Also sink the comment on the ValueTable class back to that class instead of it dangling on an anonymous namespace. No functionality changed. llvm-svn: 263084
* [gvn] Reformat a chunk of the GVN code that is strangely indented priorChandler Carruth2016-03-101-241/+240
| | | | | | | | to restructuring it for porting to the new pass manager. No functionality changed. llvm-svn: 263083
* [PM] Port memdep to the new pass manager.Chandler Carruth2016-03-1015-140/+177
| | | | | | | | | | | | | | | | | | | | | | | This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements. There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though. Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082
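The "nuke the results object each run" idea can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `DepResult`, `DepAnalysis`, and `lookup` are hypothetical names, not the memdep or pass manager API, and the cached "query" is a placeholder computation.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for a per-function analysis result that caches queries.
struct DepResult {
  std::unordered_map<int, int> cache;

  int lookup(int key) {
    auto it = cache.find(key);
    if (it != cache.end())
      return it->second;       // served from the cache
    int value = key * 2;       // placeholder for an expensive dependence query
    cache.emplace(key, value);
    return value;
  }
};

// Hypothetical analysis: every run hands back a brand-new result object,
// so cached state from an earlier run can never be observed by a later one.
struct DepAnalysis {
  DepResult run() { return DepResult{}; }
};
```

The cost is that each run starts with empty maps, which is the extra malloc traffic the message mentions; the benefit is that there is no separate invalidation step to get wrong.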
* [BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAAPhilip Reames2016-03-092-16/+17
| | | | | | | | | | MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA. In principle, this can expose slightly more optimization opportunities, but since essentially all of our aliasing-aware memory optimization passes go through MDA, this will likely be NFC in practice. Differential Revision: http://reviews.llvm.org/D15912 llvm-svn: 263075
* [CGP] Duplicate addressing computation in cold paths if required to sink ↵Philip Reames2016-03-092-8/+241
| | | | | | | | | | | | | | addressing mode This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let the addressing mode be safely sunk into the basic block containing each load and store. In general, duplicating code into cold blocks may result in code growth, but should not affect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirety of the fast path. This patch only handles addressing computations, but in principle, we could implement a more general cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had. Differential Revision: http://reviews.llvm.org/D17652 llvm-svn: 263074
* Fix the buildPhilip Reames2016-03-091-0/+1
| | | | | | I screwed up rebasing 263072. This change fixes the build and passes all make check. llvm-svn: 263073
* [LICM] Store promotion when memory is thread localPhilip Reames2016-03-093-13/+191
| | | | | | | | | | | | This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provably thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us to perform store promotion even in cases where the store is not guaranteed to execute in the loop. Two key assumptions are worth drawing out: this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped. In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem. Differential Revision: http://reviews.llvm.org/D16783 llvm-svn: 263072
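A source-level picture of the transformation may help. The sketch below hand-writes what store promotion does to a conditionally executed store into freshly malloc'ed, never-escaping memory. The function names are made up for illustration, and the real pass works on LLVM IR rather than C++.

```cpp
#include <cassert>
#include <cstdlib>

// Before promotion: the store to *acc runs only on iterations where the
// condition holds, so it is not guaranteed to execute in the loop.
int sumWithConditionalStore(const int *data, int n) {
  int *acc = static_cast<int *>(std::malloc(sizeof(int)));
  if (!acc)
    return 0;
  *acc = 0;
  for (int i = 0; i < n; ++i)
    if (data[i] > 0)
      *acc += data[i];         // conditional load and store inside the loop
  int result = *acc;
  std::free(acc);
  return result;
}

// After promotion (written by hand): the value lives in a local during the
// loop and is stored back once, unconditionally, after the loop. The extra
// store is unobservable because acc never escaped to another thread.
int sumWithPromotedStore(const int *data, int n) {
  int *acc = static_cast<int *>(std::malloc(sizeof(int)));
  if (!acc)
    return 0;
  *acc = 0;
  int promoted = *acc;         // load hoisted out of the loop
  for (int i = 0; i < n; ++i)
    if (data[i] > 0)
      promoted += data[i];     // register-only update
  *acc = promoted;             // single store sunk below the loop
  int result = *acc;
  std::free(acc);
  return result;
}
```

If `acc` were visible to another thread, the unconditional store after the loop could introduce a write that the original program never performed, which is why the thread-local proof matters.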
* [x86] fix cost model inaccuracy for vector memory opsSanjay Patel2016-03-092-6/+6
| | | | | | | | | | | The irony of this patch is that one CPU that is affected is AMD Jaguar, and Jaguar has a completely double-pumped AVX implementation. But getting the cost model to reflect that is a much bigger problem. The small goal here is simply to improve on the lie that !AVX2 == SandyBridge. Differential Revision: http://reviews.llvm.org/D18000 llvm-svn: 263069
* [WebAssembly] Update known gcc test failuresDerek Schuff2016-03-091-3/+0
| | | | llvm-svn: 263068
* [x86, AVX] optimize masked loads with constant masksSanjay Patel2016-03-092-25/+121
| | | | | | | | Instead of a variable-blend instruction, form a blend with immediate because those are always cheaper. Differential Revision: http://reviews.llvm.org/D17899 llvm-svn: 263067
* [ValueTracking] Extract isKnownPositive [NFCI]Philip Reames2016-03-093-2/+21
| | | | | | Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem. llvm-svn: 263062
* [InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0)Philip Reames2016-03-092-0/+47
| | | | | | | | When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min. Differential Revision: http://reviews.llvm.org/D17873 llvm-svn: 263059
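The fold can be checked exhaustively at a small bit width. The sketch below is not the InstCombine code; it simply brute-forces every 8-bit pair with a positive first operand and confirms that comparing the smin against zero gives the same answer as comparing the other operand against zero. The function name is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns true if (smin(a, b) > 0) == (b > 0) for every 8-bit b,
// whenever a is known to be strictly positive.
bool foldHoldsForAllInt8() {
  for (int a = 1; a <= INT8_MAX; ++a) {            // a plays the role of PosA
    for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
      bool original = std::min(a, b) > 0;          // icmp sgt (smin a, b), 0
      bool folded = b > 0;                         // icmp sgt b, 0
      if (original != folded)
        return false;
    }
  }
  return true;
}
```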
* [LLE] Add missing check for unit strideAdam Nemet2016-03-093-6/+58
| | | | | | | | | | I somehow missed this. The case in GCC (global_alloc) was similar to the new testcase except it had an array of structs rather than a two-dimensional array. Fixes PR26885. llvm-svn: 263058
* [AArch64] Minor reformatting (NFC).Evandro Menezes2016-03-091-8/+6
| | | | llvm-svn: 263054
* [llvm-readobj] Enable GNU style section group printHemant Kulkarni2016-03-092-34/+100
| | | | | | Differential Revision: http://reviews.llvm.org/D17822 llvm-svn: 263050
* InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2Matthias Braun2016-03-098-40/+66
| | | | | | | | | | | | | | | | | | As part of r251146 InstCombine was extended to call computeKnownBits on every value in the function to determine whether it happens to be constant. This increases typical compile time by 1-3% (5% in irgen+opt time) in my measurements. On the other hand, this case did not trigger once in the whole llvm-testsuite. This patch introduces the notion of ExpensiveCombines, which are only enabled for OptLevel > 2. I removed the check in InstructionSimplify as that is called from various places where the OptLevel is not known, but given the rarity of the situation I think a check in InstCombine is enough. Differential Revision: http://reviews.llvm.org/D16835 llvm-svn: 263047
* MachineRegisterInfo: Correct commentMatthias Braun2016-03-091-2/+2
| | | | llvm-svn: 263046
* This change adds co-processor condition branching and conditional traps to ↵Chris Dewhurst2016-03-0913-50/+1079
| | | | | | | | | | | | | | | | the Sparc back-end. This will allow inline assembler code to utilize these features, but no automatic lowering is provided, except for the previously provided @llvm.trap, which lowers to "ta 5". The change also separates out the different assembly language syntaxes for V8 and V9 Sparc. Previously, only V9 Sparc assembly syntax was provided. The change also corrects the selection order of trap disassembly, allowing, e.g. "ta %g0 + 15" to be rendered, more readably, as "ta 15", ignoring the %g0 register. This is per the sparc v8 and v9 manuals. Check-in includes many extra unit tests to check this works correctly on both V8 and V9 Sparc processors. Code Reviewed at http://reviews.llvm.org/D17960. llvm-svn: 263044
* add a test RUN to show unexpected behaviorSanjay Patel2016-03-091-7/+10
| | | | llvm-svn: 263037
* [PPC] backend changes to generate xvabs[s,d]p and xvnabs[s,d]p instructionsKit Barton2016-03-092-0/+82
| | | | | | | This has to be committed before the FE changes Phabricator: http://reviews.llvm.org/D17837 llvm-svn: 263035
* Don't crash when compiling inline assembler containing .file directives.Adrian Prantl2016-03-092-3/+68
| | | | | | | | | | Removing the assertion is safe to do because any module level inline assembly is always emitted first via AsmPrinter::doInitialization(). http://reviews.llvm.org/D16101 rdar://22690666 llvm-svn: 263033
* [AArch64] Move helper functions into TII, so they can be reused elsewhere. NFC.Chad Rosier2016-03-093-47/+56
| | | | llvm-svn: 263032
* ReleaseNotes: update 'you may prefer' link to 3.8Hans Wennborg2016-03-091-1/+1
| | | | llvm-svn: 263030
* [AMDGPU] add AMDGPU target support to ELFObjectFile.h headerValery Pykhtin2016-03-095-0/+38
| | | | | | Differential Revision: http://reviews.llvm.org/D17144 llvm-svn: 263026
* [AArch64] Minor cleanup/remove redundant code. NFC.Chad Rosier2016-03-092-12/+8
| | | | llvm-svn: 263024
* SelectionDAG: Fix a crash on inline asm when output register supports ↵Tom Stellard2016-03-092-3/+19
| | | | | | | | | | | | | | | | multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022
* [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.Chad Rosier2016-03-0912-22/+23
| | | | | | http://reviews.llvm.org/D17967 llvm-svn: 263021
* Fix build error due to unsigned compare >= 0 in r263008 (NFC)Teresa Johnson2016-03-091-1/+1
| | | | | | | | | | | | Fixes error from building with clang: /usr/local/google/home/tejohnson/llvm/llvm_15/lib/Target/AMDGPU/InstPrinter/AMDGPUInstPrinter.cpp:407:12: error: comparison of unsigned expression >= 0 is always true [-Werror,-Wtautological-compare] if ((Imm >= 0x000) && (Imm <= 0x0ff)) { ~~~ ^ ~~~~~ llvm-svn: 263014
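The warning arises because an unsigned value can never be below zero, so the lower-bound test carries no information. A minimal sketch of a range check without the tautological half follows; `inLowRange` is a hypothetical helper, and this is not claimed to be the exact one-line change made in this revision.

```cpp
#include <cassert>
#include <cstdint>

// Imm is unsigned, so "Imm >= 0x000" is always true and clang reports it
// under -Wtautological-compare. Only the upper bound needs checking.
bool inLowRange(uint32_t Imm) {
  return Imm <= 0x0ff;
}
```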
* Reland r262337 "calculate builtin_object_size if arg is a removable pointer"Petar Jovanovic2016-03-092-8/+59
| | | | | | | | | | | | | | | | | | Original commit message: calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 Reland the original change with a small modification (first do a null check and then do the cast) to satisfy ubsan. llvm-svn: 263011
* Update comments following the addition of PredicatedScalarEvolution. NFC.Silviu Baranga2016-03-092-9/+10
| | | | | | | | | | We changed several functions in LoopAccessAnalysis to use PSE instead of taking SE and a SCEV predicate as arguments, but didn't update the comments. This also fixes a comment in ScalarEvolution, where we referred to Preds when the argument name was A. llvm-svn: 263009
* [AMDGPU] Assembler: Support DPP instructions.Sam Kolton2016-03-0911-52/+499
| | | | | | | | | | | | | | | | | | | | Support DPP syntax as used in SP3 (except several operands syntax). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008
* [AMDGPU] Assembler: Support abs() syntax.Nikolay Haustov2016-03-092-2/+63
| | | | | | | | | Support legacy SP3 abs(v1) syntax. InstPrinter still uses |v1|. Add tests. Differential Revision: http://reviews.llvm.org/D17887 llvm-svn: 263006
* [AMDGPU] Assembler: Fix s_setpc_b64Nikolay Haustov2016-03-092-3/+3
| | | | | | | | s_setpc_b64 has just one 64-bit source which is the address of instruction to jump to. Differential Revision: http://reviews.llvm.org/D17888 llvm-svn: 263005
* Fix uninitialized member bool. Detected by ASan.Richard Trieu2016-03-091-1/+1
| | | | llvm-svn: 262999
* [LoopDataPrefetch] Add stats and debug outputAdam Nemet2016-03-091-0/+9
| | | | llvm-svn: 262998
* [LAA] Improve comment for isStridedPtrAdam Nemet2016-03-091-2/+5
| | | | llvm-svn: 262997
* [WebAssembly] Update comments about irreducible control flow.Dan Gohman2016-03-092-8/+13
| | | | llvm-svn: 262995
* Use lto_bool_t instead of a raw `bool` (fixup for r262977).Sean Silva2016-03-091-1/+1
| | | | | | | Hopefully this should bring llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast back to life. llvm-svn: 262994
* Fix ThinLTO test: depends on the X86 backendMehdi Amini2016-03-093-0/+3
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262993
* void foo() is not a valid C prototype, one has to write void foo(void)Mehdi Amini2016-03-092-2/+2
| | | | | | | Remove a warning introduced in r262977 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262990
* Return StringRef instead of a naked char*; NFCSanjoy Das2016-03-091-2/+2
| | | | llvm-svn: 262989
* [IRCE] Reflow comments; NFCSanjoy Das2016-03-091-4/+2
| | | | llvm-svn: 262988
* Fix library dependency for llvm-lto after r262977Mehdi Amini2016-03-092-1/+2
| | | | | | It is a transitive dependency, so static builds are OK, but builds with an individual DSO for each LLVM library are not. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262987
* [WebAssembly] Implement irreducible control flow.Dan Gohman2016-03-096-35/+385
| | | | | | | | This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982
* Fix GOLD plugin build after r262976Mehdi Amini2016-03-091-1/+1
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262981
* Remove trailing newline from test case; NFCSanjoy Das2016-03-091-1/+0
| | | | llvm-svn: 262980
* [SCEV] Slightly generalize getRangeViaFactoringSanjoy Das2016-03-092-13/+76
| | | | | | | | | Building on the previous change, this generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign extend or truncate operation, and k0 and k1 are constants. llvm-svn: 262979
* [SCEV] Slightly generalize getRangeViaFactoringSanjoy Das2016-03-092-25/+104
| | | | | | | | This change generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign extend or truncate operation. llvm-svn: 262978
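The factoring idea is that an add recurrence whose start and step are both selects on the same loop-invariant condition behaves like one of two plain recurrences, so its range can be bounded by the union of their ranges. The sketch below models that with closed integer intervals. It is a simplification under stated assumptions: `Range`, `addRecRange`, and `factoredRange` are hypothetical names rather than ScalarEvolution or ConstantRange APIs, there is no overflow or wraparound handling, the extension or truncation step is left out, and a maximum iteration count is assumed to be known.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Closed interval [lo, hi]; a toy stand-in for a constant range.
struct Range {
  int64_t lo;
  int64_t hi;
};

// Range of start + i * step for i in [0, maxIter], assuming no overflow.
Range addRecRange(int64_t start, int64_t step, int64_t maxIter) {
  int64_t last = start + step * maxIter;
  return {std::min(start, last), std::max(start, last)};
}

// Smallest single interval covering both inputs.
Range unionRange(Range a, Range b) {
  return {std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// {C ? A : B, +, C ? P : Q} is either {A,+,P} or {B,+,Q} depending on the
// loop-invariant condition C, so the union of the two ranges is a safe bound.
Range factoredRange(int64_t A, int64_t P, int64_t B, int64_t Q,
                    int64_t maxIter) {
  return unionRange(addRecRange(A, P, maxIter), addRecRange(B, Q, maxIter));
}
```

For example, with starts 0 and 100, steps 1 and 2, and at most 10 iterations, the two recurrences cover [0, 10] and [100, 120], and the combined bound is [0, 120].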
* libLTO: add a ThinLTOCodeGenerator on the model of LTOCodeGenerator.Mehdi Amini2016-03-0913-3/+1444
| | | | | | | | | | | | | | | This is intended to provide a parallel (threaded) ThinLTO scheme for linker plugin use through the libLTO C API. The intent of this patch is to provide a first implementation as a proof-of-concept and to allow linkers to start supporting ThinLTO by defining the libLTO C API. Some parts of the libLTO API are not yet implemented. Following patches will add support for these. The current implementation can link all clang/llvm binaries. Differential Revision: http://reviews.llvm.org/D17066 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262977
* FunctionIndex is not optional for renameModuleForThinLTO(), make it a ↵Mehdi Amini2016-03-094-8/+8
| | | | | | | reference (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262976