path: root/llvm
Commit message | Author | Age | Files | Lines
...
* R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment | Tom Stellard | 2014-08-22 | 5 | -4/+116
    llvm-svn: 216279
* R600/SI: Use a ComplexPattern for DS loads and stores | Tom Stellard | 2014-08-22 | 8 | -120/+169
    llvm-svn: 216278
* R600/SI: Wrap local memory pointer in AssertZExt on SI | Tom Stellard | 2014-08-22 | 1 | -0/+12
    These pointers are really just offsets, and they will always fit in 16 bits. Using AssertZExt allows us to use computeKnownBits to prove that these values are positive. We will use this information in a later commit.
    llvm-svn: 216277
* R600/SI: Use correct helper class for DS_WRITE2 instructions | Tom Stellard | 2014-08-22 | 1 | -1/+1
    DS_1A uses a single offset encoding, so offset1 wasn't being encoded.
    llvm-svn: 216276
* [ARM] Move the implementation of the target hooks related to copy-related instructions from ARMInstrInfo to ARMBaseInstrInfo. | Quentin Colombet | 2014-08-22 | 5 | -116/+118
    That way, Thumb mode can also benefit from the advanced copy optimization.
    <rdar://problem/12702965>
    llvm-svn: 216274
* InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants | David Majnemer | 2014-08-22 | 2 | -6/+24
    Consider:
      %add = add nuw i32 %a, -16777216
      %and = and i32 %add, 255

    Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nuw' from the instruction.
    llvm-svn: 216273
* InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN | David Majnemer | 2014-08-22 | 4 | -1/+38
    We can preserve nsw during this transform if -C won't overflow.
    llvm-svn: 216269
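Why INT_MIN has to be excluded can be sketched in plain C++. This is only an illustration under stated assumptions: the helper names below are hypothetical (they are not LLVM APIs), and 64-bit arithmetic stands in for the signed-overflow check that 'nsw' expresses.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: -C is representable in i32 unless C is INT_MIN.
bool canNegateWithoutOverflow(int32_t C) {
  return C != std::numeric_limits<int32_t>::min();
}

// Hypothetical helper: does a 64-bit result fit in a signed 32-bit integer?
bool fitsInI32(int64_t V) {
  return V >= std::numeric_limits<int32_t>::min() &&
         V <= std::numeric_limits<int32_t>::max();
}

// Signed-overflow checks for i32 sub and add, done by widening to 64 bits.
bool subOverflows(int32_t X, int32_t C) { return !fitsInI32(int64_t(X) - C); }
bool addOverflows(int32_t X, int32_t C) { return !fitsInI32(int64_t(X) + C); }

// True when "sub X, C" and "add X, -C" overflow under the same condition
// for this particular X. Only meaningful when -C is representable.
bool sameOverflowBehaviour(int32_t X, int32_t C) {
  assert(canNegateWithoutOverflow(C));
  return subOverflows(X, C) == addOverflows(X, -C);
}
```

For C equal to INT_MIN the negated constant wraps back to INT_MIN, so for example `subOverflows(-1, INT_MIN)` is false while `addOverflows(-1, INT_MIN)` is true, which is why the flag cannot be kept in that case.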
* [Support] Fix the overflow bug in ULEB128 decoding. | Alex Lorenz | 2014-08-22 | 2 | -1/+2
    Differential Revision: http://reviews.llvm.org/D5029
    llvm-svn: 216268
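The commit message does not describe the bug itself. The following is a minimal sketch of ULEB128 decoding, assuming the overflow is the common pitfall of shifting a 7-bit group in a type narrower than 64 bits. `decodeULEB128Sketch` is a hypothetical name and is not the LLVM Support function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not the LLVM implementation.
// Decodes an unsigned LEB128 value starting at Bytes. The caller must
// guarantee that the encoding terminates inside the buffer (a byte with the
// high bit clear) and uses at most 10 bytes, so the shift stays below 64.
uint64_t decodeULEB128Sketch(const uint8_t *Bytes, size_t *Length = nullptr) {
  const uint8_t *P = Bytes;
  uint64_t Value = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    Byte = *P++;
    // Widen to 64 bits before shifting. Shifting (Byte & 0x7f) as a plain
    // int would lose or corrupt bits once Shift gets large.
    Value |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  if (Length)
    *Length = size_t(P - Bytes);
  return Value;
}
```

A value such as 2^35, encoded as five 0x80 bytes followed by 0x01, exercises the part of the range that a 32-bit intermediate could not represent.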
* [mips] Don't use odd-numbered float registers for double arguments for fastcc calling convention if FP is 64-bit and +nooddspreg is used. | Sasa Stankovic | 2014-08-22 | 2 | -2/+88
    Differential Revision: http://reviews.llvm.org/D4981.diff
    llvm-svn: 216262
* InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants | David Majnemer | 2014-08-22 | 2 | -1/+21
    Consider:
      %add = add nsw i32 %a, -16777216
      %and = and i32 %add, 255

    Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nsw' from the instruction.

    This fixes PR20377.
    llvm-svn: 216261
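The example above can be checked numerically. The sketch below is plain C++ rather than LLVM code, and both helper names are hypothetical. It shows that the two constants agree in the low 8 bits that the 'and' observes, yet differ in whether the signed add overflows.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: would a signed i32 add of A and C overflow?
bool signedAddOverflows(int32_t A, int32_t C) {
  int64_t Wide = int64_t(A) + int64_t(C);
  return Wide < std::numeric_limits<int32_t>::min() ||
         Wide > std::numeric_limits<int32_t>::max();
}

// Low 8 bits of the wrapping 32-bit sum, i.e. what "and i32 %add, 255" sees.
uint32_t lowByteOfSum(int32_t A, int32_t C) {
  return (uint32_t(A) + uint32_t(C)) & 255u;
}
```

With `%a` equal to 16777216, adding -16777216 does not overflow, but adding 2130706432 does. Keeping 'nsw' on the rewritten add would therefore claim something that is no longer true for that input.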
* fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions. | Erik Eckstein | 2014-08-22 | 2 | -0/+48
    In unreachable blocks it's legal to have instructions like "%x = op %x". Such instructions are not schedulable. Therefore the SLPVectorizer has to check for unreachable blocks and ignore them.
    Fixes bug 20646.
    llvm-svn: 216256
* [dfsan] Fix non-determinism bug in non-zero label check annotator. | Peter Collingbourne | 2014-08-22 | 2 | -15/+16
    We now use a std::vector instead of a DenseSet to store the list of label checks so that we can iterate over it deterministically.
    llvm-svn: 216255
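The idea of trading a hashed set for a sequence with a stable iteration order can be sketched as follows. `OrderedWorklist` is a hypothetical class, not code from the patch, and the deduplication step is an assumption: the message only says that a std::vector is used.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container: remembers each item once, in first-insertion
// order, so iteration does not depend on hashing or pointer values.
template <typename T> class OrderedWorklist {
  std::vector<T> Items;

public:
  // Returns true if Item was newly added, false if it was already present.
  bool insert(const T &Item) {
    if (std::find(Items.begin(), Items.end(), Item) != Items.end())
      return false;
    Items.push_back(Item);
    return true;
  }

  const std::vector<T> &items() const { return Items; }
};
```

The linear search keeps the sketch short. For large lists a side set used only for membership tests would avoid the quadratic cost while iteration still goes over the vector.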
* ValueTracking: Figure out more bits when looking at add/sub | David Majnemer | 2014-08-22 | 2 | -66/+51
    Given something like X01XX + X01XX, we know that the result must look like X1XXX.
    Adapted from a patch by Richard Smith, test-case written by me.
    llvm-svn: 216250
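The X01XX + X01XX claim can be confirmed by brute force over all 5-bit values. This is a standalone check with hypothetical helper names and does not use the ValueTracking code. The reasoning behind it: both operands have bit 2 set, so bit 2 always carries into bit 3, and both operands have bit 3 clear, so bit 3 of the sum is exactly that carry.

```cpp
#include <cassert>
#include <cstdint>

// Does V match the 5-bit pattern X01XX (bit 3 clear, bit 2 set)?
bool matchesX01XX(uint32_t V) { return (V & 0x0Cu) == 0x04u; }

// Brute-force check: for every pair of 5-bit values matching X01XX,
// is bit 3 of the sum always set (pattern X1XXX)?
bool sumAlwaysHasBit3Set() {
  for (uint32_t A = 0; A < 32; ++A)
    for (uint32_t B = 0; B < 32; ++B)
      if (matchesX01XX(A) && matchesX01XX(B) && ((A + B) & 0x08u) == 0)
        return false;
  return true;
}
```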
* SROA: Handle a case of store size being smaller than allocation size | Reid Kleckner | 2014-08-22 | 2 | -5/+54
    In this case, we are creating an x86_fp80 slice for a union from C where the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes, and that's just fine. We can't, however, use regular loads and stores to access the slice, because the store size is only 10 bytes / 80 bits. Instead, use memcpy and memset.
    Fixes PR18726.
    Reviewed By: chandlerc
    Differential Revision: http://reviews.llvm.org/D5012
    llvm-svn: 216248
* Revert "X86: Align the stack on word boundaries in LowerFormalArguments()" | Duncan P. N. Exon Smith | 2014-08-21 | 2 | -8/+2
    This (mostly) reverts commit r216119. Somewhere during the review Reid committed r214980 which fixed this another way, and I neglected to check that the testcase still failed before committing. I've left test/CodeGen/X86/aligned-variadic.ll around in case it adds extra coverage.
    llvm-svn: 216246
* Add an explicit move constructor to SrcBuffer | Reid Kleckner | 2014-08-21 | 1 | -0/+5
    MSVC can't synthesize the implicit one. Instead it tries to emit a copy ctor which would call the deleted copy ctor of unique_ptr.
    llvm-svn: 216244
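A hand-written move constructor for a type that owns a std::unique_ptr might look like the sketch below. `OwnedBuffer` and `makeBuffer` are hypothetical stand-ins. This is not LLVM's SrcBuffer and its members are invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a buffer-owning record; not LLVM's SrcBuffer.
struct OwnedBuffer {
  std::unique_ptr<std::string> Data;
  int IncludeLine = 0;

  OwnedBuffer() = default;

  // Spelled out by hand for compilers that do not generate a move
  // constructor implicitly. Ownership of Data moves to the new object.
  OwnedBuffer(OwnedBuffer &&Other)
      : Data(std::move(Other.Data)), IncludeLine(Other.IncludeLine) {}
};

// Returning by value relies on the move constructor (or copy elision);
// the copy constructor is unavailable because unique_ptr is not copyable.
OwnedBuffer makeBuffer(const std::string &Text, int Line) {
  OwnedBuffer B;
  B.Data.reset(new std::string(Text));
  B.IncludeLine = Line;
  return B;
}
```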
* [FastISel][AArch64] Add support for variable shift. | Juergen Ributzka | 2014-08-21 | 2 | -44/+253
    This adds the missing variable shift support for value types i8, i16, and i32.
    This fixes <rdar://problem/18095685>.
    llvm-svn: 216242
* Minor refactor to make applying patches from 'Add a "probe-stack" attribute' review thread out of order easier. | Philip Reames | 2014-08-21 | 1 | -1/+5
    llvm-svn: 216241
* Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks. | David Blaikie | 2014-08-21 | 16 | -52/+47
    Somewhat unnoticed in the original implementation of discriminators, but it could cause instructions to end up in new, small, DW_TAG_lexical_blocks due to the use of DILexicalBlock to track discriminator changes.

    Instead, use DILexicalBlockFile which we already use to track file changes without introducing new scopes, so it works well to track discriminator changes in the same way.
    llvm-svn: 216239
* name change: isPow2DivCheap -> isPow2SDivCheap | Sanjay Patel | 2014-08-21 | 5 | -12/+12
    isPow2DivCheap

    That name doesn't specify signed or unsigned.

    Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong:

      srl/add/sra

    is not the general sequence for signed integer division by power-of-2. We need one more 'sra':

      sra/srl/add/sra

    That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case.

    This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods.

    No functional change intended.

    Differential Revision: http://reviews.llvm.org/D5010
    llvm-svn: 216237
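The four-step sequence can be written out in C++ to see what each shift contributes. This is a sketch with a hypothetical function name, not the DAGCombiner code. It assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is guaranteed from C++20 and is the usual behaviour on mainstream compilers before that.

```cpp
#include <cassert>
#include <cstdint>

// Signed division of X by 2^K (1 <= K <= 31), rounding toward zero,
// written as the sra/srl/add/sra sequence.
int32_t sdivByPow2(int32_t X, unsigned K) {
  assert(K >= 1 && K <= 31);
  int32_t Sign = X >> 31;                     // sra: 0, or -1 (all ones) if X < 0
  uint32_t Bias = uint32_t(Sign) >> (32 - K); // srl: 2^K - 1 if X < 0, else 0
  int32_t Adjusted = X + int32_t(Bias);       // add: bias negative inputs
  return Adjusted >> K;                       // sra: the division itself
}
```

For K equal to 1 the bias is just the sign bit, which a single logical shift of X by 31 already yields. That matches the remark that the first 'sra' is only removable when dividing by exactly 2.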
* [PeepholeOptimizer] Enable the advanced copy optimization by default. | Quentin Colombet | 2014-08-21 | 1 | -1/+1
    The advanced copy optimization does not yield any difference on the whole llvm test-suite + SPECs, either in compile time or runtime (binaries are identical), but has a big potential when data go back and forth between register files as demonstrated with test/CodeGen/ARM/adv-copy-opt.ll.

    Note: This was measured for both Os and O3 for armv7s, arm64, and x86_64.

    <rdar://problem/12702965>
    llvm-svn: 216236
* Whitespace change to reduce diff in future patch. | Philip Reames | 2014-08-21 | 1 | -6/+6
    Patch 2 of 11 in 'Add a "probe-stack" attribute' review thread
    Patch by: john.kare.alsaker@gmail.com
    llvm-svn: 216235
* [X86] Split out the logic to select the stack probe function (NFC) | Philip Reames | 2014-08-21 | 2 | -11/+25
    Patch 1 of 11 in 'Add a "probe-stack" attribute' review thread.
    Patch by: <john.kare.alsaker@gmail.com>
    llvm-svn: 216233
* Add hooks for emitLeading/TrailingFence | Robin Morisset | 2014-08-21 | 1 | -0/+20
    llvm-svn: 216232
* Rename AtomicExpandLoadLinked into AtomicExpand | Robin Morisset | 2014-08-21 | 17 | -40/+41
    AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of a group that aims at making it more target-independent. See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html for details.

    The command line option is "atomic-expand"
    llvm-svn: 216231
* [PeepholeOptimizer] Update the kill flags when extending the live-range of the source of a copy. | Quentin Colombet | 2014-08-21 | 1 | -1/+5
    <rdar://problem/12702965>
    llvm-svn: 216229
* Fix a URL (NFC) | Justin Bogner | 2014-08-21 | 1 | -1/+1
    llvm-svn: 216228
* [FastISel][AArch64] Use the correct register class to make the MI verifier happy. | Juergen Ributzka | 2014-08-21 | 28 | -180/+184
    This is mostly achieved by providing the correct register class manually, because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and MVT::i64.

    Also clean up the code to use the FastEmitInst_* method whenever possible. This makes sure that the operands' register class is properly constrained. For all the remaining cases this adds the missing constrainOperandRegClass calls for each operand.
    llvm-svn: 216225
* Explicitly pass ownership of the MemoryBuffer to AddNewSourceBuffer using std::unique_ptr | David Blaikie | 2014-08-21 | 12 | -56/+54
    llvm-svn: 216223
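Passing ownership through a std::unique_ptr parameter follows a general C++ pattern, sketched below. `BufferRegistry` and `addBuffer` are hypothetical stand-ins. This is not llvm::SourceMgr, and the real signature of AddNewSourceBuffer is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a source manager; not llvm::SourceMgr.
class BufferRegistry {
  std::vector<std::unique_ptr<std::string>> Buffers;

public:
  // Taking the unique_ptr by value makes the ownership transfer visible at
  // the call site: callers must pass an rvalue, e.g. via std::move.
  std::size_t addBuffer(std::unique_ptr<std::string> Buf) {
    Buffers.push_back(std::move(Buf));
    return Buffers.size(); // 1-based id of the buffer just added
  }

  const std::string &get(std::size_t Id) const { return *Buffers[Id - 1]; }
};
```

Compared with a raw pointer parameter, the type now documents who frees the buffer, and a caller that forgets to give up ownership fails to compile instead of leaking or double-freeing.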
* R600/SI: Teach moveToVALU how to handle more S_LOAD_* instructions | Tom Stellard | 2014-08-21 | 3 | -9/+155
    llvm-svn: 216220
* R600/SI: Make sure SCRATCH_WAVE_OFFSET is added as Live-In to the function | Tom Stellard | 2014-08-21 | 3 | -12/+12
    This fixes a crash in an ocl conformance test.
    llvm-svn: 216219
* R600/SI: Remove unused SGPR spilling code | Tom Stellard | 2014-08-21 | 2 | -80/+0
    llvm-svn: 216218
* R600/SI: Use eliminateFrameIndex() to expand SGPR spill pseudos | Tom Stellard | 2014-08-21 | 5 | -112/+159
    This will simplify the SGPR spilling and also allow us to use MachineFrameInfo for calculating offsets, which should be more reliable than our custom code.

    This fixes a crash in some cases where a register would be spilled in a branch such that the VGPR defined for spilling did not dominate all the uses when restoring.

    This fixes a crash in an ocl conformance test. The test requires register spilling and is too big to include.
    llvm-svn: 216217
* R600/SI: Handle VCC in SIRegisterInfo::getPhysRegSubReg() | Tom Stellard | 2014-08-21 | 1 | -0/+11
    This fixes a crash in an ocl conformance test. The test requires register spilling and is too big to include.
    llvm-svn: 216216
* Rewrite the gold plugin to fix pr19901. | Rafael Espindola | 2014-08-21 | 9 | -154/+446
    There is a fundamental difference between how the gold API and lib/LTO view the LTO process.

    The gold API talks about a particular symbol in a particular file. The lib/LTO API talks about a symbol in the merged module.

    The merged module is then defined in terms of the IR semantics. In particular, a linkonce_odr GV is only copied if it is used, since it is valid to drop unused linkonce_odr GVs.

    In the testcase in pr19901 both properties collide. What happens is that gold asks us to keep a particular linkonce_odr symbol, but the IR linker doesn't copy it to the merged module and we never have a chance to ask lib/LTO to keep it.

    This patch fixes it by having a more direct implementation of the gold API. If it asks us to keep a symbol, we change the linkage so it is not linkonce. If it says we can drop a symbol, we do so. All of this before we even send the module to lib/Linker.

    Since now we don't have to produce LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN, during symbol resolution we can use a temporary LLVMContext and do lazy module loading. This allows us to keep the minimum possible amount of allocated memory around. This should also allow as much parallelism as we want, since there is no shared context.
    llvm-svn: 216215
* Satiate the sanitizer build bot | Jonathan Roelofs | 2014-08-21 | 1 | -1/+2
    This fixes a missing initializer from r216182
    llvm-svn: 216212
* Move some logic to populateLTOPassManager. | Rafael Espindola | 2014-08-21 | 4 | -39/+58
    This will avoid code duplication in the next commit which calls it directly from the gold plugin.
    llvm-svn: 216211
* [AVX512] Add class to group common template arguments related to vector type | Adam Nemet | 2014-08-21 | 1 | -18/+66
    We discussed the issue of generality vs. readability of the AVX512 classes recently. I proposed this approach to try to hide and centralize the mappings we commonly perform based on the vector type. A new class X86VectorVTInfo captures these.

    The idea is to pass an instance of this class to classes/multiclasses instead of the corresponding ValueType. Then the class/multiclass can use its field for things that derive from the type rather than passing all those as separate arguments.

    I modified avx512_valign to demonstrate this new approach. As you can see instead of 7 related template parameters we now have one. The downside is that we have to refer to fields for the derived values. I named the argument '_' in order to make this as invisible as possible. Please let me know if you absolutely hate this. (Also once we allow local initializations in multiclasses we can recover the original version by assigning the fields to local variables.)

    Another possible use-case for this class is to directly map things, e.g.:

      RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC
    llvm-svn: 216209
* Coverage Mapping: add function's hash to coverage function records. | Alex Lorenz | 2014-08-21 | 2 | -6/+10
    The profile data format was recently updated and the new indexing api requires the code coverage tool to know the function's hash as well as the function's name to get the execution counts for a function.
    Differential Revision: http://reviews.llvm.org/D4994
    llvm-svn: 216207
* llvm-gcc is dead. | Rafael Espindola | 2014-08-21 | 1 | -4/+3
    llvm-svn: 216206
* [LIT] Remove documentation for method since it does not exist | Eric Fiselier | 2014-08-21 | 1 | -8/+0
    llvm-svn: 216204
* Respect LibraryInfo in populateLTOPassManager and use it. NFC. | Rafael Espindola | 2014-08-21 | 2 | -3/+6
    llvm-svn: 216203
* Remove dead code. NFC. | Rafael Espindola | 2014-08-21 | 1 | -8/+0
    llvm-svn: 216201
* [AArch64] Run a peephole pass right after AdvSIMD pass. | Quentin Colombet | 2014-08-21 | 2 | -3/+28
    The AdvSIMD pass may produce copies that are not coalescer-friendly. The peephole optimizer knows how to fix that as demonstrated in the test case.
    <rdar://problem/12702965>
    llvm-svn: 216200
* [FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI. | Juergen Ributzka | 2014-08-21 | 1 | -42/+50
    llvm-svn: 216199
* Thumb1 load/store optimizer: Improve code to materialize new base register. | Moritz Roth | 2014-08-21 | 4 | -7/+44
    There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only the latter supports using different source and destination registers, so whenever we materialize a new base register (at a certain offset) we'd do so by moving the base register value to the new register and then adding in place. This patch changes the code to use a single tADDi3 if the offset is small enough to fit in 3 bits.
    Differential Revision: http://reviews.llvm.org/D5006
    llvm-svn: 216193
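The selection rule described above can be sketched as a small decision function. Everything here is illustrative: `materializeNewBase` is a hypothetical name, the returned strings are labels rather than real MachineInstr construction, and offsets that do not fit the 8-bit immediate of tADDi8 are simply assumed not to occur.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of the rule: which instructions would be used to
// compute NewBase = Base + Offset. Assumes Offset fits in 8 bits.
std::vector<std::string> materializeNewBase(unsigned Offset) {
  assert(Offset < 256 && "sketch only covers 8-bit offsets");
  if (Offset < 8)           // fits the 3-bit immediate of tADDi3
    return {"tADDi3"};      // distinct source and destination in one add
  return {"mov", "tADDi8"}; // copy the base, then add in place
}
```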
* Use returns_nonnull in BumpPtrAllocator and MallocAllocator to avoid null-check in placement new | Hans Wennborg | 2014-08-21 | 2 | -2/+11
    In both Clang and LLVM, this is a common pattern:

      Size = sizeof(DeclRefExpr) + SomeExtraStuff;
      void *Mem = Context.Allocate(Size, llvm::alignOf<DeclRefExpr>());
      return new (Mem) DeclRefExpr(...);

    The annoying thing is that because the default placement-new operator has a nothrow specification, the compiler will insert a null check of Mem before calling the DeclRefExpr constructor. This null check is redundant for us, because we expect the allocation functions to never return null.

    By annotating the allocator functions with returns_nonnull, we can optimize away these checks.

    Compiling clang with a recent version of Clang and measuring with:

      $ perf stat -r20 bin/clang.patch -fsyntax-only -w gcc.c && perf stat -r20 bin/clang.orig -fsyntax-only -w gcc.c

    Shows a 2.4% speed-up (+- 0.8%).

    The pattern occurs in LLVM too. Measuring with -O3 (and now using bzip2.c instead, because it's smaller):

      $ perf stat -r20 bin/clang.patch -O3 -w bzip2.c && perf stat -r20 bin/clang.orig -O3 -w bzip2.c

    Shows a 4.4% speed-up (+- 1%).

    If anyone knows of a similar attribute we can use for MSVC, or some other technique to get rid of the null check there, please let me know.

    Differential Revision: http://reviews.llvm.org/D4989
    llvm-svn: 216192
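A self-contained version of the pattern, with the attribute applied to a toy allocator, could look like this. `SKETCH_RETURNS_NONNULL`, `allocateOrAbort`, `Node` and `makeNode` are hypothetical names for illustration. The attribute itself is the GCC/Clang `returns_nonnull` attribute, and the macro expands to nothing on other compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical macro: use the GCC/Clang attribute where it is available.
#if defined(__GNUC__)
#define SKETCH_RETURNS_NONNULL __attribute__((returns_nonnull))
#else
#define SKETCH_RETURNS_NONNULL
#endif

// Toy allocator that never returns null: it aborts on failure instead.
SKETCH_RETURNS_NONNULL void *allocateOrAbort(std::size_t Size) {
  void *Mem = std::malloc(Size ? Size : 1);
  if (!Mem)
    std::abort();
  return Mem;
}

struct Node {
  int Value;
  explicit Node(int V) : Value(V) {}
};

// With the annotation the compiler may drop the null check it would
// otherwise emit before running the constructor in placement new.
Node *makeNode(int V) {
  void *Mem = allocateOrAbort(sizeof(Node));
  return new (Mem) Node(V);
}
```

Whether the check is actually removed depends on the compiler and optimization level. The sketch only shows where the annotation goes.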
* [FastISel][AArch64] Remove redundant test. | Juergen Ributzka | 2014-08-21 | 1 | -23/+0
    These tests and many more are already covered by fast-isel-addressing-modes.ll.
    llvm-svn: 216186
* Add a thread-model knob for lowering atomics on baremetal & single threaded systems | Jonathan Roelofs | 2014-08-21 | 5 | -3/+146
    http://reviews.llvm.org/D4984
    llvm-svn: 216182
* Handle inlining in populateLTOPassManager like in populateModulePassManager. | Rafael Espindola | 2014-08-21 | 5 | -9/+22
    No functionality change.
    llvm-svn: 216178