path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit.Simon Pilgrim2015-07-291-15/+31
| | | | | | | | This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416). Differential Revision: http://reviews.llvm.org/D11327 llvm-svn: 243577
* AArch64: use 32-bit MOV rather than UBFX to truncate registers.Tim Northover2015-07-291-3/+3
| | | | | | | | | It's potentially more efficient on Cyclone, and from the optimization guides & schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd expect a MOV to be about the most efficient instruction with its semantics, even though the official "UXTW" alias is really a UBFX. llvm-svn: 243576
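The equivalence this commit relies on can be checked with plain integer arithmetic. The following is a minimal sketch, not code from the patch: `ubfx64` and `mov32` are hypothetical helper names that model the architectural behaviour of a 64-bit UBFX and of a 32-bit register move, which zeroes the upper half of the destination.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of "UBFX Xd, Xn, #lsb, #width": extract `width` bits
// starting at bit `lsb` and zero-extend the result to 64 bits.
inline uint64_t ubfx64(uint64_t value, unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb + width <= 64);
  const uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
  return (value >> lsb) & mask;
}

// Hypothetical model of "MOV Wd, Wn": writing a W register clears the
// upper 32 bits of the corresponding X register.
inline uint64_t mov32(uint64_t value) {
  return static_cast<uint32_t>(value);
}
```

For every input, `ubfx64(x, 0, 32)` and `mov32(x)` agree, which is why the cheaper-looking MOV can stand in for the bitfield extract when truncating.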
* MIR Serialization: Serialize the frame info's save and restore points.Alex Lorenz2015-07-292-8/+33
| | | | | | | This commit serializes the save and restore machine basic block references from the machine frame information class. llvm-svn: 243575
* MIR Parser: Extract the code that parses MBB references into a new method. NFC.Alex Lorenz2015-07-291-5/+18
| | | | | | | This commit extracts the code that's used by the class 'MIRParserImpl' to parse the machine basic block references into a new method named 'parseMBBReference'. llvm-svn: 243572
* [X86][SSE] Vectorize i64 ASHR operationsSimon Pilgrim2015-07-292-4/+18
| | | | | | | | This patch vectorizes the v2i64/v4i64 ASHR shift operations - the last remaining integer vector shifts that are still being transferred to/from the scalar unit to be completed. Differential Revision: http://reviews.llvm.org/D11439 llvm-svn: 243569
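SSE2 offers packed 64-bit logical shifts but no packed 64-bit arithmetic right shift, so an arithmetic shift has to be assembled from other operations. One well-known identity for doing so is sketched below on a single scalar lane. Whether the patch uses exactly this sequence is an assumption, and `ashr64_via_lshr` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right built from a logical shift, an XOR and a subtract:
//   ashr(x, s) == (lshr(x, s) ^ m) - m,  where m = lshr(sign_bit, s).
// After the logical shift the original sign bit sits at the position marked
// by m; the XOR/subtract pair sign-extends from that position.
inline uint64_t ashr64_via_lshr(uint64_t value, unsigned amount) {
  assert(amount < 64);
  const uint64_t sign_bit = 1ULL << 63;
  const uint64_t shifted = value >> amount;   // logical shift
  const uint64_t mask = sign_bit >> amount;   // where the sign bit landed
  return (shifted ^ mask) - mask;             // unsigned wrap-around is intended
}
```

A vector lowering would apply the same three operations to every i64 lane at once, which keeps the whole computation on the vector unit.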
* Revert "Add reverse(ContainerTy) range adapter."Pete Cooper2015-07-291-1/+2
| | | | | | | | | This reverts commit r243563. The GCC buildbots were extremely unhappy about this. Reverting while we discuss a better way of doing overload resolution. llvm-svn: 243567
* [opaque pointers] Remove use of PointerType::getElementType in favor of ↵David Blaikie2015-07-291-4/+1
| | | | | | GEPOperator::getSourceElementType llvm-svn: 243566
* Add reverse(ContainerTy) range adapter.Pete Cooper2015-07-291-2/+1
For cases where we needed a foreach loop in reverse over a container, we had to do something like

    for (const GlobalValue *GV : make_range(TypeInfos.rbegin(), TypeInfos.rend())) {

This provides a convenience method which shortens this to

    for (const GlobalValue *GV : reverse(TypeInfos)) {

There are 2 versions of this, with a preference to the rbegin() version. The first uses rbegin() and rend() to construct an iterator_range. The second constructs an iterator_range from the begin() and end() methods wrapped in std::reverse_iterator's.

Reviewed by David Blaikie.

llvm-svn: 243563
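The first of the two versions described above, the one built on rbegin() and rend(), can be sketched in a few lines. This is an illustration rather than the committed code: `SimpleRange` stands in for LLVM's iterator_range, and `reverse_range` and `reversed_copy` are hypothetical names chosen to avoid clashing with anything in the standard library.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an iterator range: just a begin/end pair that a
// range-based for loop can consume.
template <typename IteratorT>
class SimpleRange {
  IteratorT first, last;

public:
  SimpleRange(IteratorT first, IteratorT last) : first(first), last(last) {}
  IteratorT begin() const { return first; }
  IteratorT end() const { return last; }
};

// Builds a range from the container's own reverse iterators.
template <typename ContainerT>
auto reverse_range(ContainerT &container)
    -> SimpleRange<decltype(container.rbegin())> {
  return SimpleRange<decltype(container.rbegin())>(container.rbegin(),
                                                   container.rend());
}

// Demonstration helper: copies the elements in reverse order.
inline std::vector<int> reversed_copy(const std::vector<int> &input) {
  std::vector<int> out;
  for (int value : reverse_range(input))
    out.push_back(value);
  return out;
}
```

The second version mentioned in the message would wrap begin() and end() in std::reverse_iterator for containers that lack rbegin(); choosing between the two is the overload-resolution question raised by the later revert.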
* [ASan] Disable dynamic alloca and UAR detection in presence of returns_twice ↵Alexey Samsonov2015-07-291-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | calls. Summary: returns_twice (most importantly, setjmp) functions are optimization-hostile: if a local variable is promoted to a register and is changed between the setjmp() and longjmp() calls, this update will be undone. This is the reason why "man setjmp" advises marking all such locals as "volatile". This is not enough for ASan, though: when it replaces a static alloca with a dynamic one, optionally called if UAR mode is enabled, it adds a whole lot of SSA values and computations of local variable addresses that can involve virtual registers and cause unexpected behavior when these registers are restored from the buffer saved in setjmp. To fix this, just disable dynamic alloca and UAR tricks whenever we see a returns_twice call in the function. Reviewers: rnk Subscribers: llvm-commits, kcc Differential Revision: http://reviews.llvm.org/D11495 llvm-svn: 243561
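The "volatile" advice that the message cites from the setjmp manual page looks like this in practice. The sketch below is only an illustration of that rule; `counter_after_longjmp` is a hypothetical function and has nothing to do with the ASan instrumentation itself.

```cpp
#include <cassert>
#include <csetjmp>

// A local that is modified between setjmp() and longjmp() must be volatile,
// otherwise its value after the second return from setjmp() is indeterminate.
inline int counter_after_longjmp() {
  std::jmp_buf env;
  volatile int counter = 0;  // volatile keeps the update in memory
  if (setjmp(env) == 0) {
    counter = 42;            // modified after setjmp() ...
    std::longjmp(env, 1);    // ... then control returns to setjmp() again
  }
  return counter;            // reads the updated value because of volatile
}
```

The user can protect named locals this way, but the extra values that an instrumentation pass introduces are not under the user's control, which is why the commit turns the affected transformations off instead.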
* Roll forward r242871Jingyue Wu2015-07-292-19/+35
| | | | | | | r242871 missed one place that should be guarded with isPhysicalReg. This patch fixes that. llvm-svn: 243555
* MIR Serialization: Serialize the '.cfi_def_cfa' CFI instruction.Alex Lorenz2015-07-294-0/+18
| | | | llvm-svn: 243554
* MIR Parser: Parse multiple LHS register machine operands.Alex Lorenz2015-07-291-4/+7
| | | | llvm-svn: 243553
* move DAGCombiner's allowableAlignment() helper function into the TLISanjay Patel2015-07-293-64/+72
| | | | | | | | | | | | | | | | | | | | | | | Making allowableAlignment() more accessible was suggested as a predecessor patch for D10662, so I've pulled it into TargetLowering. This lets us remove 4 instances of duplicate logic in LegalizeDAG. There's a subtle functional change in the implementation: the existing allowableAlignment() code was using getPrefTypeAlignment() when checking alignment with the DataLayout and assumed that was fast. In this implementation, we use getABITypeAlignment() and assume that is fast. See the TODO comment or the discussion in the Phab review for future improvements in this implementation (don't use the data layout at all). There are no regression test changes from this difference, and I'm not sure how to expose it via a test. I think we actually do want to provide the 'Fast' param when checking this from DAGCombiner::MergeConsecutiveStores(). I.e., we shouldn't merge stores if the new stores are not going to be fast. But that change will require fixing allowsMisalignedMemoryAccess() overrides as noted in D10662. Differential Revision: http://reviews.llvm.org/D10905 llvm-svn: 243549
* [asan] Remove special case mapping on Android/AArch64.Evgeniy Stepanov2015-07-291-4/+4
| | | | | | | | | | | | ASan shadow on Android starts at address 0 for both historic and performance reasons. This is possible because the platform mandates -pie, which makes the lower memory region always available. This is not such a good idea on 64-bit platforms because of MAP_32BIT incompatibility. This patch changes Android/AArch64 mapping to be the same as that of Linux/AArch64. llvm-svn: 243548
* LowerBitSets: Add debugging output.Peter Collingbourne2015-07-291-0/+22
| | | | | | Differential Revision: http://reviews.llvm.org/D11583 llvm-svn: 243546
* [Unroll] Handle SwitchInst properly.Michael Zolotukhin2015-07-291-2/+2
| | | | | | Previously successor selection was simply wrong. llvm-svn: 243545
* [Unroll] Don't crash when simplified branch condition is undef.Michael Zolotukhin2015-07-291-4/+14
| | | | llvm-svn: 243544
* Revert "[PeepholeOptimizer] Look through PHIs to find additional register ↵Bruno Cardoso Lopes2015-07-292-287/+83
| | | | | | | | sources" Reported to break some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540
* Add an ArgList::AddAllArgs that accepts a vector of OptSpecifier.Douglas Katzman2015-07-291-0/+15
| | | | | | | | This lifts the somewhat arbitrary restriction on 3 OptSpecifiers. Differential Revision: http://reviews.llvm.org/D11597 llvm-svn: 243539
* AArch64: use AddressingModes.h accessors for compare shiftsTim Northover2015-07-291-4/+5
| | | | | | | No functional change because "lsl #12" is actually encoded as 12, but one less bug if someone ever decides to change that for the giggles. llvm-svn: 243536
* Reverting r243386 because it has serious post-commit concerns that have not ↵Aaron Ballman2015-07-291-0/+5
| | | | | | been addressed. Also reverts r243389, which relied on this commit. llvm-svn: 243527
* Temporarily revert r242871Jingyue Wu2015-07-292-28/+16
| | | | | | PR24299 llvm-svn: 243522
* [PPC] Fix PR24216: Don't generate splat for misaligned shuffle maskBill Schmidt2015-07-291-0/+5
| | | | | | | | | | | | | | | | Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. llvm-svn: 243519
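The missing check can be illustrated with a simplified, byte-level model of a splat mask. This is a sketch under stated assumptions, not the code in PPCISelLowering.cpp: `is_aligned_splat_mask` is a hypothetical function, the mask is a plain vector of source byte indices, and undefined mask entries are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when `mask` repeats one element of `elt_size` bytes across the
// whole vector AND that element starts on an elt_size boundary in the source.
inline bool is_aligned_splat_mask(const std::vector<int> &mask,
                                  unsigned elt_size) {
  assert(elt_size > 0 && mask.size() % elt_size == 0);
  if (mask.empty())
    return false;
  // The alignment requirement: a splat instruction can only pick whole,
  // naturally aligned elements.
  if (mask[0] % static_cast<int>(elt_size) != 0)
    return false;
  // Every group of elt_size bytes must select the same consecutive bytes.
  for (std::size_t i = 0; i < mask.size(); ++i) {
    const int expected = mask[0] + static_cast<int>(i % elt_size);
    if (mask[i] != expected)
      return false;
  }
  return true;
}
```

A mask that repeats bytes 1 to 4 has the right repeating shape but fails the alignment test, so in this model it is rejected and a general permute would be used instead.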
* [AArch64] Define subtarget feature strict-align.Akira Hatanaka2015-07-295-31/+25
| | | | | | | | | | This commit defines subtarget feature strict-align and uses it instead of cl::opt -aarch64-strict-align to decide whether strict alignment should be forced. rdar://problem/21529937 llvm-svn: 243516
* [Statepoints] Let patchable statepoints have a symbolic call target.Sanjoy Das2015-07-283-14/+19
| | | | | | | | | | | | | | | | | | | | Summary: As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value. This change removes the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11550 llvm-svn: 243502
* Fix broken ArrayRef conversion from r243497.Alex Lorenz2015-07-281-1/+1
| | | | llvm-svn: 243501
* ignore duplicate divisor uses when transforming into reciprocal multiplies ↵Sanjay Patel2015-07-281-4/+4
| | | | | | | | | | | | | | | | | | | (PR24141) PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list. After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list. The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor. Differential Revision: http://reviews.llvm.org/D11345 llvm-svn: 243500
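The fix amounts to de-duplicating a list before iterating over it. A minimal sketch of that idea follows, assuming nodes are identified by plain integers. `unique_users` is a hypothetical helper, and the container choice here is for illustration and need not match the patch.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Collects each user once, preserving first-seen order, so that a later pass
// which deletes nodes while walking the list never meets a deleted node twice.
inline std::vector<int> unique_users(const std::vector<int> &uses) {
  std::unordered_set<int> seen;
  std::vector<int> users;
  for (int user : uses)
    if (seen.insert(user).second)  // true only on first insertion
      users.push_back(user);
  return users;
}
```

With duplicates removed, the size of the list also reflects the real number of users, which is how the test case ends up with a single user and skips the combine.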
* fix TLI's combineRepeatedFPDivisors interface to return the minimum user ↵Sanjay Patel2015-07-287-14/+20
| | | | | | | | | | | | | | | threshold This fix was suggested as part of D11345 and is part of fixing PR24141. With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place. No other functional change is intended. Differential Revision: http://reviews.llvm.org/D11531 llvm-svn: 243498
* MIR Serialization: Serialize the target index machine operands.Alex Lorenz2015-07-286-0/+88
| | | | | Reviewers: Duncan P. N. Exon Smith llvm-svn: 243497
* [ARM] Define subtarget feature strict-align.Akira Hatanaka2015-07-283-55/+9
| | | | | | | | | | | | | | This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecture and OS. rdar://problem/21529937 http://reviews.llvm.org/D11470 llvm-svn: 243493
* AArch64: be careful of large immediates when optimising cmps.Tim Northover2015-07-281-5/+12
| | | | llvm-svn: 243492
* [PeepholeOptimizer] Look through PHIs to find additional register sourcesBruno Cardoso Lopes2015-07-282-83/+287
Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions.
- Add findNextSourceAndRewritePHI method to look up multiple sources returned by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows:

    A:
      psllq %mm1, %mm0
      movd %mm0, %r9
      jmp C

    B:
      por %mm1, %mm0
      movd %mm0, %r9
      jmp C

    C:
      movd %r9, %mm0
      pshufw $238, %mm0, %mm0

Becomes:

    A:
      psllq %mm1, %mm0
      jmp C

    B:
      por %mm1, %mm0
      jmp C

    C:
      pshufw $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 243486
* [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls.Vasileios Kalintiris2015-07-281-0/+9
| | | | | | | | | | | | | | | Summary: Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convention for lowering calls marked as using the fast calling convention. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11515 llvm-svn: 243485
* [Unroll] Add debug dumps to loop-unroll analyzer.Michael Zolotukhin2015-07-281-2/+21
| | | | llvm-svn: 243471
* [mips][FastISel] Fix generated code for IR's select instruction.Vasileios Kalintiris2015-07-281-1/+8
| | | | | | | | | | | | | | | Summary: Generate correct code for the select instruction by zero-extending its boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11506 llvm-svn: 243469
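Why the zero-extension matters can be shown with a small model of a conditional move that tests the entire register. This is a sketch under the assumption that the bits above the boolean may hold leftover data. All three function names are hypothetical and the registers are modelled as 32-bit integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a conditional move: it selects on "whole register is
// non-zero", not on bit 0 alone.
inline uint32_t select_whole_register(uint32_t cond_reg, uint32_t if_true,
                                      uint32_t if_false) {
  return cond_reg != 0 ? if_true : if_false;
}

// Zero-extends a 1-bit condition held in the low bit of a register.
inline uint32_t zext_i1(uint32_t reg) { return reg & 1u; }

// Correct lowering in this model: clean the condition first, then select.
inline uint32_t select_i1(uint32_t cond_reg, uint32_t if_true,
                          uint32_t if_false) {
  return select_whole_register(zext_i1(cond_reg), if_true, if_false);
}
```

With a condition register of 0xFFFFFFFE the boolean in bit 0 is false, yet the unmasked select still takes the "true" operand; masking first gives the intended result.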
* [Unroll] Don't analyze blocks outside the loop.Michael Zolotukhin2015-07-281-4/+8
| | | | llvm-svn: 243466
* AMDGPU: Don't try to use LDS/vector for private if pointer value storedMatt Arsenault2015-07-281-4/+14
| | | | | | | If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand. llvm-svn: 243462
* AMDGPU: Fix crash if called function is a bitcastMatt Arsenault2015-07-281-1/+6
| | | | | | | getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call. llvm-svn: 243461
* [SCEV] Apply NSW and NUW flags via poison value analysisJingyue Wu2015-07-282-25/+268
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX. There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial. This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this: for (int i = 0; i < limit; ++i) { sum += ptr[i + offset]; } Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX. There are more details in this discussion on llvmdev. June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234 July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392 Patch by Bjarke Roune Reviewers: eliben, atrick, sanjoy Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits Differential Revision: http://reviews.llvm.org/D11212 llvm-svn: 243460
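The reason a no-signed-wrap flag helps with `ptr[i + offset]` can be seen from how the 32-bit index is widened to 64 bits. The sketch below is an illustration of the arithmetic only, not of the SCEV code. The function names are hypothetical, and the narrowing conversion relies on the usual two's-complement behaviour, which is implementation-defined before C++20.

```cpp
#include <cassert>
#include <cstdint>

// What the source computes: a 32-bit add, then sign-extension to 64 bits.
// The add is done in unsigned arithmetic so that wrap-around is well defined.
inline int64_t index_narrow_then_extend(int32_t i, int32_t offset) {
  const uint32_t wrapped =
      static_cast<uint32_t>(i) + static_cast<uint32_t>(offset);
  return static_cast<int64_t>(static_cast<int32_t>(wrapped));
}

// What an induction variable on the address wants: extend first, then add.
inline int64_t index_extend_then_add(int32_t i, int32_t offset) {
  return static_cast<int64_t>(i) + static_cast<int64_t>(offset);
}

// The two forms agree exactly when the 32-bit add does not wrap.
inline bool forms_agree(int32_t i, int32_t offset) {
  return index_narrow_then_extend(i, offset) ==
         index_extend_then_add(i, offset);
}
```

If the analysis can establish that a wrapping add would lead to undefined behavior, it may treat the two forms as interchangeable, which is what lets the address itself become the induction variable.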
* AMDGPU: Fix return type of getImplicitParameterOffset.Matt Arsenault2015-07-281-1/+1
| | | | | | Patch by Zoltan Gilian <zoltan.gilian@gmail.com> llvm-svn: 243459
* [RuntimeDyld] Make LoadedObjectInfo::getLoadedSectionAddress take a SectionRefLang Hames2015-07-286-42/+47
| | | | | | rather than a string section name. llvm-svn: 243456
* MIR Serialization: Serialize the block address machine operands.Alex Lorenz2015-07-284-4/+116
| | | | llvm-svn: 243453
* WebAssembly: MCAsmInfo only has one syntax variant for now.JF Bastien2015-07-281-5/+3
| | | | | | | | | | Summary: MCAsmInfo is set up with the default AssemblerDialect, which is zero. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11567 llvm-svn: 243452
* MIR Parser: Extract the method 'parseGlobalValue'. NFC.Alex Lorenz2015-07-281-9/+16
| | | | | | | | | This commit extracts the code that parses a global value from the method 'parseGlobalAddressOperand' into a new method 'parseGlobalValue', so that this code can be reused by the method which will parse the block address machine operands. llvm-svn: 243450
* MIR Parser: Move the function 'lexName'. NFC.Alex Lorenz2015-07-281-20/+20
| | | | | | | This commit moves the function 'lexName' to the start of the file so it can be reused by the function which will lex the named LLVM IR block references. llvm-svn: 243449
* MIR Printer: Remove an outdated TODO comment and assertion. NFC.Alex Lorenz2015-07-281-8/+0
| | | | | | | | | | | | | | | This commit removes an outdated TODO comment and a corresponding assertion which asserts that the mir printer can't print the machine basic blocks that aren't sequentially numbered. This comment and assertion were correct when I was working on the patch which serialized the machine basic blocks, but then I decided to add an 'ID' attribute to the machine basic block's YAML mapping based on the patch review. This comment and assertion then became invalid as with the 'ID' attribute we can serialize the non sequential machine basic blocks and their references without any problems. llvm-svn: 243447
* MIR Parser: Remove redundant parameters. NFC.Alex Lorenz2015-07-281-6/+6
| | | | | | | | | This commit removes the redundant parameters from the two methods 'initializeRegisterInfo' and 'initializeFrameInfo'. The removed parameters are redundant as we are already passing in the 'MachineFunction' to those methods, and those parameters can be derived from the machine function parameter. llvm-svn: 243445
* Implement target independent TLS compatible with glibc's emutls.c.Chih-Hung Hsieh2015-07-2810-29/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variables. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438
* Summary:Martell Malone2015-07-282-0/+5
| | | | | | | | | | | | | | | | Object: add IMAGE_FILE_MACHINE_ARM64 The official specifications state that the value of IMAGE_FILE_MACHINE_ARM64 is 0xAA64 (as per the Microsoft Portable Executable and Common Object Format Specification v8.3). Reviewers: rnk Subscribers: llvm-commits, compnerd, ruiu Differential Revision: http://reviews.llvm.org/D11511 llvm-svn: 243434
* [LVI] Cleanup whitespaces. NFCBruno Cardoso Lopes2015-07-281-61/+61
| | | | llvm-svn: 243430