path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [Statepoints] Let patchable statepoints have a symbolic call target. | Sanjoy Das | 2015-07-28 | 3 | -14/+19
    Summary:
    As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value.

    This change removes the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation.

    Reviewers: reames, swaroop.sridhar
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D11550
    llvm-svn: 243502
* Fix broken ArrayRef conversion from r243497. | Alex Lorenz | 2015-07-28 | 1 | -1/+1
    llvm-svn: 243501
* ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141) | Sanjay Patel | 2015-07-28 | 1 | -4/+4
    PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list.

    After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list.

    The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor.

    Differential Revision: http://reviews.llvm.org/D11345
    llvm-svn: 243500
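The fix described in the commit above amounts to collecting each user at most once before iterating. The following is a minimal sketch of that idea in plain C++, not the actual DAGCombiner code; `Node` and `collectUniqueUsers` are hypothetical names introduced only for illustration.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a DAG node; only its identity matters here.
struct Node {
  int id;
};

// Collect users in first-seen order and skip repeats, so that a later loop
// which deletes nodes never revisits an entry it has already processed.
std::vector<Node *> collectUniqueUsers(const std::vector<Node *> &uses) {
  std::vector<Node *> users;
  std::unordered_set<Node *> seen;
  for (Node *u : uses) {
    if (seen.insert(u).second) // insert() reports whether u was new
      users.push_back(u);
  }
  return users;
}
```

With a uses list of `{a, b, a}` the helper returns `{a, b}`, so a list with one real user never reaches a "multiple users" threshold by accident.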
* fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold | Sanjay Patel | 2015-07-28 | 7 | -14/+20
    This fix was suggested as part of D11345 and is part of fixing PR24141.

    With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place.

    No functional change is intended other than that.

    Differential Revision: http://reviews.llvm.org/D11531
    llvm-svn: 243498
* MIR Serialization: Serialize the target index machine operands. | Alex Lorenz | 2015-07-28 | 6 | -0/+88
    Reviewers: Duncan P. N. Exon Smith
    llvm-svn: 243497
* [ARM] Define subtarget feature strict-align. | Akira Hatanaka | 2015-07-28 | 3 | -55/+9
    This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecture and OS.

    rdar://problem/21529937
    http://reviews.llvm.org/D11470
    llvm-svn: 243493
* AArch64: be careful of large immediates when optimising cmps. | Tim Northover | 2015-07-28 | 1 | -5/+12
    llvm-svn: 243492
* [PeepholeOptimizer] Look through PHIs to find additional register sources | Bruno Cardoso Lopes | 2015-07-28 | 2 | -83/+287
    Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case.

    - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions.
    - Add findNextSourceAndRewritePHI method to look into multiple sources returned by the ValueTracker and rewrite PHIs with new sources.

    With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows:

    A:
      psllq %mm1, %mm0
      movd %mm0, %r9
      jmp C

    B:
      por %mm1, %mm0
      movd %mm0, %r9
      jmp C

    C:
      movd %r9, %mm0
      pshufw $238, %mm0, %mm0

    Becomes:

    A:
      psllq %mm1, %mm0
      jmp C

    B:
      por %mm1, %mm0
      jmp C

    C:
      pshufw $238, %mm0, %mm0

    Differential Revision: http://reviews.llvm.org/D11197
    rdar://problem/20404526
    llvm-svn: 243486
* [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls. | Vasileios Kalintiris | 2015-07-28 | 1 | -0/+9
    Summary:
    Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convention for lowering calls marked as using the fast calling convention.

    Reviewers: dsanders
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D11515
    llvm-svn: 243485
* [Unroll] Add debug dumps to loop-unroll analyzer. | Michael Zolotukhin | 2015-07-28 | 1 | -2/+21
    llvm-svn: 243471
* [mips][FastISel] Fix generated code for IR's select instruction. | Vasileios Kalintiris | 2015-07-28 | 1 | -1/+8
    Summary:
    Generate correct code for the select instruction by zero-extending its boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register.

    Reviewers: dsanders
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D11506
    llvm-svn: 243469
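To see why the zero-extension in the commit above matters, here is a small model in plain C++. It is only a sketch: it assumes a 32-bit register whose bits above bit 0 may hold leftover data when it carries a one-bit condition, and the function names are hypothetical rather than taken from the MIPS backend.

```cpp
#include <cstdint>

// Models a conditional move that tests the WHOLE register for non-zero.
uint32_t cmovNonZero(uint32_t cond, uint32_t ifTrue, uint32_t ifFalse) {
  return cond != 0 ? ifTrue : ifFalse;
}

// Zero-extend a one-bit condition held in bit 0 to full register width.
uint32_t zextI1(uint32_t reg) { return reg & 1u; }

// Select that is correct even when the upper bits of condReg are garbage.
uint32_t selectI1(uint32_t condReg, uint32_t a, uint32_t b) {
  return cmovNonZero(zextI1(condReg), a, b);
}
```

With `condReg = 0xFFFFFFFE` the logical condition is false (bit 0 is clear). `selectI1` picks the false operand, whereas calling `cmovNonZero` on the raw register would pick the true one.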
* [Unroll] Don't analyze blocks outside the loop. | Michael Zolotukhin | 2015-07-28 | 1 | -4/+8
    llvm-svn: 243466
* AMDGPU: Don't try to use LDS/vector for private if pointer value stored | Matt Arsenault | 2015-07-28 | 1 | -4/+14
    If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand.

    llvm-svn: 243462
* AMDGPU: Fix crash if called function is a bitcast | Matt Arsenault | 2015-07-28 | 1 | -1/+6
    getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call.

    llvm-svn: 243461
* [SCEV] Apply NSW and NUW flags via poison value analysis | Jingyue Wu | 2015-07-28 | 2 | -25/+268
    Summary:
    Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

    There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.

    This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

      for (int i = 0; i < limit; ++i) {
        sum += ptr[i + offset];
      }

    Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

    There are more details in this discussion on llvmdev.
    June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
    July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

    Patch by Bjarke Roune

    Reviewers: eliben, atrick, sanjoy
    Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits
    Differential Revision: http://reviews.llvm.org/D11212
    llvm-svn: 243460
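The two induction-variable choices mentioned in the commit above can be written out by hand. This is only a source-level sketch of the idea, not what LSR emits; the function names are hypothetical, and the second form is equivalent only under the assumption that `i + offset` never overflows, which is the kind of fact the NSW flag lets the analysis rely on.

```cpp
// The loop from the commit message: the address is recomputed from
// i + offset on every iteration.
int sumIndexed(const int *ptr, int offset, int limit) {
  int sum = 0;
  for (int i = 0; i < limit; ++i)
    sum += ptr[i + offset];
  return sum;
}

// Hand-written equivalent that uses &ptr[i + offset] itself as the
// induction variable: one pointer increment per iteration.
int sumPointerIV(const int *ptr, int offset, int limit) {
  int sum = 0;
  const int *p = ptr + offset; // address for i == 0
  for (int i = 0; i < limit; ++i, ++p)
    sum += *p;
  return sum;
}
```

Both functions read the same elements in the same order as long as every index stays inside the array.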
* AMDGPU: Fix return type of getImplicitParameterOffset. | Matt Arsenault | 2015-07-28 | 1 | -1/+1
    Patch by Zoltan Gilian <zoltan.gilian@gmail.com>

    llvm-svn: 243459
* [RuntimeDyld] Make LoadedObjectInfo::getLoadedSectionAddress take a SectionRef | Lang Hames | 2015-07-28 | 6 | -42/+47
    rather than a string section name.

    llvm-svn: 243456
* MIR Serialization: Serialize the block address machine operands. | Alex Lorenz | 2015-07-28 | 4 | -4/+116
    llvm-svn: 243453
* WebAssembly: MCAsmInfo only has one syntax variant for now. | JF Bastien | 2015-07-28 | 1 | -5/+3
    Summary:
    MCAsmInfo is set up with the default AssemblerDialect, which is zero.

    Subscribers: llvm-commits, sunfish, jfb
    Differential Revision: http://reviews.llvm.org/D11567
    llvm-svn: 243452
* MIR Parser: Extract the method 'parseGlobalValue'. NFC. | Alex Lorenz | 2015-07-28 | 1 | -9/+16
    This commit extracts the code that parses a global value from the method 'parseGlobalAddressOperand' into a new method 'parseGlobalValue', so that this code can be reused by the method which will parse the block address machine operands.

    llvm-svn: 243450
* MIR Parser: Move the function 'lexName'. NFC. | Alex Lorenz | 2015-07-28 | 1 | -20/+20
    This commit moves the function 'lexName' to the start of the file so it can be reused by the function which will lex the named LLVM IR block references.

    llvm-svn: 243449
* MIR Printer: Remove an outdated TODO comment and assertion. NFC. | Alex Lorenz | 2015-07-28 | 1 | -8/+0
    This commit removes an outdated TODO comment and a corresponding assertion which asserts that the MIR printer can't print the machine basic blocks that aren't sequentially numbered.

    This comment and assertion were correct when I was working on the patch which serialized the machine basic blocks, but then I decided to add an 'ID' attribute to the machine basic block's YAML mapping based on the patch review. This comment and assertion then became invalid, as with the 'ID' attribute we can serialize the non-sequential machine basic blocks and their references without any problems.

    llvm-svn: 243447
* MIR Parser: Remove redundant parameters. NFC. | Alex Lorenz | 2015-07-28 | 1 | -6/+6
    This commit removes the redundant parameters from the two methods 'initializeRegisterInfo' and 'initializeFrameInfo'. The removed parameters are redundant as we are already passing in the 'MachineFunction' to those methods, and those parameters can be derived from the machine function parameter.

    llvm-svn: 243445
* Implement target independent TLS compatible with glibc's emutls.c. | Chih-Hung Hsieh | 2015-07-28 | 10 | -29/+177
    The 'common' section TLS is not implemented.
    Current C/C++ TLS variables are not placed in common section.
    DWARF debug info to get the address of TLS variables is not generated yet.

    clang and driver changes in http://reviews.llvm.org/D10524

    Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models.

    Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address.

    Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets.

    Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables.

    Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variables.
    TODO: Add proper DIE for emulated TLS variables.

    Added new unit tests with emulated TLS.

    Differential Revision: http://reviews.llvm.org/D10522
    llvm-svn: 243438
* Summary: | Martell Malone | 2015-07-28 | 2 | -0/+5
    Object: add IMAGE_FILE_MACHINE_ARM64

    The official specifications state that the value of IMAGE_FILE_MACHINE_ARM64 is 0xAA64 (as per the Microsoft Portable Executable and Common Object Format Specification v8.3).

    Reviewers: rnk
    Subscribers: llvm-commits, compnerd, ruiu
    Differential Revision: http://reviews.llvm.org/D11511
    llvm-svn: 243434
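As a small illustration of how a machine-type constant like the one in the commit above is typically consumed, here is a sketch in C++. Only the value 0xAA64 comes from the commit message; the enum and helper names are hypothetical and are not LLVM's actual declarations.

```cpp
#include <cstdint>

// Value taken from the commit message; the names here are illustrative only.
enum MachineType : uint16_t {
  kImageFileMachineArm64 = 0xAA64,
};

// Returns true if the 16-bit Machine field of a COFF file header
// identifies an ARM64 image.
bool isArm64Machine(uint16_t machineField) {
  return machineField == kImageFileMachineArm64;
}
```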
* [LVI] Cleanup whitespaces. NFC | Bruno Cardoso Lopes | 2015-07-28 | 1 | -61/+61
    llvm-svn: 243430
* fix formatting; NFC | Sanjay Patel | 2015-07-28 | 1 | -2/+1
    llvm-svn: 243424
* [AArch64] Match float round and convert to int instructions. | Geoff Berry | 2015-07-28 | 1 | -12/+116
    Summary:
    Add patterns for doing floating point round with various rounding modes followed by conversion to int as a single FCVT* instruction.

    Reviewers: t.p.northover, jmolloy
    Subscribers: aemerson, rengolin, mcrosier, llvm-commits
    Differential Revision: http://reviews.llvm.org/D11424
    llvm-svn: 243422
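The source-level shape that such patterns target is a rounding call immediately followed by a conversion to integer. The sketch below shows three such shapes in C++; the helper names are hypothetical, and whether a given compiler version actually folds each one into a single instruction is an assumption rather than something the commit message states.

```cpp
#include <cmath>
#include <cstdint>

// Round toward negative infinity, then convert.
int64_t floorToInt(double x) { return static_cast<int64_t>(std::floor(x)); }

// Round toward positive infinity, then convert.
int64_t ceilToInt(double x) { return static_cast<int64_t>(std::ceil(x)); }

// Round to nearest with ties away from zero, then convert.
int64_t roundToInt(double x) { return static_cast<int64_t>(std::round(x)); }
```

Each helper is a two-step operation in the source, which is the situation the commit describes collapsing into one FCVT* instruction.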
* [LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC | Silviu Baranga | 2015-07-28 | 1 | -1/+24
    llvm-svn: 243416
* Implement __builtin_thread_pointer | Adhemerval Zanella | 2015-07-28 | 2 | -0/+19
    This patch adds the aarch64 lowering of __builtin_thread_pointer. It uses the already implemented AArch64ISD::THREAD_POINTER used in TLS generation.

    llvm-svn: 243412
* [GMR] Teach GlobalsModRef to distinguish an important and safe case of | Chandler Carruth | 2015-07-28 | 1 | -0/+46
    no-alias with non-addr-taken globals: they cannot alias a captured pointer.

    If the non-global underlying object would have been a capture were it to alias the global, we can firmly conclude no-alias.

    It isn't reasonable for a transformation to introduce a capture in a way observable by an alias analysis. Consider, even if it were to temporarily capture one global's address into another global and then restore the other global afterward, there would be no way for the load in the alias query to observe that capture event correctly. If it observes it then the temporary capturing would have changed the meaning of the program, making it an invalid transformation. Even instrumentation passes or a pass which is synthesizing stores to global variables to expose race conditions in programs could not trigger this unless it queried the alias analysis infrastructure mid-transform, in which case it seems reasonable to return results from before the transform started.

    See the comments in the change for a more detailed outlining of the theory here.

    This should address the primary performance regression found when the non-conservatively-correct path of the alias query was disabled.

    Differential Revision: http://reviews.llvm.org/D11410
    llvm-svn: 243405
* [X86] Remove mergeSPUpdatesUp() | Michael Kuperstein | 2015-07-28 | 1 | -25/+1
    X86FrameLowering has both a mergeSPUpdates() that accepts a direction, and a mergeSPUpdatesUp(), which seem to do the same thing, except for a slightly different interface. Removed the less general function.

    NFC.

    Differential Revision: http://reviews.llvm.org/D11510
    llvm-svn: 243396
* [X86][SSE] Use bitmasks instead of shuffles where possible. | Simon Pilgrim | 2015-07-28 | 1 | -0/+8
    VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions.

    Split off from D11518.

    Differential Revision: http://reviews.llvm.org/D11541
    llvm-svn: 243395
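The commit message does not spell out which shuffles qualify, so the following sketch rests on an assumption: a shuffle in which every output lane is either the same-index input lane or zero can be computed with a single AND against a constant mask. The type alias and function name are hypothetical, and the loop stands in for what would be one vector AND instruction.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes = std::array<uint8_t, 16>;

// keep[i] == true keeps lane i; keep[i] == false zeroes it.
// Each lane is ANDed with 0xFF or 0x00, i.e. a plain bitmask.
Bytes zeroLanesWithMask(const Bytes &v, const std::array<bool, 16> &keep) {
  Bytes out{};
  for (std::size_t i = 0; i < v.size(); ++i) {
    const uint8_t mask = keep[i] ? 0xFF : 0x00;
    out[i] = static_cast<uint8_t>(v[i] & mask);
  }
  return out;
}
```

Because the mask depends only on the shuffle pattern and not on the data, it can be materialized as a constant, which is what makes the AND form cheaper than a general byte shuffle or blend.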
* AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions | Igor Breger | 2015-07-28 | 3 | -1/+8
    Added tests for intrinsics and encoding.

    Differential Revision: http://reviews.llvm.org/D11528
    llvm-svn: 243390
* Changes for MachineBasicBlock to use SortedVector for LiveIns. | Puyan Lotfi | 2015-07-28 | 1 | -5/+0
    llvm-svn: 243389
* Move the Target way of overriding DAG Scheduler to a target hook | Mehdi Amini | 2015-07-28 | 1 | -8/+6
    Summary:
    The previous way of overriding it was relying on calling "setDefault" on the global registry, which implies global mutable state.

    Reviewers: echristo, atrick
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D11538

    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 243388
* [GMR] Fix a long-standing bug in GlobalsModRef where it failed to clear | Chandler Carruth | 2015-07-28 | 1 | -4/+30
    out the per-function modref data structures when functions were deleted or when globals were deleted.

    I don't actually know how the global deletion side of this bug hasn't been hit before, but for the other it just-so-happens that functions aren't likely to be deleted in the particular part of the LTO pipeline where we currently enable GMR, so we got lucky.

    With this patch, I can self-host with GMR enabled in the normal pass pipeline!

    I was a bit concerned about the compile-time impact of this change, which is part of what motivated my prior string of patches to make the per-function datastructure very dense and fast to walk. With those changes in place, I can't measure a significant compile time difference (the difference is around 0.1% which is *way* below the noise) before and after this patch when building a linked bitcode for all of Clang.

    Differential Revision: http://reviews.llvm.org/D11453
    llvm-svn: 243385
* [LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC | Adam Nemet | 2015-07-28 | 2 | -8/+11
    Before the patch, the checks were generated internally in addRuntimeCheck. Now, we use the new overloaded version of addRuntimeCheck that takes the ready-made set of checks as a parameter.

    The checks are now generated by the client (LoopDistribution) with the new RuntimePointerChecking::generateChecks API.

    Also the new printChecks API is used to print out the checks for debugging.

    This is to continue the transition over to the new model whereby clients will get the full set of checks from LAA, filter it and then pass it to LoopVersioning and in turn to addRuntimeCheck.

    llvm-svn: 243382
* Remove unnecessary const_casts. NFC | Craig Topper | 2015-07-28 | 1 | -6/+4
    llvm-svn: 243380
* Reserve some constant values for the Swift calling convention. | Bob Wilson | 2015-07-28 | 2 | -0/+4
    Swift has a custom calling convention that also requires some new flags on arguments and one new attribute on alloca instructions. This patch does not include the implementation of that calling convention - that will be provided as part of the open-source release of Swift; this only reserves the bitcode constant values so that they are not used for other purposes.

    llvm-svn: 243379
* [libFuzzer] ensure that the dfsan tracing hooks actually run (using -verbosity=3 in tests) | Kostya Serebryany | 2015-07-28 | 2 | -1/+5
    llvm-svn: 243365
* [libFuzzer] when using cmp traces, first check that the CMP is evaluated to one value much more frequently than to the other value (heuristic) | Kostya Serebryany | 2015-07-28 | 1 | -4/+44
    llvm-svn: 243363
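The heuristic named in the commit title above can be sketched with a pair of counters per comparison. This is an illustration only: the struct, its members, and the default ratio of 16 are hypothetical choices made for the example and are not taken from libFuzzer's source.

```cpp
#include <cstdint>

// Hypothetical per-comparison statistics.
struct CmpStats {
  uint64_t trueCount = 0;
  uint64_t falseCount = 0;

  // Record one observed outcome of the comparison.
  void record(bool result) {
    if (result)
      ++trueCount;
    else
      ++falseCount;
  }

  // True when one outcome dominates the other by at least `ratio`.
  // The +1 keeps a comparison that never took the rare side from
  // qualifying after only a handful of observations.
  bool isSkewed(uint64_t ratio = 16) const {
    const uint64_t hi = trueCount > falseCount ? trueCount : falseCount;
    const uint64_t lo = trueCount > falseCount ? falseCount : trueCount;
    return hi > 0 && hi >= ratio * (lo + 1);
  }
};
```

A comparison that is false 100 times and true once counts as skewed under this threshold, while an evenly split one does not.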
* fix invalid load folding with SSE/AVX FP logical instructions (PR22371) | Sanjay Patel | 2015-07-28 | 3 | -46/+59
    This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ).

    I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine:
    https://bugs.winehq.org/show_bug.cgi?id=38826
    https://llvm.org/bugs/show_bug.cgi?id=22371#c25

    The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use *vector* FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally.

    Differential Revision: http://reviews.llvm.org/D11477
    llvm-svn: 243361
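For readers unfamiliar with why fabs, fneg and fcopysign involve "logical" instructions at all: each one is a bitwise operation on the sign bit. The sketch below shows the scalar arithmetic in C++, assuming `float` is IEEE-754 binary32; the helper names are hypothetical and this is not the X86 lowering code itself.

```cpp
#include <cstdint>
#include <cstring>

// Bit-level views of a float, using memcpy to avoid aliasing problems.
static uint32_t bitsOf(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}
static float floatOf(uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

constexpr uint32_t kSignMask = 0x80000000u;

// fabs: clear the sign bit (AND with the inverted sign mask).
float fabsViaAnd(float x) { return floatOf(bitsOf(x) & ~kSignMask); }

// fneg: flip the sign bit (XOR with the sign mask).
float fnegViaXor(float x) { return floatOf(bitsOf(x) ^ kSignMask); }

// copysign: magnitude bits from one operand, sign bit from the other.
float copysignViaAndOr(float mag, float sgn) {
  return floatOf((bitsOf(mag) & ~kSignMask) | (bitsOf(sgn) & kSignMask));
}
```

The mask constants are the values that would live in memory, which is why load folding is relevant to these lowerings.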
* [opaque pointer type] Avoid using pointee types to retrieve InlineAsm's function type | David Blaikie | 2015-07-28 | 2 | -19/+22
    As a stop-gap, retrieving the InlineAsm's function type was done via the pointee type of its (pointer) Value type. Instead, pass down and store the FunctionType in the InlineAsm object.

    The only wrinkle with this is the ConstantUniqueMap, which then needs to ferry the FunctionType down through the InlineAsmKeyType. This could be done a bit differently if the ConstantInfo trait were broadened a bit to provide an extension point for access to the TypeClass object from the ValType objects, so that the ConstantUniqueMap<InlineAsm> would then be keyed on FunctionTypes instead of PointerTypes that point to FunctionTypes.

    This drops the number of IR tests that don't roundtrip through bitcode* without calling PointerType::getElementType from 416 to 8 (out of 10733). 3 of those crash when roundtripping at ToT anyway.

    * modulo various unavoidable uses of pointer types when validating IR (for now) and in the way globals are parsed, unfortunately. These cases will either go away (because such validation will no longer be necessary or possible when pointee types are opaque), or have to be made simultaneously with the removal of pointee types.

    llvm-svn: 243356
* [LAA] Split out a helper to print a collection of memchecks | Adam Nemet | 2015-07-27 | 1 | -34/+27
    This is effectively an NFC but we can no longer print the index of the pointer group so instead I print its address. This still lets us cross-check the section that lists the checks against the section that lists the groups (see how I modified the test).

    E.g. before we printed this:

        Run-time memory checks:
        Check 0:
          Comparing group 0:
            %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
            %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
          Against group 1:
            %arrayidxA = getelementptr i16, i16* %a, i64 %ind
            %arrayidxA1 = getelementptr i16, i16* %a, i64 %add
        ...
        Grouped accesses:
          Group 0:
            (Low: %c High: (78 + %c))
              Member: {%c,+,4}<%for.body>
              Member: {(2 + %c),+,4}<%for.body>

    Now we print this (changes are underlined):

        Run-time memory checks:
        Check 0:
          Comparing group (0x7f9c6040c320):
                           ~~~~~~~~~~~~~~
            %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
            %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
          Against group (0x7f9c6040c358):
                         ~~~~~~~~~~~~~~
            %arrayidxA1 = getelementptr i16, i16* %a, i64 %add
            %arrayidxA = getelementptr i16, i16* %a, i64 %ind
        ...
        Grouped accesses:
          Group 0x7f9c6040c320:
                ~~~~~~~~~~~~~~
            (Low: %c High: (78 + %c))
              Member: {(2 + %c),+,4}<%for.body>
              Member: {%c,+,4}<%for.body>

    llvm-svn: 243354
* [opaque pointers] Avoid the use of pointee types when parsing inline asm in IR | David Blaikie | 2015-07-27 | 2 | -7/+11
    When parsing calls to inline asm the pointee type (of the pointer type representing the value type of the InlineAsm value) was used. To avoid using it, use the ValID structure to ferry the FunctionType directly through to the InlineAsm construction.

    This is a bit of a workaround - alternatively the inline asm could explicitly describe the type but that'd be verbose/redundant in the IR and so long as the inline asm calls directly in the context of a call or invoke, this should suffice.

    llvm-svn: 243349
* [LSR] Generate and use zero extends | Sanjoy Das | 2015-07-27 | 1 | -21/+139
    Summary:
    If a scale or a base register can be rewritten as "Zext({A,+,1})" then LSR will now consider a formula of that form in its normal cost computation.

    Depends on D9180

    Reviewers: qcolombet, atrick
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D9181
    llvm-svn: 243348
* [TargetTransformInfo][NFCI] Add TargetTransformInfo::isZExtFree. | Sanjoy Das | 2015-07-27 | 1 | -0/+4
    Summary:
    This function is not used in this change but will be used in a subsequent change.

    Reviewers: mcrosier, chandlerc
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D9180
    llvm-svn: 243347
* WebAssembly: add a generic CPU | JF Bastien | 2015-07-27 | 1 | -0/+3
    Summary:
    WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated.

    Subscribers: jfb, llvm-commits, sunfish
    Differential Revision: http://reviews.llvm.org/D11546
    llvm-svn: 243345
* MIR Serialization: Serialize the unnamed basic block references. | Alex Lorenz | 2015-07-27 | 6 | -7/+93
    This commit serializes the references from the machine basic blocks to the unnamed basic blocks.

    This commit adds a new attribute to the machine basic block's YAML mapping called 'ir-block'. This attribute contains the actual reference to the basic block.

    Reviewers: Duncan P. N. Exon Smith
    llvm-svn: 243340